Checkpoint
PROGRESS SO FAR
FRAMEWORK SETUP
We have set up the graphics application framework on OpenGL and CUDA, including model importing, texture importing, mesh VBO/VAO/EBO binding, shader compiling and linking, camera setup, light setup, and OpenGL/CUDA interop. Since we started completely from scratch, the preparation took longer than we expected, as it requires communication among multiple libraries, languages, and APIs. Several third-party libraries are used in the starting framework:
   GLFW: context and input management
   Assimp: model importing
   SOIL: image/texture loading
   GLM: 3D math library on the CPU

We have also found several benchmark scenes, including the well-known Sponza and San Miguel, to test future results. A comprehensive collection of benchmark meshes is available here:
   http://graphics.cs.williams.edu/data/meshes.xml

Scene Voxelization via Hardware Rasterizer
The first step of voxel cone tracing based global illumination (VXGI) is to voxelize the scene in order to simplify its geometric representation. We have implemented an efficient voxelization procedure that utilizes the hardware rasterizer. The key is to ensure that every triangle is rasterized. First, each triangle is projected along its dominant axis (whichever of the x, y, or z axes is most closely aligned with its normal), so that every triangle "faces" the camera after projection. Second, each triangle is rasterized with conservative rasterization, which guarantees that a fragment is generated whenever the triangle overlaps any part of the fragment's area. With an orthographic camera and the depth test disabled, we can efficiently voxelize the whole scene in a single pass. Each voxel fragment then stores "aggregate" surface information, including (locally illuminated) color and the normal distribution.
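The dominant-axis selection rule can be illustrated with a minimal CPU sketch. In the actual pipeline this selection runs per triangle in a shader; the function name and the plain-array normal representation below are illustrative only.

```cpp
#include <array>
#include <cmath>

// Pick the dominant axis of a triangle's face normal: 0 = x, 1 = y, 2 = z.
// Projecting the triangle along this axis maximizes its rasterized area,
// so even near-edge-on triangles still generate fragments.
int dominantAxis(const std::array<float, 3>& n) {
    float ax = std::fabs(n[0]), ay = std::fabs(n[1]), az = std::fabs(n[2]);
    if (ax >= ay && ax >= az) return 0;
    if (ay >= az) return 1;
    return 2;
}
```

The same comparison, done on the face normal of each input triangle, decides which of the three orthographic views the triangle is rasterized into.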
Sparse Octree Construction
The naively voxelized scene can still consume too much memory. Consider a 512x512x512 3D texture with 4 bytes per voxel: that alone takes 512 MB, not to mention the additional per-voxel information we need. Therefore, a much more compact hierarchical scene representation, the sparse voxel octree described in the original paper, is preferred. To better understand the method, we have prototyped a sequential version of the sparse octree construction, and we are now migrating it to the GPU. We prefer implementing the structure in CUDA rather than OpenGL because CUDA provides a more natural interface for non-graphics tree algorithms (although we eventually need to render the scene from the structure). Communication between the two APIs is fast thanks to CUDA/OpenGL interoperability.
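A minimal sketch of the sequential construction we prototyped: nodes are allocated lazily only along paths to occupied voxels, so memory scales with the number of voxel fragments rather than with the full grid. The node layout and names here are illustrative, not the exact GPU node format.

```cpp
#include <cstdint>
#include <vector>

// Each node stores eight child indices into a flat node pool (-1 = empty).
struct Node { int child[8] = {-1, -1, -1, -1, -1, -1, -1, -1}; };

struct Octree {
    std::vector<Node> nodes{1};  // node 0 is the root
    int depth;                   // grid resolution is 2^depth per axis
    explicit Octree(int d) : depth(d) {}

    // Descend from the root toward the voxel (x, y, z), allocating a child
    // only where the path does not exist yet.
    void insert(uint32_t x, uint32_t y, uint32_t z) {
        int cur = 0;
        for (int level = depth - 1; level >= 0; --level) {
            int octant = ((x >> level) & 1)
                       | (((y >> level) & 1) << 1)
                       | (((z >> level) & 1) << 2);
            if (nodes[cur].child[octant] < 0) {
                nodes[cur].child[octant] = (int)nodes.size();
                nodes.emplace_back();
            }
            cur = nodes[cur].child[octant];
        }
    }
};
```

For a 512^3 grid (depth 9), a mostly-empty scene allocates only the nodes touched by surface voxels, which is exactly the saving the sparse octree is chosen for.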
Cone tracing on Sparse Octree
As the final step of computing global illumination, each visible surface fragment casts several cones that approximately cover the surface hemisphere; each cone traverses the sparse octree, which can be seen as a simplified and accelerated form of ray tracing. We are still working out the exact parallel traversal scheme. One issue we expect to encounter is load imbalance, since each cone intersects a different part of the tree hierarchy and may spend a different amount of time traversing it. Rather than implementing the full version first, we plan to start with ambient occlusion, which is simpler than global illumination but still captures the spatial relationships between geometry.
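The per-cone work can be sketched as a front-to-back occlusion accumulation along the cone axis, with the step size growing with the cone's footprint. In the real tracer the footprint selects an octree level to sample; here that lookup is stood in for by a caller-supplied `density` callback, so the function below is only a sketch of the accumulation loop, not our actual traversal.

```cpp
#include <algorithm>
#include <cmath>
#include <functional>

// Accumulate occlusion front-to-back along one cone.
// density(t, diameter) stands in for sampling the octree at the level
// whose voxel size matches the cone diameter at distance t.
float traceConeOcclusion(float apertureTan, float maxDist,
                         const std::function<float(float, float)>& density) {
    float occlusion = 0.0f;
    float t = 0.05f;  // small offset to avoid self-occlusion at the origin
    while (t < maxDist && occlusion < 1.0f) {
        float diameter = 2.0f * apertureTan * t;    // cone footprint at t
        float a = std::clamp(density(t, diameter), 0.0f, 1.0f);
        occlusion += (1.0f - occlusion) * a;        // front-to-back blending
        t += std::max(diameter, 0.01f);             // step ~ footprint size
    }
    return occlusion;
}
```

Because the step size grows with distance, each cone touches a varying number of octree levels and nodes, which is the source of the load imbalance noted above.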
REVISED GOALS AND DELIVERABLES

We are slightly behind our original plan, which called for a preliminary version of cone tracing by the end of last week. The first reason is that we spent more time than expected on environment setup; the second is that we are still somewhat unclear on how to perform cone tracing on this data structure in parallel.
In our original proposal, we planned to support both Windows and Linux. The compatibility work turned out to be tedious and not really related to the topic, so we have decided to drop Linux support.
We remain confident that we can finish, on time, a real-time global illumination solution that supports large scenes. However, we have made dynamic object support an optional feature.
REVISED SCHEDULE
- Nov. 22 - Nov. 24: Parallel octree construction in CUDA. Light information storage and filtering on the sparse octree.
- Nov. 25 - Nov. 27: Work out the parallel traversal procedure. Compute ambient occlusion via parallel cone tracing as a preliminary result.
- Nov. 28 - Nov. 30: Compute global illumination via parallel cone tracing.
- Dec. 1 - Dec. 3: Testing and optimization.
- Dec. 4 - Dec. 6: Create the demo and controls for an interactive demo. (Optional) Dynamic object support.
- Dec. 7 - Dec. 9: Benchmark and collect data.
- Dec. 10 - Dec. 13: Write the final report and prepare for the poster session.