CUDA Parallel Rasterizer is a GPU-based parallel rasterization pipeline. The goal of this project is to implement a rasterization mechanism that reads a mesh from a Wavefront OBJ file and displays the object as a raster image. The rasterization pipeline is implemented completely from scratch. The project uses no OpenGL API except for displaying the final PBO; no OpenGL calls are used in the rasterization pipeline itself.
Since parallelization is all about performance, here are the numbers: a Stanford Dragon with over 50,000 primitives runs at 45 fps. A Utah Teapot with over 6,000 primitives and a 1600x1600-pixel texture map also runs at 45 fps. These benchmarks include the Phong shading model, back-face culling, support for interactivity and super-sampled anti-aliasing of the texture map.
Find my GitHub repo at: CUDA Parallel Rasterizer Github
Pre-Rasterization Pipeline Algorithms & Vertex Assembly
An OBJ Loader is used to read in mesh data.
The Buffer Objects (VBO, NBO, CBO, IBO etc) are created as float/int pointer arrays. The IBO is used for mapping the data to the primitives.
In the vertex shader, each vertex is multiplied by the model-view-projection matrix to transform it from world space into screen space.
In the primitive assembly stage of the pipeline, the IBO is used to map vertices, normals, colors (and texture coordinates) to their respective triangles. Triangles are implemented as structs whose members are 3 vertices, a normal per vertex, a color per vertex and a UV coordinate per vertex. The implementation also requires storing the original vertex coordinates in the primitives.
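As a rough illustration of this step, here is a minimal CPU-side C++ sketch of what a single vertex-shader thread might compute. The types `Vec3`, `Vec4`, `Mat4` and the function `vertexToScreen` are hypothetical names, not taken from the project, and the perspective divide and viewport mapping (with a flipped Y axis) are assumptions about how the MVP result reaches pixel coordinates.

```cpp
#include <cassert>

// Hypothetical types for illustration only; the project's own names may differ.
struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };
struct Mat4 { float m[4][4]; };  // row-major, multiplies column vectors

// Multiply a 4x4 matrix by a homogeneous column vector.
Vec4 mul(const Mat4& a, const Vec4& v) {
    const float in[4] = { v.x, v.y, v.z, v.w };
    float out[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            out[r] += a.m[r][c] * in[c];
    return { out[0], out[1], out[2], out[3] };
}

// Transform one vertex: MVP, then perspective divide, then viewport mapping.
// In a CUDA kernel this body would run once per vertex, one thread each.
Vec3 vertexToScreen(const Mat4& mvp, const Vec3& p, int width, int height) {
    Vec4 clip = mul(mvp, Vec4{ p.x, p.y, p.z, 1.0f });
    assert(clip.w != 0.0f);  // a real pipeline would clip before dividing
    float invW = 1.0f / clip.w;
    float ndcX = clip.x * invW;
    float ndcY = clip.y * invW;
    float ndcZ = clip.z * invW;
    return { (ndcX + 1.0f) * 0.5f * width,
             (1.0f - ndcY) * 0.5f * height,  // flip Y so row 0 is the top of the image
             ndcZ };                          // keep depth for the later depth test
}
```

With an identity matrix and an 800x600 viewport, the origin lands at the center of the image, which is a quick sanity check for the viewport mapping.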
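The triangle layout described above could look roughly like the following C++ sketch. The struct and buffer names (`Triangle`, `assemblePrimitives`, `tbo` for UVs, `worldVbo` for the untransformed positions) are hypothetical, and the sketch assumes that one index in the IBO addresses all attribute buffers, which may not match how the project indexes OBJ data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec2 { float u, v; };
struct Vec3 { float x, y, z; };

// One assembled primitive, mirroring the members listed in the text.
struct Triangle {
    Vec3 screenPos[3];  // transformed vertices
    Vec3 worldPos[3];   // original (untransformed) vertex coordinates
    Vec3 normal[3];     // one normal per vertex
    Vec3 color[3];      // one color per vertex
    Vec2 uv[3];         // one UV coordinate per vertex
};

// Read the i-th xyz triple out of a flat float buffer.
static Vec3 fetch3(const std::vector<float>& buf, int i) {
    return { buf[3 * i], buf[3 * i + 1], buf[3 * i + 2] };
}

// Assumption: a single shared index per vertex for every attribute buffer.
std::vector<Triangle> assemblePrimitives(const std::vector<float>& vbo,
                                         const std::vector<float>& worldVbo,
                                         const std::vector<float>& nbo,
                                         const std::vector<float>& cbo,
                                         const std::vector<float>& tbo,
                                         const std::vector<int>& ibo) {
    assert(ibo.size() % 3 == 0);
    std::vector<Triangle> tris(ibo.size() / 3);
    for (std::size_t t = 0; t < tris.size(); ++t) {  // one GPU thread per t
        for (int k = 0; k < 3; ++k) {
            int i = ibo[3 * t + k];
            tris[t].screenPos[k] = fetch3(vbo, i);
            tris[t].worldPos[k]  = fetch3(worldVbo, i);
            tris[t].normal[k]    = fetch3(nbo, i);
            tris[t].color[k]     = fetch3(cbo, i);
            tris[t].uv[k]        = { tbo[2 * i], tbo[2 * i + 1] };
        }
    }
    return tris;
}
```

Copying the attributes into each triangle costs memory, but it lets every later stage work on a primitive without going back through the index buffer.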
The rasterization stage of the pipeline is by far the most important and complex stage. I implemented a scanline algorithm to rasterize the mesh. The parallelization is based on primitives: each primitive has its own kernel launch.
In the fragment shader I implemented a simple Phong shading model. The algorithm is parallelized per fragment (pixel).
The algorithm uses the world-space position of the fragment calculated in the rasterization step of the pipeline.
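For readers unfamiliar with scanline rasterization, the following C++ sketch shows one common way to fill a single screen-space triangle, sampling at pixel centers and depth-testing each covered pixel. It is not the project's code: the names `edge` and `rasterizeTriangle`, the barycentric depth interpolation and the "smaller z is closer" convention are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Vec3 { float x, y, z; };  // screen-space x, y plus depth z

// Twice the signed area of triangle (a, b, p); used for barycentric weights.
float edge(const Vec3& a, const Vec3& b, float px, float py) {
    return (b.x - a.x) * (py - a.y) - (b.y - a.y) * (px - a.x);
}

// Fill one triangle scanline by scanline. Returns how many pixels passed the
// depth test. On the GPU many primitives run concurrently, so the depth
// update below would need an atomic or a lock (assumption, not shown here).
int rasterizeTriangle(const Vec3 v[3], int width, int height,
                      std::vector<float>& depth) {
    assert(depth.size() == static_cast<std::size_t>(width) * height);
    float area = edge(v[0], v[1], v[2].x, v[2].y);
    if (area == 0.0f) return 0;  // degenerate triangle

    float minY = std::min(v[0].y, std::min(v[1].y, v[2].y));
    float maxY = std::max(v[0].y, std::max(v[1].y, v[2].y));
    int yStart = std::max(0, static_cast<int>(std::ceil(minY - 0.5f)));
    int yEnd = std::min(height - 1, static_cast<int>(std::floor(maxY - 0.5f)));

    int written = 0;
    for (int y = yStart; y <= yEnd; ++y) {
        float sy = y + 0.5f;  // sample at the pixel center
        float xl = std::numeric_limits<float>::infinity();
        float xr = -xl;
        // Intersect the scanline with each non-horizontal edge.
        for (int e = 0; e < 3; ++e) {
            const Vec3& a = v[e];
            const Vec3& b = v[(e + 1) % 3];
            if (a.y == b.y) continue;
            float t = (sy - a.y) / (b.y - a.y);
            if (t < 0.0f || t > 1.0f) continue;
            float x = a.x + t * (b.x - a.x);
            xl = std::min(xl, x);
            xr = std::max(xr, x);
        }
        if (xl > xr) continue;  // scanline misses the triangle
        int xStart = std::max(0, static_cast<int>(std::ceil(xl - 0.5f)));
        int xEnd = std::min(width - 1, static_cast<int>(std::floor(xr - 0.5f)));
        for (int x = xStart; x <= xEnd; ++x) {
            float sx = x + 0.5f;
            float w0 = edge(v[1], v[2], sx, sy) / area;  // weight of v[0]
            float w1 = edge(v[2], v[0], sx, sy) / area;  // weight of v[1]
            float w2 = 1.0f - w0 - w1;                   // weight of v[2]
            float z = w0 * v[0].z + w1 * v[1].z + w2 * v[2].z;
            float& d = depth[static_cast<std::size_t>(y) * width + x];
            if (z < d) { d = z; ++written; }  // smaller z is closer (assumption)
        }
    }
    return written;
}
```

The same barycentric weights can interpolate normals, colors, UVs and world-space positions for each fragment, which is what the later shading stage relies on.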
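A minimal C++ sketch of a Phong evaluation for one fragment is given below. The function name `phong`, the single point light, the white specular color and the coefficient defaults (`ka`, `kd`, `ks`, `shininess`) are illustrative assumptions and are not taken from the project.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 operator+(Vec3 a, Vec3 b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
Vec3 operator-(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
Vec3 operator*(Vec3 a, float s) { return { a.x * s, a.y * s, a.z * s }; }
float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

Vec3 normalize(Vec3 v) {
    float len = std::sqrt(dot(v, v));
    return len > 0.0f ? v * (1.0f / len) : v;  // leave zero vectors untouched
}

// Phong shading for one fragment: ambient + diffuse + specular.
// worldPos is the fragment's interpolated world-space position.
Vec3 phong(Vec3 worldPos, Vec3 normal, Vec3 baseColor,
           Vec3 lightPos, Vec3 eyePos,
           float ka = 0.1f, float kd = 0.7f, float ks = 0.2f,
           float shininess = 32.0f) {
    Vec3 N = normalize(normal);
    Vec3 L = normalize(lightPos - worldPos);  // fragment to light
    Vec3 V = normalize(eyePos - worldPos);    // fragment to viewer
    float nDotL = dot(N, L);
    float diffuse = std::max(nDotL, 0.0f);
    float specular = 0.0f;
    if (nDotL > 0.0f) {
        Vec3 R = N * (2.0f * nDotL) - L;      // L mirrored about N
        specular = std::pow(std::max(dot(R, V), 0.0f), shininess);
    }
    Vec3 c = baseColor * (ka + kd * diffuse) + Vec3{1.0f, 1.0f, 1.0f} * (ks * specular);
    // Clamp to the displayable range.
    return { std::min(c.x, 1.0f), std::min(c.y, 1.0f), std::min(c.z, 1.0f) };
}
```

Because each fragment only reads its own interpolated attributes, the function has no dependencies between fragments and maps naturally to one thread per pixel.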
Render & Display
The render step simply writes the fragment color data to the Pixel Buffer Object (PBO), which is then rendered on the display.
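The write to the PBO might look like the sketch below, assuming fragment colors are stored as floats in [0, 1] and the PBO holds 8-bit RGBA pixels. The names `Color`, `RGBA8`, `toRGBA8` and `render` are hypothetical, and the pixel format is an assumption rather than something stated by the project.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Color { float r, g, b; };
struct RGBA8 { std::uint8_t r, g, b, a; };  // assumed 8-bit RGBA pixel layout

// Clamp a float channel to [0, 1] and round it to an 8-bit value.
static std::uint8_t toByte(float c) {
    float clamped = std::min(std::max(c, 0.0f), 1.0f);
    return static_cast<std::uint8_t>(clamped * 255.0f + 0.5f);
}

RGBA8 toRGBA8(Color c) {
    return { toByte(c.r), toByte(c.g), toByte(c.b), 255 };  // fully opaque
}

// Copy every shaded fragment into the pixel buffer; one GPU thread per pixel.
void render(const std::vector<Color>& fragments, std::vector<RGBA8>& pbo) {
    assert(fragments.size() == pbo.size());
    for (std::size_t i = 0; i < fragments.size(); ++i)
        pbo[i] = toRGBA8(fragments[i]);
}
```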
Texture Mapping with Texel Based Super-Sampled Anti-Aliasing
I implemented planar texture mapping as a part of the rasterization pipeline. The texture (a bitmap file) is read using the EasyBMP library. The UV coordinates are read from the vt values of the OBJ file. The anti-aliasing is a simple super-sampling technique in which each fragment samples its texel value from the texture map and averages it with the colors of the 8 neighboring texels.
Back-face culling is a simple algorithm in which the faces (primitives) not visible to the camera are not processed in the rasterization step. The condition is that if the dot product of the face normal and the direction from the camera to a point on the primitive (generally the centroid) is positive, then the face can be culled. It is best to use an epsilon when comparing these values.
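The 3x3 averaging described above can be sketched in C++ as follows. The `Texture` container and `sampleAveraged` function are hypothetical and do not use the EasyBMP API; the sketch also assumes UVs in [0, 1], no vertical flip, and clamping at the texture borders, none of which is confirmed by the project.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Color { float r, g, b; };

// Hypothetical in-memory texture: row-major texels, width * height entries.
struct Texture {
    int width;
    int height;
    std::vector<Color> texels;
};

static int clampInt(int v, int lo, int hi) {
    return std::min(std::max(v, lo), hi);
}

// Average the texel under (u, v) with its 8 neighbors (a 3x3 box filter).
// Neighbors outside the texture are clamped to the nearest edge texel.
Color sampleAveraged(const Texture& tex, float u, float v) {
    assert(tex.width > 0 && tex.height > 0);
    assert(tex.texels.size() ==
           static_cast<std::size_t>(tex.width) * tex.height);
    int cx = clampInt(static_cast<int>(u * tex.width), 0, tex.width - 1);
    int cy = clampInt(static_cast<int>(v * tex.height), 0, tex.height - 1);
    Color sum{ 0.0f, 0.0f, 0.0f };
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            int x = clampInt(cx + dx, 0, tex.width - 1);
            int y = clampInt(cy + dy, 0, tex.height - 1);
            const Color& t = tex.texels[static_cast<std::size_t>(y) * tex.width + x];
            sum.r += t.r;
            sum.g += t.g;
            sum.b += t.b;
        }
    }
    return { sum.r / 9.0f, sum.g / 9.0f, sum.b / 9.0f };
}
```

Averaging nine texels softens high-frequency detail in the texture at the cost of nine memory reads per fragment instead of one.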
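The culling condition can be written as a small predicate. The C++ sketch below assumes world-space vertices with counter-clockwise winding for front faces and computes the face normal from the edges; `isBackFacing` and the default epsilon are illustrative choices, not the project's.

```cpp
struct Vec3 { float x, y, z; };

Vec3 operator+(Vec3 a, Vec3 b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
Vec3 operator-(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
Vec3 operator*(Vec3 a, float s) { return { a.x * s, a.y * s, a.z * s }; }
float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 cross(Vec3 a, Vec3 b) {
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

// Returns true when the triangle faces away from the camera and can be culled.
// Assumption: front faces are wound counter-clockwise when seen from outside.
bool isBackFacing(const Vec3 v[3], Vec3 eye, float eps = 1e-6f) {
    Vec3 faceNormal = cross(v[1] - v[0], v[2] - v[0]);
    Vec3 centroid = (v[0] + v[1] + v[2]) * (1.0f / 3.0f);
    Vec3 viewDir = centroid - eye;  // from the camera to the primitive
    // A positive dot product means the normal points away from the camera.
    // The epsilon keeps nearly edge-on faces from flickering in and out.
    return dot(faceNormal, viewDir) > eps;
}
```

For a triangle in the XY plane whose normal points along +Z, a camera on the +Z side sees the front face, while a camera on the -Z side would have it culled.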
Interaction with Camera and Mesh
The camera can be controlled interactively with the mouse. The mesh can be translated, rotated and scaled using the keyboard. One can also switch between the different shading models using the keyboard.