CUDA Rasterizer

A rasterization pipeline implemented on the GPU using CUDA

CUDA Parallel Rasterizer is a GPU-based parallel rasterization pipeline. The goal of this project is to implement a rasterization mechanism that takes a mesh from a Wavefront OBJ file and displays the object as a raster image. The rasterization pipeline is implemented completely from scratch; OpenGL is used only to display the final Pixel Buffer Object (PBO), and no OpenGL APIs are used in the rasterization pipeline itself.
The parallelization pays off in performance: a Stanford Dragon with over 50,000 primitives runs at 45 fps, and a Utah Teapot with over 6,000 primitives and a 1600x1600 texture map also runs at 45 fps. These benchmarks include the Phong shading model, back-face culling, interactive camera and mesh controls, and super-sampled anti-aliasing of the texture map.

Find my GitHub repo at: CUDA Parallel Rasterizer GitHub


Rasterization Pipeline

This page describes the algorithms and techniques used in the project and the rendered results.

Pre-Rasterization Pipeline Algorithms & Vertex Assembly
An OBJ loader is used to read in the mesh data.
The buffer objects (VBO, NBO, CBO, IBO, etc.) are created as flat float/int arrays. The IBO is used to map the per-vertex data to the primitives.
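
As a rough sketch (not the project's exact code), the buffers might be allocated and copied to the device like this; the array and function names here are illustrative:

```cuda
#include <cuda_runtime.h>

// Hypothetical device-side buffers: flat arrays, 3 floats per vertex attribute,
// 3 indices per triangle.
float* dev_vbo;  // vertex positions
float* dev_nbo;  // vertex normals
float* dev_cbo;  // vertex colors
int*   dev_ibo;  // triangle indices

void allocateBuffers(const float* vbo, const float* nbo, const float* cbo,
                     const int* ibo, int vertexCount, int indexCount) {
    size_t vSize = vertexCount * 3 * sizeof(float);
    cudaMalloc((void**)&dev_vbo, vSize);
    cudaMalloc((void**)&dev_nbo, vSize);
    cudaMalloc((void**)&dev_cbo, vSize);
    cudaMalloc((void**)&dev_ibo, indexCount * sizeof(int));

    // Copy the host-side arrays produced by the OBJ loader onto the device.
    cudaMemcpy(dev_vbo, vbo, vSize, cudaMemcpyHostToDevice);
    cudaMemcpy(dev_nbo, nbo, vSize, cudaMemcpyHostToDevice);
    cudaMemcpy(dev_cbo, cbo, vSize, cudaMemcpyHostToDevice);
    cudaMemcpy(dev_ibo, ibo, indexCount * sizeof(int), cudaMemcpyHostToDevice);
}
```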

Vertex Shader
In the vertex shader, each vertex is multiplied by the model-view-projection matrix to transform it from world space into screen space.
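
A minimal sketch of what such a vertex-shader kernel could look like, assuming the MVP matrix is passed as 16 column-major floats and that the transform is followed by a perspective divide and viewport mapping (the project may handle these details differently):

```cuda
// One thread per vertex: transform by MVP, divide by w, map NDC to pixels.
__global__ void vertexShaderKernel(const float* vboIn, float* vboOut,
                                   int vertexCount, const float* mvp,
                                   int width, int height) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= vertexCount) return;

    float x = vboIn[3 * i], y = vboIn[3 * i + 1], z = vboIn[3 * i + 2];

    // Multiply by the column-major MVP matrix (homogeneous coordinates, w = 1).
    float cx = mvp[0] * x + mvp[4] * y + mvp[8]  * z + mvp[12];
    float cy = mvp[1] * x + mvp[5] * y + mvp[9]  * z + mvp[13];
    float cz = mvp[2] * x + mvp[6] * y + mvp[10] * z + mvp[14];
    float cw = mvp[3] * x + mvp[7] * y + mvp[11] * z + mvp[15];

    // Perspective divide to NDC, then viewport transform to pixel coordinates.
    float ndcX = cx / cw, ndcY = cy / cw, ndcZ = cz / cw;
    vboOut[3 * i]     = (ndcX * 0.5f + 0.5f) * width;
    vboOut[3 * i + 1] = (1.0f - (ndcY * 0.5f + 0.5f)) * height;
    vboOut[3 * i + 2] = ndcZ;  // kept for the depth test
}
```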

Primitive Assembly
In the primitive assembly stage of the pipeline, the IBO is used to map vertices, normals, colors (and texture coordinates) to their respective triangles. Triangles are implemented as structs whose members are 3 vertices, a normal per vertex, a color per vertex, and a UV coordinate per vertex. The implementation also requires storing the original (world-space) vertex coordinates in the primitives.
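
A hypothetical sketch of the triangle struct and the assembly kernel; the field and buffer names are assumptions, not the project's actual identifiers:

```cuda
struct Triangle {
    float3 p0, p1, p2;       // screen-space positions
    float3 wp0, wp1, wp2;    // original world-space positions (kept for shading)
    float3 n0, n1, n2;       // per-vertex normals
    float3 c0, c1, c2;       // per-vertex colors
    float2 uv0, uv1, uv2;    // per-vertex texture coordinates
};

__device__ float3 load3(const float* buf, int i) {
    return make_float3(buf[3 * i], buf[3 * i + 1], buf[3 * i + 2]);
}
__device__ float2 load2(const float* buf, int i) {
    return make_float2(buf[2 * i], buf[2 * i + 1]);
}

// One thread per triangle: gather per-vertex data through the IBO.
__global__ void primitiveAssemblyKernel(const float* vbo, const float* worldVbo,
                                        const float* nbo, const float* cbo,
                                        const float* tbo, const int* ibo,
                                        Triangle* primitives, int triangleCount) {
    int t = blockIdx.x * blockDim.x + threadIdx.x;
    if (t >= triangleCount) return;

    int i0 = ibo[3 * t], i1 = ibo[3 * t + 1], i2 = ibo[3 * t + 2];

    Triangle tri;
    tri.p0  = load3(vbo, i0);      tri.p1  = load3(vbo, i1);      tri.p2  = load3(vbo, i2);
    tri.wp0 = load3(worldVbo, i0); tri.wp1 = load3(worldVbo, i1); tri.wp2 = load3(worldVbo, i2);
    tri.n0  = load3(nbo, i0);      tri.n1  = load3(nbo, i1);      tri.n2  = load3(nbo, i2);
    tri.c0  = load3(cbo, i0);      tri.c1  = load3(cbo, i1);      tri.c2  = load3(cbo, i2);
    tri.uv0 = load2(tbo, i0);      tri.uv1 = load2(tbo, i1);      tri.uv2 = load2(tbo, i2);
    primitives[t] = tri;
}
```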

Rasterization
The rasterization stage of the pipeline is by far the most important and complex stage. I implemented a scanline algorithm to rasterize the mesh. The work is parallelized over primitives, with each primitive handled by its own thread of the rasterization kernel (a condensed sketch follows the list below). The algorithm is as follows:

  • For each primitive, find the maximum and minimum screen-space coordinates of the triangle.
  • Iterate over the scanlines between the minimum and maximum y-coordinates.
  • Find the intersections of the scanline with the edges of the triangle. In most cases there are 2 intersection points, except when only 1 vertex lies on the scanline.
  • Iterate over the x-coordinates between the 2 intersection points.
  • The x- and y-coordinates give the pixel coordinates on the screen.
  • Use barycentric coordinates to interpolate the fragment values such as position, normal, and color (or texture coordinates).
  • Perform the depth test by finding the fragment's distance from the camera; if it is closer than the value currently in the depth buffer, replace the stored depth and fragment (use atomics to avoid races).
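
The sketch below condenses this loop into one kernel. For brevity it walks the triangle's bounding box with a barycentric inside test on each scanline instead of computing explicit edge intersections, it interpolates screen-space depth rather than the world-space distance described above, and it encodes depth as an integer so atomicMin can serve as the depth test. Struct layout and names are illustrative (Triangle is the struct from the primitive-assembly sketch above).

```cuda
struct Fragment {
    float3 position;   // interpolated world-space position
    float3 normal;     // interpolated normal
    float3 color;      // interpolated color (or texture sample)
};

// Twice the signed area of triangle (a, b, c) in screen space.
__device__ float edgeFunction(float2 a, float2 b, float2 c) {
    return (c.x - a.x) * (b.y - a.y) - (c.y - a.y) * (b.x - a.x);
}

// Barycentric blend of three float3 attributes.
__device__ float3 bary(float w0, float w1, float w2, float3 a, float3 b, float3 c) {
    return make_float3(w0 * a.x + w1 * b.x + w2 * c.x,
                       w0 * a.y + w1 * b.y + w2 * c.y,
                       w0 * a.z + w1 * b.z + w2 * c.z);
}

// One thread per primitive. depthBuffer is assumed to be reset to INT_MAX each frame.
__global__ void rasterizeKernel(const Triangle* primitives, int triangleCount,
                                Fragment* fragments, int* depthBuffer,
                                int width, int height) {
    int t = blockIdx.x * blockDim.x + threadIdx.x;
    if (t >= triangleCount) return;
    Triangle tri = primitives[t];

    float2 a = make_float2(tri.p0.x, tri.p0.y);
    float2 b = make_float2(tri.p1.x, tri.p1.y);
    float2 c = make_float2(tri.p2.x, tri.p2.y);

    // Bounding box of the triangle, clamped to the screen.
    int minY = max(0, (int)floorf(fminf(a.y, fminf(b.y, c.y))));
    int maxY = min(height - 1, (int)ceilf(fmaxf(a.y, fmaxf(b.y, c.y))));
    int minX = max(0, (int)floorf(fminf(a.x, fminf(b.x, c.x))));
    int maxX = min(width - 1, (int)ceilf(fmaxf(a.x, fmaxf(b.x, c.x))));

    float area = edgeFunction(a, b, c);
    if (fabsf(area) < 1e-8f) return;   // skip degenerate triangles

    for (int y = minY; y <= maxY; ++y) {          // each scanline
        for (int x = minX; x <= maxX; ++x) {      // pixels on the scanline
            float2 p = make_float2(x + 0.5f, y + 0.5f);
            // Barycentric coordinates of the pixel center.
            float w0 = edgeFunction(b, c, p) / area;
            float w1 = edgeFunction(c, a, p) / area;
            float w2 = edgeFunction(a, b, p) / area;
            if (w0 < 0.f || w1 < 0.f || w2 < 0.f) continue;  // outside the triangle

            float depth = w0 * tri.p0.z + w1 * tri.p1.z + w2 * tri.p2.z;
            int idx = y * width + x;
            int scaled = (int)(depth * 1e6f);

            // Atomic depth test: keep the fragment closest to the camera.
            // (The payload write below is not fully race-free; a production
            // version would pack depth and payload or use a per-pixel lock.)
            int old = atomicMin(&depthBuffer[idx], scaled);
            if (scaled < old) {
                Fragment f;
                f.position = bary(w0, w1, w2, tri.wp0, tri.wp1, tri.wp2);
                f.normal   = bary(w0, w1, w2, tri.n0,  tri.n1,  tri.n2);
                f.color    = bary(w0, w1, w2, tri.c0,  tri.c1,  tri.c2);
                fragments[idx] = f;
            }
        }
    }
}
```
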
Fragment Shader
In the fragment shader I implemented a simple Phong shading model. This stage parallelizes over fragments / pixels and uses the world-space position of the fragment computed in the rasterization step of the pipeline.
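
A minimal sketch of such a per-pixel Phong kernel, assuming the Fragment layout from the rasterization sketch and an illustrative point light, eye position, and shininess:

```cuda
__device__ float3 sub3(float3 a, float3 b) { return make_float3(a.x - b.x, a.y - b.y, a.z - b.z); }
__device__ float  dot3(float3 a, float3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
__device__ float3 scale3(float3 v, float s) { return make_float3(v.x * s, v.y * s, v.z * s); }
__device__ float3 norm3(float3 v) { float l = sqrtf(dot3(v, v)); return scale3(v, 1.0f / l); }

// One thread per pixel: shade the fragment in place using a Phong model.
__global__ void fragmentShaderKernel(Fragment* fragments, int width, int height,
                                     float3 lightPos, float3 eyePos) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;
    int idx = y * width + x;

    Fragment f = fragments[idx];
    float3 N = norm3(f.normal);
    float3 L = norm3(sub3(lightPos, f.position));  // world-space position from rasterization
    float3 V = norm3(sub3(eyePos, f.position));
    float3 R = norm3(sub3(scale3(N, 2.0f * dot3(N, L)), L));  // reflection of L about N

    float ambient  = 0.1f;                                     // assumed material terms
    float diffuse  = fmaxf(dot3(N, L), 0.0f);
    float specular = powf(fmaxf(dot3(R, V), 0.0f), 32.0f);     // assumed shininess

    float3 c = f.color;
    fragments[idx].color = make_float3(
        fminf(c.x * (ambient + diffuse) + specular, 1.0f),
        fminf(c.y * (ambient + diffuse) + specular, 1.0f),
        fminf(c.z * (ambient + diffuse) + specular, 1.0f));
}
```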

Render & Display
The render step simply writes the fragment color data to the Pixel Buffer Object (PBO), which is then shown on the display.
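
A sketch of that final write, assuming the PBO is mapped as an array of uchar4 (8-bit RGBA) pixels:

```cuda
// One thread per pixel: convert the shaded float color to 8-bit RGBA in the PBO.
__global__ void renderToPBOKernel(const Fragment* fragments, uchar4* pbo,
                                  int width, int height) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;
    int idx = y * width + x;

    float3 c = fragments[idx].color;
    pbo[idx] = make_uchar4((unsigned char)(fminf(fmaxf(c.x, 0.0f), 1.0f) * 255.0f),
                           (unsigned char)(fminf(fmaxf(c.y, 0.0f), 1.0f) * 255.0f),
                           (unsigned char)(fminf(fmaxf(c.z, 0.0f), 1.0f) * 255.0f),
                           255);
}
```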

Extra Features

Texture Mapping with Texel-Based Super-Sampled Anti-Aliasing
I implemented planar texture mapping as part of the rasterization pipeline. The texture (a bitmap file) is read using the EasyBMP library, and the UV coordinates are read from the vt values of the OBJ file. The anti-aliasing is a simple super-sampling technique in which each fragment samples the texel value from the texture map and also samples and averages the colors of its 8 neighboring texels.
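
A sketch of the 3x3 texel averaging, assuming the texture is stored as a flat array of float3 RGB values (names and layout are illustrative):

```cuda
// Sample the texel at (u, v) plus its 8 neighbors and return their average.
__device__ float3 sampleTextureSSAA(const float3* texture, int texWidth, int texHeight,
                                    float u, float v) {
    // Map UV in [0, 1] to texel coordinates.
    int tx = (int)(u * (texWidth - 1));
    int ty = (int)(v * (texHeight - 1));

    float3 sum = make_float3(0.f, 0.f, 0.f);
    int count = 0;
    for (int dy = -1; dy <= 1; ++dy) {        // 3x3 neighborhood around the texel
        for (int dx = -1; dx <= 1; ++dx) {
            int sx = min(max(tx + dx, 0), texWidth - 1);   // clamp at the borders
            int sy = min(max(ty + dy, 0), texHeight - 1);
            float3 t = texture[sy * texWidth + sx];
            sum.x += t.x; sum.y += t.y; sum.z += t.z;
            ++count;
        }
    }
    return make_float3(sum.x / count, sum.y / count, sum.z / count);
}
```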

Back-Face Culling
A simple optimization in which faces (primitives) not visible to the camera are skipped in the rasterization step. The condition is that if the dot product of the face normal and the direction from the camera to a point on the primitive (typically its centroid) is positive, the face can be culled. It is best to use an epsilon when checking the value.
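
A sketch of the culling test itself, with hypothetical parameter names:

```cuda
// Returns true when the face points away from the camera and can be culled.
__device__ bool isBackFacing(float3 faceNormal, float3 faceCentroid, float3 eyePos) {
    // Direction from the camera to a point on the primitive (its centroid).
    float3 viewDir = make_float3(faceCentroid.x - eyePos.x,
                                 faceCentroid.y - eyePos.y,
                                 faceCentroid.z - eyePos.z);
    float d = faceNormal.x * viewDir.x + faceNormal.y * viewDir.y + faceNormal.z * viewDir.z;
    return d > 1e-6f;   // epsilon guards against numerical noise on silhouette edges
}
```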

Interaction with Camera and Mesh
The camera is controlled interactively with the mouse. The mesh can be translated, rotated, and scaled using the keyboard, which can also switch between the different shading models.

Rasterizer Screen Capture Video
