Slide 20 of 28
dchen1

What is so different about the calculations in ray tracing such that regular GPUs can't handle them well? I would assume that this kind of very parallel computation with mostly vectors (I think) would be well suited to traditional GPUs.

emmaloool

@dchen1 yeah, GPUs are well-suited to parallel computation, and they usually do this through SIMD (single instruction, multiple data) execution, which is what you're thinking about when it comes to vectors. Work is executed as shaders and dispatched in groups of threads that run simultaneously.
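To make the SIMD point a bit more concrete, here's a toy Python model (not real GPU code) of threads running in lockstep groups. The assumption is that each ray needs some number of traversal steps and that a group only finishes when its slowest lane does. The function names and the step counts are made up for illustration.

```python
def lockstep_steps(work_per_lane, width):
    """Total steps issued when lanes run in lockstep groups of `width`."""
    steps = 0
    for start in range(0, len(work_per_lane), width):
        group = work_per_lane[start:start + width]
        steps += max(group)  # the whole group waits for its slowest lane
    return steps


def utilization(work_per_lane, width):
    """Fraction of issued lane-steps that did useful work."""
    useful = sum(work_per_lane)
    issued = lockstep_steps(work_per_lane, width) * width
    return useful / issued


# Hypothetical per-ray step counts: same total work, different spread.
uniform = [8, 8, 8, 8, 8, 8, 8, 8]
divergent = [1, 15, 2, 14, 3, 13, 4, 12]

print(utilization(uniform, 4))              # 1.0
print(round(utilization(divergent, 4), 2))  # 0.57
```

Both lists contain the same total amount of work, but in this model the divergent one keeps the lanes busy only a bit more than half the time. Rays that bounce around a scene tend to look more like the second list than the first.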

Since GPUs shifted from being solely fixed-function graphics processors to units that can perform general-purpose computation (GPGPU), researchers and companies have been interested in improving performance by optimizing both compute and specialized graphics functionality. So in recent years there's been a push to figure out how to accelerate extremely intensive applications like ray tracing, even on parallel platforms like the GPU. Specialized applications have factors that can be improved for their particular case: like Keenan mentioned, cache coherence (and cache utilization in general) plays a role, and we try to optimize this with specialized batch scheduling. This is why accelerators and dedicated pipelines get added to the hardware (like in this example, the RTX pipeline on NVIDIA GPUs).
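As a rough sketch of what batching for coherence can mean, here's one simple heuristic in Python: group rays by the sign pattern (octant) of their direction, so rays in the same batch are more likely to walk similar parts of the scene and touch similar memory. This is an assumption for illustration only, not a description of how the RTX pipeline schedules work. The names `octant` and `batch_by_octant` are hypothetical.

```python
from collections import defaultdict


def octant(direction):
    """Map a direction to 0..7 using the signs of its x, y, z components."""
    dx, dy, dz = direction
    return int(dx < 0) | (int(dy < 0) << 1) | (int(dz < 0) << 2)


def batch_by_octant(rays):
    """Group (origin, direction) pairs so each batch shares an octant."""
    batches = defaultdict(list)
    for origin, direction in rays:
        batches[octant(direction)].append((origin, direction))
    return dict(batches)


# Hypothetical rays: (origin, direction)
rays = [
    ((0, 0, 0), (1, 1, 1)),
    ((0, 0, 0), (-1, 1, 1)),
    ((0, 0, 0), (1, 2, 3)),
    ((0, 0, 0), (-2, 1, 5)),
]

batches = batch_by_octant(rays)
for key in sorted(batches):
    print(key, len(batches[key]))
```

Real renderers use more elaborate sorting keys (origin cells, material IDs, and so on), but the idea is the same: put similar work next to each other before dispatching it.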

In the end it comes down to engineers weighing different design decisions to modify the architecture. I found a list here for NVIDIA's RTX support, which you can read to see why they decided to support RTX: https://developer.nvidia.com/blog/introduction-nvidia-rtx-directx-ray-tracing/

keenan

@dchen1 You can often make things faster by specializing the hardware for a certain task. Yes, you can write ray tracing code in a general-purpose GPU language like CUDA or OpenCL (and in fact there's a path in NVIDIA's OptiX renderer that does just that). Doing so will give you a very fast ray tracer. Actually baking key operations into the hardware will make it even faster. :-)
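For a sense of what a "key operation" looks like, here's a minimal Python sketch of a ray-triangle intersection test using the standard Möller-Trumbore algorithm. NVIDIA's public descriptions of RT cores mention BVH traversal and ray-triangle intersection as the accelerated pieces; the code below is just the textbook software version, not what the hardware actually implements. The helper names are illustrative.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Return the hit distance t along the ray, or None if there is no hit."""
    e1 = sub(v1, v0)
    e2 = sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:          # ray is parallel to the triangle's plane
        return None
    inv_det = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv_det     # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv_det
    return t if t > eps else None    # ignore hits behind the ray origin


tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
print(ray_triangle((0.25, 0.25, 1.0), (0.0, 0.0, -1.0), *tri))  # 1.0
print(ray_triangle((2.0, 2.0, 1.0), (0.0, 0.0, -1.0), *tri))    # None
```

A ray tracer runs a test like this an enormous number of times per frame, which is why it's a natural candidate for dedicated hardware.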