RT cores are different because raytracing wants AoS (array of structures) rather than SoA (structure of arrays).

Let's look at it from the ALU perspective. Normal shader cores are essentially SoA: every ALU instruction operates on 32 (NVIDIA) or 64 (AMD) threads at a time.

Implementing a ray-box intersection requires 6 multiply-adds to determine the intersection time of the ray with each box plane, plus a bunch of comparisons to determine whether and when you hit the box. A node in a standard (binary) BVH holds two child boxes, so one step of the walk needs 12 multiply-adds plus x comparison instructions, i.e. 12+x ALU instructions (roughly equal to cycles) per wave / warp.
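To make the instruction count concrete, here is a rough sketch of the usual slab test in Python. The function name and argument layout are made up for illustration, and it assumes no direction component is exactly zero. Each axis costs two multiply-adds (six per box), and everything after that is min / max comparisons.

```python
def ray_box_hit(origin, direction, box_min, box_max):
    """Slab test. Returns the entry time t (>= 0) on a hit, else None.

    Assumes every component of `direction` is non-zero.
    A hit may return 0.0, so callers should test `is not None`.
    """
    t_near, t_far = 0.0, float("inf")
    for axis in range(3):
        # Per-ray setup: computed once and reused for every box the ray visits.
        inv_d = 1.0 / direction[axis]
        bias = -origin[axis] * inv_d
        # The two multiply-adds for this axis (six in total per box).
        t0 = box_min[axis] * inv_d + bias
        t1 = box_max[axis] * inv_d + bias
        # The comparison part: shrink the [t_near, t_far] overlap interval.
        t_near = max(t_near, min(t0, t1))
        t_far = min(t_far, max(t0, t1))
    return t_near if t_near <= t_far else None
```

Calling this once per child box gives the 12 multiply-adds per binary BVH node mentioned above.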

The picture is still fairly rosy when you start out your BVH walk, but then you get ray divergence. Some of your rays may finish early, and some may want to do ray-triangle intersection instead. This means that only some of the SIMD lanes are active and your ALU utilization drops. You're spending the same number of cycles, but getting much less bang for the buck.

In a dedicated RT core, you can operate one ray at a time instead of one instruction at a time. So you can do all multiply-adds for intersecting a single ray with both boxes in your BVH node in a single cycle, and then follow up with the comparisons in the remainder of your pipeline.

The upshot is that when rays diverge, you can still fully utilize the ALUs in your ray-box intersection pipeline -- it simply takes you fewer cycles to process all rays in a warp.
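A toy cost model of that difference, as a sketch only. It assumes the SIMD core issues 12 instructions per BVH step (ignoring the x comparisons), that the dedicated pipeline accepts one ray per cycle, and that pipeline latency can be ignored. The function names are hypothetical and none of the numbers are measured from real hardware.

```python
def simd_step(active_lanes, width=32, instructions=12):
    """One BVH step on a SIMD shader core: (cycles, ALU utilization)."""
    # Every instruction is issued for the whole warp, active lanes or not.
    cycles = instructions
    utilization = active_lanes / width
    return cycles, utilization


def ray_pipeline_step(active_lanes):
    """One BVH step on a one-ray-per-cycle pipeline: (cycles, ALU utilization)."""
    # Only rays that are still walking the BVH are fed in, so the cost
    # shrinks with divergence while the pipeline itself stays busy.
    cycles = active_lanes
    utilization = 1.0 if active_lanes else 0.0
    return cycles, utilization
```

With 8 of 32 rays still active, `simd_step(8)` gives 12 cycles at 25% utilization, while `ray_pipeline_step(8)` gives 8 cycles with the pipeline fully used. The model only counts cycles and says nothing about how many ALUs each design spends to get there.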

A similar argument applies to the memory system as well -- due to ray divergence, you obviously want to store your BVH nodes as AoS. A BVH node requires 12 floats to store the dimensions of two boxes, plus some space for child node links, which makes 64 bytes a natural node structure size, and you want to keep it contiguously in memory so that loading one node means loading (part of) one cacheline. But this makes it difficult to get the data through a normal shader core's load unit, which is optimized for SoA.
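The 64-byte arithmetic can be checked with a small sketch using Python's `struct` module. The field order and the four 32-bit link / padding slots are assumptions for illustration. The only things taken from the above are 12 floats for the two boxes plus some space for child links.

```python
import struct

# Hypothetical layout: 2 boxes x (min xyz, max xyz) = 12 float32 (48 bytes),
# followed by 4 x uint32 for child links and padding (16 bytes).
NODE_FORMAT = "<12f4I"
NODE_SIZE = struct.calcsize(NODE_FORMAT)  # 48 + 16 = 64 bytes


def pack_node(left_box, right_box, left_child, right_child):
    """Pack one node; each box is 6 floats: min xyz, then max xyz."""
    return struct.pack(NODE_FORMAT, *left_box, *right_box,
                       left_child, right_child, 0, 0)


def load_node(buffer, index):
    """AoS fetch: one contiguous 64-byte read yields everything for one node."""
    fields = struct.unpack_from(NODE_FORMAT, buffer, index * NODE_SIZE)
    return fields[0:6], fields[6:12], fields[12], fields[13]
```

In an SoA layout the same fetch would touch 14 separate arrays (12 bounds plus 2 links), which only pays off when neighbouring threads want neighbouring nodes. Divergent rays don't.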



