Basically, how would one use a GPU to compute something without CUDA/OpenCL? This is mostly curiosity, but say I wanted to use a PS2's graphics chip to perform a calculation. How was/is that actually done?
The PS2 predates the GPGPU age and its GPU is fixed-function, so it's not suitable for general calculations. Your problem would have to align perfectly with the computations done for regular rendering.
If we instead look at something more modern, like an early GeForce with programmable shaders, the typical approach was to set up a regular render of a full-screen quad and write a pixel shader that performs your desired computation instead of actual lighting calculations.
APIs such as CUDA and OpenCL were introduced to allow people who weren't making games to run calculations on the GPU without pretending that they were colour calculations.
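To make that concrete, here's a minimal sketch of the old fullscreen-quad trick, written in C against early-2000s-style OpenGL (GLSL fragment shader, fixed-function vertex path). It assumes a GL context and two input textures already exist; the shader, function and variable names are purely illustrative, and error checking is omitted:

```c
/* Sketch: "GPGPU" by drawing a fullscreen quad with a fragment shader.
   Assumes a current GL 2.x context; error checking omitted for brevity. */
#include <GL/gl.h>

static const char *frag_src =
    "uniform sampler2D a, b;                            \n"
    "void main() {                                      \n"
    "    vec4 x = texture2D(a, gl_TexCoord[0].st);      \n"
    "    vec4 y = texture2D(b, gl_TexCoord[0].st);      \n"
    "    gl_FragColor = x + y;  /* the 'computation' */ \n"
    "}                                                  \n";

void run_gpgpu_pass(GLuint tex_a, GLuint tex_b, int w, int h, float *out)
{
    /* Compile a fragment shader that does arithmetic instead of lighting. */
    GLuint fs = glCreateShader(GL_FRAGMENT_SHADER);
    glShaderSource(fs, 1, &frag_src, NULL);
    glCompileShader(fs);
    GLuint prog = glCreateProgram();
    glAttachShader(prog, fs);
    glLinkProgram(prog);
    glUseProgram(prog);
    glUniform1i(glGetUniformLocation(prog, "a"), 0);
    glUniform1i(glGetUniformLocation(prog, "b"), 1);

    /* Bind the input data as textures. */
    glActiveTexture(GL_TEXTURE0); glBindTexture(GL_TEXTURE_2D, tex_a);
    glActiveTexture(GL_TEXTURE1); glBindTexture(GL_TEXTURE_2D, tex_b);

    /* Draw one quad covering the viewport: one fragment per output element. */
    glBegin(GL_QUADS);
    glTexCoord2f(0, 0); glVertex2f(-1, -1);
    glTexCoord2f(1, 0); glVertex2f( 1, -1);
    glTexCoord2f(1, 1); glVertex2f( 1,  1);
    glTexCoord2f(0, 1); glVertex2f(-1,  1);
    glEnd();

    /* Read the "colours" back: they are really your results. */
    glReadPixels(0, 0, w, h, GL_RGBA, GL_FLOAT, out);
}
```

In practice you'd also render into a floating-point framebuffer object rather than the default framebuffer, but the shape of the trick is the same.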
The PS2 was actually kind of ahead of its time in this regard, with programmable vector units right in the middle of the rendering pipeline; you could write the equivalent of vertex shaders and even simple geometry shaders *long* before those were available on standard home GPUs (although you had to code them in assembly).
I knew there were programmable vector units and fixed functionality for fragments. Seeing the other responses here, I feel I underestimated how flexible it was, not having worked with it myself.
TL;DR yes and no (thanks to DSPs, technically some were an integral part of the GPU):
In the era of semi-programmable graphics pipelines (the N64, PS2 and GameCube), you would not be able to solve general-purpose problems using only that pipeline: given a handful of colour-mixing formulas, registers, textures and vertices to control which pixels get written back to memory, I fail to come up with any tangible problem you could hack them into solving.
However, what the 3D generation of video game consoles had to bundle with the hardware, in order to process complex scenes at interactive frame rates, were digital signal processors (DSPs). Think of them as regular processors with an extra vector unit, allowing you to crunch enormous amounts of data in parallel (single instruction, multiple data, or SIMD).
This was essential to perform geometry transformations, which benefit greatly from parallelism, in order to build graphics commands to pass down to the rasteriser.
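The consoles' DSPs had their own instruction sets, but the core idea maps directly onto any SIMD hardware. As a rough modern analogue (not console code), here's a C sketch using x86 SSE intrinsics to push an array of vertices through a 4x4 matrix, operating on four lanes at a time:

```c
/* Sketch: SIMD vertex transform, the bread-and-butter job of a console DSP.
   x86/SSE is used here purely as a stand-in for the consoles' own vector units. */
#include <xmmintrin.h>

/* Transform `count` xyzw vertices (packed floats) by a column-major 4x4 matrix. */
void transform_vertices(const float m[16], const float *in, float *out, int count)
{
    /* Load the four matrix columns once. */
    __m128 c0 = _mm_loadu_ps(m + 0);
    __m128 c1 = _mm_loadu_ps(m + 4);
    __m128 c2 = _mm_loadu_ps(m + 8);
    __m128 c3 = _mm_loadu_ps(m + 12);

    for (int i = 0; i < count; ++i) {
        const float *v = in + 4 * i;
        /* out = c0*x + c1*y + c2*z + c3*w, each product computed on 4 lanes at once. */
        __m128 r = _mm_mul_ps(c0, _mm_set1_ps(v[0]));
        r = _mm_add_ps(r, _mm_mul_ps(c1, _mm_set1_ps(v[1])));
        r = _mm_add_ps(r, _mm_mul_ps(c2, _mm_set1_ps(v[2])));
        r = _mm_add_ps(r, _mm_mul_ps(c3, _mm_set1_ps(v[3])));
        _mm_storeu_ps(out + 4 * i, r);
    }
}
```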
I am unsure if the SEGA Saturn's and PS1's vector-accelerated instructions are sufficiently general-purpose to let you perform any computation you wish, but at least starting with the Nintendo 64, you could write microcode to accelerate any task you'd like, provided you were brave enough to deal with the harsh memory constraints of microcode and obscure documentation.
To give concrete examples, the PS1 came with a dedicated chip (the MDEC, or motion decoder) to decode MPEG video frames from the CD directly into the VRAM to display on screen.
While the N64 doesn't have dedicated video decoding chips, its signal processor was designed in such a way to allow microcode to do an equivalent job (not to mention YUV texture decoding on the rasteriser's side).
So this did not stop a talented Resident Evil 2 developer, fresh out of school, from writing an MPEG video decoder in microcode to cram two CDs' worth of video data (around 1 gigabyte) onto a 64-megabyte cartridge, of course alongside other smart decisions like data deduplication and further compression/interlacing.
Another example is Rare's developers writing an MP3 audio decoder in microcode to store large amounts of dialogue.
And finally, in recent times, a developer managed to write an H.264 video decoder in microcode for a machine that was released almost a decade before the codec's birth.
You can also accelerate physics and anything else that benefits from SIMD, really. In fact, while more modern CPUs integrate SIMD instructions that compilers can take advantage of, the PS3's Cell processor was a brief throwback to the old hardcore ways, before GPGPU became king.
You could almost treat the DSPs as the compute units of modern GPU architectures: they definitely processed the vertices, and nothing stops you from adding a "vertex shader" pass. However, because the DSP isn't directly integrated into the graphics pipeline, it's harder to emulate a true pixel shader; you might be limited to full-screen pixel shading, since there's no feedback from the rasteriser about exactly which pixels get written to memory.
> You could almost treat the DSPs as the compute units of modern GPU architectures: they definitely processed the vertices, and nothing stops you from adding a "vertex shader" pass. However, because the DSP isn't directly integrated into the graphics pipeline, it's harder to emulate a true pixel shader; you might be limited to full-screen pixel shading, since there's no feedback from the rasteriser about exactly which pixels get written to memory.
Actually, you could probably just use the depth buffer as write-only with constant values to identify objects as different kinds, and use a "pixel shader" pass to apply effects on those annotated pixels, but I assume this might be a pretty expensive technique.
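In spirit, something like this, sketched as a plain C loop just to illustrate the idea; the buffer layout, the ID value and the "tint" effect are all made up for the example and this says nothing about how fast it would actually run on the real hardware:

```c
/* Sketch: a "pixel shader" pass over an annotated framebuffer.
   The geometry pass wrote a constant object ID per surface into an auxiliary
   buffer (the depth buffer repurposed as write-only); this pass then applies a
   per-object effect. Purely illustrative data layout. */
#include <stdint.h>

#define ID_WATER 3  /* hypothetical object class written during the geometry pass */

void shade_annotated_pixels(uint32_t *color, const uint8_t *object_id,
                            int width, int height)
{
    for (int i = 0; i < width * height; ++i) {
        if (object_id[i] != ID_WATER)
            continue;                       /* untagged pixels pass through */

        /* Cheap "effect": halve the red/green channels to tint water pixels blue. */
        uint32_t c = color[i];
        uint32_t r = ((c >> 16) & 0xFF) / 2;
        uint32_t g = ((c >> 8) & 0xFF) / 2;
        uint32_t b = c & 0xFF;
        color[i] = (c & 0xFF000000u) | (r << 16) | (g << 8) | b;
    }
}
```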
... Now that I think about it, it's not unlike modern rendering engines.
Computer graphics has a very interesting history, and like many things, there's a lot to learn from studying it. Surprisingly, many techniques and principles are still relevant to this day.
Thankfully, the Internet is full of documentation, post-mortem accounts and reference implementations to learn more about all this.
Here's a bunch of random noteworthy things and references I forgot to link (sorry for potentially digging rabbit holes):
The best information tends to come from official documentation detailing almost everything you want to know about the inner workings of systems. Additionally, unofficial community documentation can also be of great quality and complement official sources.
The architecture overview posts from Rodrigo Copetti (https://www.copetti.org/writings/consoles/) pack a lot of accurate information at a glance, and are a great starting point if a specific topic piques your interest.
Emulators' progress reports can reveal a lot about the details and intricacies required for accurately replicating these systems' features: https://dolphin-emu.org/blog/ is one such amazing source of information, but all other major emulation efforts have equally interesting content.
Transparency sorting of surfaces (https://en.wikipedia.org/wiki/Order-independent_transparency) is a hard problem to solve for traditional scanline renderers (most PC/console GPUs), to the point that even today's releases can ship with somewhat obvious rendering errors.
On the other hand, tiled renderers (used in the Dreamcast, mobile hardware, and Apple silicon) are able to solve this problem by their very nature, albeit trading it for another set of drawbacks; the completely different hardware approach is a nice read (https://en.wikipedia.org/wiki/Tiled_rendering).
Someone once shared a video demonstrating a 3D package from ancient Lisp machines (https://youtu.be/gV5obrYaogU?t=30), and it was almost shocking to see how many things were familiar and done right from a modern perspective.
Reimplementing new techniques on old hardware is also worth a look: for example, someone implemented real-time shadows and toon shading on the N64 (https://youtu.be/VqDAxcWnq3g).
For fun, you can grab RenderDoc (https://renderdoc.org/) and copies of your favourite games to analyse their frames (even via emulation): see how developers implement or fake visual effects and generally how these games render their world.
For instance, Dolphin emulates the GameCube's semi-programmable texture environment unit (TEV) via a pixel shader, and its shader source code is directly visible and editable, with the resulting output shown in the texture viewer. With the aid of textures, you can implement refraction, reflection and heat haze, among other effects.
Thanks for the massive reply. Lots of cool stuff, and a few I didn't know about!
* I use RenderDoc a lot both for fun and at work.
* I've read the Dolphin blog every now and then.
* I love Rodrigo Copetti's website and remember reading everything it had about Nintendo consoles when I first found the site.
* I'm intimately familiar with OIT and used the AMD slides about OIT with linked lists as inspiration for the particle simulation I wrote as my Master's thesis.
* I'm familiar with tiled rendering and have worked with hardware that uses it.
All the other stuff you shared is new to me and I will definitely be digging into it!
The write-up of the RE2 MPEG decoding was thrilling (even though I'd seen MVG's coverage before).
But the H.264 decoding is just surreal. Imagine if someone whipped that out in 1996.
Is this what they mean when they say sufficiently advanced technology is indistinguishable from magic? :)
I left the server after 2 years or something because I couldn't stand the abysmal signal:noise ratio (of many kinds), and that's as someone who dedicated their life to CG. Just my 2c.
I get what you mean. I sometimes enjoy looking at the showcase channel or checking out some of the technical discussions but I quickly feel I need a break of a few months from it because there's just so much going on and a lot of it is silly.
The title is likely a nod to Jim Blinn's highly influential book (more like a collection of articles), A Trip Down The Graphics Pipeline[1]. That was the first book I read on 3D graphics that helped me to actually intuitively understand fundamental 3D graphics concepts.
Why do graphics APIs, especially 3D ones, still remain very much low level? I think 3D graphics is still not as ubiquitous as it could be, perhaps due to complex and low-level APIs.
It’s not just remaining low level; the trend has been moving steadily toward lower level. The primary reason is performance: it’s hard to squeeze the best performance out of the high-level APIs. Another reason is that the people who care the most and get involved in the API and standards committees are people who know the APIs and their limitations inside and out, and care deeply about performance; people like game engine render devs, browser developers, OS developers, and hardware developers. And so newcomers and the ease of learning these APIs aren’t well represented. On one hand, there’s a valid argument to be made that people can write wrapper libraries if they want easy-to-use libraries. On the other hand, some of us really miss how easy and fun using SGI’s GL library was before it turned into OpenGL.
Earlier versions of GL were indeed a lot of fun and easy to get started with, without much of a learning curve, but it was very easy to get into a pickle and blow your foot off once the project got reasonably complex, with everything being global state in the context.
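For anyone who never saw that era, this is roughly what it looked like. A minimal sketch in C, assuming a current classic-GL (1.x/2.x) context created elsewhere; the point is that everything it touches (the matrix stack, the current colour) is global state that silently leaks into whatever you draw next:

```c
/* Sketch: classic immediate-mode OpenGL. Assumes a current GL 1.x/2.x context. */
#include <GL/gl.h>

void draw_spinning_triangle(float angle_degrees)
{
    /* Global state: this rotation stays in the modelview matrix until reset. */
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    glRotatef(angle_degrees, 0.0f, 0.0f, 1.0f);

    glBegin(GL_TRIANGLES);
    glColor3f(1.0f, 0.0f, 0.0f); glVertex3f(-0.5f, -0.5f, 0.0f);
    glColor3f(0.0f, 1.0f, 0.0f); glVertex3f( 0.5f, -0.5f, 0.0f);
    glColor3f(0.0f, 0.0f, 1.0f); glVertex3f( 0.0f,  0.5f, 0.0f);
    glEnd();
    /* The current colour is now blue for anything drawn afterwards: classic foot-gun. */
}
```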
OpenGL 4 was a huge shift and felt like such an improvement once you got used to it. It was a bit more work to set things up, but it was so much faster and had a lot less messy global state.
Products in the game space heavily differentiate from one another by "showing more" data (textures, models, detail, etc.), "doing more" with data (massively multiplayer, more physics, AI, sound), or "doing it faster" (more FPS, lower latency, etc.).
Local compute resources are finite, and if you want to target the widest market, you need to fully utilize mainstream hardware, which means anyone willing to do the grunt work ends up with a higher-quality product.
Anyone who wants to explore graphics programming should start with middleware; 20 years ago we would use something like Open Inventor or VTK for that purpose.
No need to always start by learning how to render a triangle or a rotating cube from scratch.
Because the need for a higher-level, easy-to-use environment is served well by middleware: the so-called "game engines", Unity and Unreal.
There are intermediate-level APIs, like the old OpenGL 4 and DirectX 11, or slightly more modern portable alternatives like bgfx and sokol; on the web, there's three.js.
The subject field is complex, the applications are also primarily complex, so there's not much value in something that lets you "easily" drop a few cubes on an endless plain (like the VRML of old days). There's no 3D equivalent of "I'll make myself the web site for my barbershop" that started the Web 20 years ago. (or, to be more pedantic, "the page for my high-energy physics experiment" 30 years ago.) Nowadays even the Web is hellishly complicated. Maybe if VR/AR takes off (any moment now), there will be need for "simple 3D", but I suspect it will be handled by custom "easy" versions of Unity/Unreal rather than people writing to a "simpler 3D API". The way people aren't making barbershop sites in "simple HTML", but going to Squarespace.
Classic OpenGL is anything but low level, and quite fun to learn. But it turns out it's a poor model for modern hardware (as opposed to early 90s SGI graphics workstations with architectures very different from modern GPUs), which resulted in impedance mismatches that made writing performant drivers (and performant client code) quite difficult. So modern graphics APIs went down an abstraction layer or two to make it easier to write code that corresponds to how GPUs actually work. But programming against something like raw Vulkan requires a vast amount of boilerplate code (that used to be handled by the driver) to get anything on screen, so a reasonable middle ground probably looks something like OpenGL 4.
I've seen this still being recommended in 2021. I personally read it last year and it was definitely helpful.
Just excellent if you want to understand what is going on under the hood, which is necessary if you are writing anything more than a toy graphics program, or if you want to understand the output of GPU debugging/profiling tools from Nvidia and AMD.
OptiX dev here. Ray tracing is more or less completely separate from the raster pipeline, which is what this article is discussing. There’s some conceptual overlap when it comes to shading, and both pipelines share compute resources, but the “pipeline” part doesn’t overlap much (and personally I think the reasons why are interesting). The RTX cards have a separate ray tracing core in addition to the raster processing hardware, so both pipelines still exist. What this means is Fabien’s article is still absolutely relevant to the raster pipeline, it’s just missing information about today’s ray tracing pipeline.
RT is not a different step in the graphics pipeline; just like compute shaders, it's an independent thing. RT dispatches are their own concept and have their own rules.
It's very relevant. The parts shown here haven't changed; the biggest things we've gotten are more control over command submission, dependencies, etc. Otherwise the rasterization pipeline still chugs along exactly as shown here. (Yes, you might want to use mesh shaders, but they're not always supported, and there is RT stuff too; what's in this article still works fine.)
Agreed, but don’t forget about compute shaders: their programmer audience extends way beyond 3D, whereas RT, with all its current limitations, is only interesting for AAA game studios.
Reminds me of when I first started learning about GPUs. They've come a long way. I think back then it was basically rasterization and culling. It's been a long time, though.
As an aside, I mentioned this series earlier today on the Graphics Programming Discord server and now it's on the front page of HN?
edit: on the other hand, searching the server it seems this article is linked there every few days, so it might just be coincidence