It's really great to see this kind of article! Of course, this is just scratching the surface. I think the most challenging bit about understanding GPUs is breaking through the marketing claims and figuring out what is really going on in the background. How are the instructions scheduled, what does the execution actually look like, and so on. And this is where information can be very difficult to come by. Nvidia documents the "front-end" of their GPUs very well, but the lower-level details are often shrouded in mystery. It's easier with AMD, since they publish the ISA, but it can still be difficult to map their marketing claims (like 2x compute on RDNA3) to the actual hardware reality. Interestingly enough, we know quite a lot about Apple GPUs, since their compute architecture is very streamlined compared to other vendors.