
Yeah, in the very general case, the odds of a manufacturing defect killing a chip scale with die area. Cutting a die into four pieces, so that a defect only scraps a quarter of the silicon instead of the whole thing, is becoming the right model for a lot of designs.
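To put rough numbers on it, here's the textbook Poisson yield model, Y = exp(-D * A); the defect density below is an illustrative guess, not any real fab's figure:

  // Poisson yield model: Y = exp(-D * A).
  // D = defect density (defects/cm^2), A = die area (cm^2).
  // D here is a made-up illustrative number.
  #include <cmath>
  #include <cstdio>

  int main() {
    const double D = 0.1;          // defects per cm^2 (assumed)
    const double big_die = 6.0;    // one 600 mm^2 monolithic die
    const double chiplet = 1.5;    // one of four 150 mm^2 chiplets

    printf("monolithic yield:  %.1f%%\n", 100 * std::exp(-D * big_die));
    printf("per-chiplet yield: %.1f%%\n", 100 * std::exp(-D * chiplet));
    return 0;
  }

With those made-up numbers the 600 mm^2 die yields ~55% while each 150 mm^2 chiplet yields ~86%, and a defect costs you a quarter of the silicon instead of all of it.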

Like all things chips, it's way more complicated than that, fractally, as you start digging in. AMD started down this road initially because of contractual agreements with GlobalFoundries (GloFo) that required them to keep shipping GloFo dies, but they wanted the bulk of the logic on a smaller node than GloFo could provide, hence the IO die and compute chiplet model that still exists in Zen. It's still a good idea for other reasons, but they lucked out a bit by being forced in that direction before other major fabless companies.

This is also not a new idea; it sort of ebbs and flows with the economics of the chip market. See the VAX 9000's multi-chip modules for an '80s take on the same ideas and economic pressures.

Their GPUs are likely to be multi-chip for the first time too with Navi 31 (while Nvidia's next gen will still be single-chip and likely fall behind AMD). It also seems that the cache will be on 6nm while the logic will be on 5nm, bonded together with some new TSMC technology. At least that can be inferred from some leaks:

https://www.tweaktown.com/news/84418/amd-rdna-3-gpu-engineer...

I've yet to see any research out of AMD on MCM mitigations for things like cache coherency and NUMA, whereas Nvidia has published papers on the subject as far back as 2017. On top of that, even the M1 Ultra has some rough scaling spots in certain workloads, and Apple is by far ahead of everyone else on the chiplet curve (if you don't believe me, try testing lock-free atomic load/store latency across CCXs in Zen 3).
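If you want to try that yourself, here's a rough ping-pong sketch (Linux-only; build with g++ -O2 -pthread). The core IDs are placeholders, so check lscpu or hwloc for which cores actually share a CCX on your part, then run it once within a CCX and once across CCXs and compare:

  // Rough cross-core atomic round-trip latency microbenchmark.
  #include <atomic>
  #include <chrono>
  #include <cstdio>
  #include <pthread.h>
  #include <thread>

  std::atomic<int> flag{0};
  constexpr int kIters = 1000000;

  void pin_to_core(int core) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
  }

  int main() {
    std::thread responder([] {
      pin_to_core(8);  // placeholder: pick a core in the other CCX
      for (int i = 0; i < kIters; ++i) {
        while (flag.load(std::memory_order_acquire) != 1) {}
        flag.store(0, std::memory_order_release);
      }
    });

    pin_to_core(0);  // placeholder: a core in the first CCX
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < kIters; ++i) {
      flag.store(1, std::memory_order_release);
      while (flag.load(std::memory_order_acquire) != 0) {}
    }
    auto stop = std::chrono::steady_clock::now();
    responder.join();

    double ns = std::chrono::duration<double, std::nano>(stop - start).count();
    printf("avg round trip: %.0f ns\n", ns / kIters);
    return 0;
  }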

Also, AMD claimed the MI250X is "multi-chip", but it presents itself to the OS as two GPUs, and the interconnect is worse than NVLink.
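You can see the two-GPU presentation for yourself with plain HIP device enumeration; a minimal sketch, assuming a ROCm install (on an MI250X this lists two devices for the one card):

  #include <hip/hip_runtime.h>
  #include <cstdio>

  int main() {
    int count = 0;
    hipGetDeviceCount(&count);
    printf("HIP devices: %d\n", count);  // one MI250X shows up as 2
    for (int i = 0; i < count; ++i) {
      hipDeviceProp_t prop;
      hipGetDeviceProperties(&prop, i);
      printf("  device %d: %s\n", i, prop.name);
    }
    return 0;
  }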

There are a few ways to interpret that. Another reading is that they're simply taping out Navi 32 on two nodes, perhaps so AMD can better utilize the 5nm slots it has access to. If Nvidia is on Samsung 10nm+++, the large consumer AMD GPUs already have a node advantage at TSMC 7nm+++, so AMD would save its 5nm slots for parts like integrated GPUs and data center chips where perf/watt matters most.

But your interpretation is equally valid with the information we have AFAICT.
