
AMD has been executing well on multiple fronts. One big advantage is that AMD divorced itself from its fabs (now GlobalFoundries), so when a particular fab fails at a shrink AMD can just switch. In the 7nm case that meant moving from GlobalFoundries to TSMC.

Additionally, they were first with HyperTransport (Intel followed with QPI) and knocked it out of the park with chiplets. Having two dies per package was pretty common, even way back at the Pentium Pro on its 66 MHz bus. But HT (now updated and called Infinity Fabric) allows AMD quite a bit of flexibility: they can switch lanes between PCIe and IF.

This helps them on multiple fronts. It decouples the CPU cores from the memory interface and PCIe standards. It raises their yield by using smaller chips. It also allows them to use different processes for different chips, so the I/O die can now use an older process.
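To make the yield point concrete, here is a rough sketch using the simple Poisson yield model, where the chance a die has zero defects falls off exponentially with its area. The defect density and die areas are made-up illustrative numbers, not AMD or TSMC data, and real fabs use more refined models.

```python
import math

def die_yield(area_mm2, defects_per_mm2):
    """Poisson yield model: probability that a die of this area has zero defects."""
    return math.exp(-area_mm2 * defects_per_mm2)

# Assumed, illustrative values only (not vendor data).
DEFECT_DENSITY = 0.002   # defects per mm^2
BIG_DIE_MM2 = 600.0      # hypothetical monolithic die
CHIPLET_MM2 = 75.0       # hypothetical chiplet, 1/8 of the big die

big_yield = die_yield(BIG_DIE_MM2, DEFECT_DENSITY)
chiplet_yield = die_yield(CHIPLET_MM2, DEFECT_DENSITY)

# A defect scraps only the die it lands on, so the fraction of wafer
# area that ends up as sellable silicon equals the per-die yield.
print(f"monolithic die yield: {big_yield:.1%}")
print(f"chiplet yield:        {chiplet_yield:.1%}")
```

With these assumed inputs a defect costs you 75 mm^2 of silicon instead of 600 mm^2, which is the whole argument for smaller dies in one line.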

One big impact is that not only does the yield increase, but the increased number of chips also lets AMD amortize their R&D over more dies and customize for different products without having to spend extra R&D on numerous different dies. The low end desktops/laptops get 2 chips (1 CPU + 1 I/O). Higher end chips get 3 chips (2 CPU + 1 I/O). The high end servers get 9 (8 CPU + 1 I/O). So they can go from under 65 watts and under $200 to over 250 watts and over $5,000, all based on the same chiplets.

The first generation AMD Epyc did expose some weaknesses in OSes and applications that didn't like the high variation in latency to main memory and I/O. Each of the 4 chips inside a single socket had its own pair of memory channels and had to use HyperTransport/IF to get to the other 6 memory channels. The design was reasonable, but many apps were not NUMA aware and ran poorly.
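A back-of-the-envelope sketch of why that hurts: with 4 dies and 2 channels each, a core has 2 local channels and 6 remote ones. The latency figures below are placeholders I picked for illustration, not measured Epyc numbers.

```python
DIES_PER_SOCKET = 4    # from the first generation layout described above
CHANNELS_PER_DIE = 2   # from the first generation layout described above

# Assumed latencies, for illustration only (not measurements).
LOCAL_NS = 90.0
REMOTE_NS = 140.0

def average_latency_ns(local_fraction):
    """Average memory latency given the fraction of accesses served locally."""
    return local_fraction * LOCAL_NS + (1.0 - local_fraction) * REMOTE_NS

total_channels = DIES_PER_SOCKET * CHANNELS_PER_DIE
remote_channels = total_channels - CHANNELS_PER_DIE

# If pages are spread evenly over all channels, only 2 of 8 are local.
interleaved_local_fraction = CHANNELS_PER_DIE / total_channels

print(f"channels: {total_channels} total, {remote_channels} remote to any one die")
print(f"spread evenly:  {average_latency_ns(interleaved_local_fraction):.1f} ns average")
print(f"all local:      {average_latency_ns(1.0):.1f} ns average")
```

Code that keeps its memory on the local die stays near the low number; code that ignores placement pays the remote penalty on roughly three quarters of its accesses.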

In the second generation AMD moved all memory channels to the I/O chiplet, so now the socket is a single NUMA domain and all chiplets see identical latency. The NUMA tweaks, a 10-15% improvement in IPC, a big improvement on the floating point side, and 1.5x more cores (depending on the model) mean that for many real world codes the second generation Epyc chips (Rome) are twice as fast as the previous generation, which Intel is still trying to match. Meanwhile Intel is still trying to get a Xeon shrink they promised in 2017 working.
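As a sanity check on the "twice as fast" figure, here is the arithmetic, assuming performance scales perfectly with core count and that clocks are unchanged (both optimistic simplifications on my part). IPC and core count alone get you most of the way; the rest would have to come from the floating point and NUMA changes.

```python
def combined_speedup(ipc_gain, core_ratio):
    """Throughput gain if IPC and core count multiply cleanly (an idealization)."""
    return ipc_gain * core_ratio

CORE_RATIO = 1.5      # "1.5x more cores" from the comment
TARGET_SPEEDUP = 2.0  # "twice as fast" from the comment

for ipc_gain in (1.10, 1.15):  # the 10-15% IPC range from the comment
    base = combined_speedup(ipc_gain, CORE_RATIO)
    remaining = TARGET_SPEEDUP / base  # factor left for FP and NUMA gains
    print(f"IPC x{ipc_gain:.2f}: cores+IPC give x{base:.3f}, "
          f"need another x{remaining:.2f} to reach x{TARGET_SPEEDUP:.1f}")
```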

So generally the ability to switch fabs and how well HT/IF works with chiplets caught Intel at a really bad time, and for once AMD seems to be actually executing well and shipping non-trivial volumes into multiple market segments.



