
I find this related page more interesting: A Breakdown of AI Chip Companies https://geohot.github.io/blog/jekyll/update/2021/06/13/a-bre...

I especially like that he outlines an actual plan for an AI chip startup that he thinks will work, and has an update explaining why he was subsequently convinced that it wouldn't work.




I read this article alongside the one actually posted; it has a lot of good points but also gets a fair amount wrong, and the correction on the power figures is just the tip of the iceberg. That's why wafer-scale designs do what they do: the power cost of going off-die versus staying on-die is massive, and keeping everything on one chip means you can feed the design more easily and with less power.

For example, Nvidia's compute and consumer GPU lines diverged a long time ago. Modern A100s have literally only one SM capable of doing normal graphics tasks, probably just to support driving a display on whatever Quadro variant they end up releasing. They diverged in really specific ways too; for example, the P100 has hardware scheduling, whereas the 1080 does not (in the same way, at least).

Another issue: the author spends a long time talking about how important software and the ecosystem are, then completely misses that point when talking about their own chip. Just because it is RISC-V and compilers exist for that arch does not make it equal to CUDA. Also, big re-order buffers cost area and heat that could be spent on more SMs. That's why, in order to beat Nvidia, you must get more specialized: they've picked their niche on the CPU-GPU-ASIC continuum, and beating them at the same process node requires ditching some of the stuff an Nvidia GPU carries. Which is why they've been specializing their own arch with tensor cores.

It just also turned out those are useful for gaming, using deep learning to upres the graphics, since that's easier to accelerate than driving quadratically higher resolutions.
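As a rough back-of-the-envelope illustration of that quadratic cost (my own sketch, not from the thread): shading work scales with pixel count, which grows with the square of the linear resolution, so rendering at a lower resolution and upscaling saves a lot.

```python
# Sketch: pixel counts grow quadratically with linear resolution,
# so native 4K shading costs ~4x as much as 1080p.

def pixels(width: int, height: int) -> int:
    """Total pixels shaded per frame at a given resolution."""
    return width * height

native_4k = pixels(3840, 2160)     # shade every pixel at 4K
rendered_1080p = pixels(1920, 1080)  # shade at 1080p, upscale the rest

ratio = native_4k / rendered_1080p
print(ratio)  # 4.0 -- 4x the shading work for 2x the linear resolution
```

This is why offloading the gap to a learned upscaler on tensor cores can be cheaper than brute-forcing the extra pixels.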


> for example, the P100 has hardware scheduling, whereas the 1080 does not (in the same way, at least).

what does hardware scheduling mean in this context?


The detailed follow-up to that post is here: https://geohot.github.io/blog/jekyll/update/2021/12/12/a-cor...


Interesting! I think Cerebras is exciting too; the problem is that it's so expensive that there will never be a software ecosystem for it. The people who would build that ecosystem will never have access to the hardware.


Yep, it's hard to see how Cerebras will ever have much of an ecosystem, since hardware makers are traditionally not very good at building and maintaining one by themselves. But they're probably aiming at very specialized customers only.

Geohot is right about the AI accelerator market's problems, and a competitive four-digit-dollar device is a great idea even if his initial strategy was way off. Although you could say almost the same thing about the high-performance CPU and GPU duopolies (Apple's chips don't count due to their proprietary OS lock-in, although I wish Asahi Linux luck at fixing that).

[edit] And beyond that, you have TSMC dominating the next-gen fab market, too.




