
Taking everything at face value: Dojo is overall a very impressive project!

- Communication speed is one of the biggest bottlenecks for large models, so their 4 TB/s of interconnect bandwidth is a smart design focus (see the quick arithmetic after this list).

- They claim a 1.3x perf/watt improvement, which is not especially impressive for an ASIC compared to GPUs. Perf/watt is probably the most important number in datacenters.

- They use only SRAM, no DRAM. This is a huge mistake that limits model size: you can fit only a ~10GB model on a single tile, versus an 80GB model on a single A100 GPU (a rough capacity check follows this list).

- The software/compiler stack is as important as, or more important than, the hardware itself, because it dictates how much real performance you can squeeze out of the chips. I think Tesla will need to focus heavily on this area before getting anywhere close to real-world GPU performance.
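
As a quick illustration of the first point, here's some back-of-envelope arithmetic on gradient-exchange time at different link speeds. The 4 TB/s figure is the one claimed above; the 1B-parameter fp16 model and the PCIe 4.0 x16 baseline (~32 GB/s) are my own illustrative assumptions:

    # Time to move the fp16 gradients of a 1B-parameter model (~2 GB)
    # per sync step, at two very different link speeds.
    GRAD_BYTES = 1e9 * 2  # 1B parameters x 2 bytes each (fp16)

    for name, gb_per_s in [("Dojo link (4 TB/s)", 4000.0),
                           ("PCIe 4.0 x16 (~32 GB/s)", 32.0)]:
        ms = GRAD_BYTES / (gb_per_s * 1e9) * 1e3
        print(f"{name}: {ms:.1f} ms per gradient exchange")
    # ~0.5 ms vs ~62.5 ms: two orders of magnitude per training step.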
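And a rough sanity check of the ~10GB figure from the third point, using the per-die numbers Tesla quoted at AI Day (354 nodes x 1.25MB of SRAM per D1 die, 25 dies per tile). Treat the constants as assumptions, not confirmed specs:

    # Estimate total SRAM per training tile and the largest model
    # that could sit fully resident in it.
    NODES_PER_DIE = 354        # functional units per D1 die
    SRAM_PER_NODE_MB = 1.25    # local SRAM per node
    DIES_PER_TILE = 25         # 5 x 5 dies per tile

    sram_per_tile_gb = NODES_PER_DIE * SRAM_PER_NODE_MB * DIES_PER_TILE / 1024
    max_fp16_params = sram_per_tile_gb * 1024**3 / 2  # 2 bytes/param

    print(f"SRAM per tile: ~{sram_per_tile_gb:.1f} GB")                # ~10.8 GB
    print(f"Fully resident fp16 params: ~{max_fp16_params / 1e9:.1f}B")  # ~5.8B
    # And that's before activations, gradients, or optimizer state
    # claim any of the SRAM.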

Overall, I imagine the project will have similar pitfalls to Cerebras.




The thermal solution and packaging look a lot like mainframe systems from the 80s (IBM, Siemens, Japanese makers).

Siemens H100: https://pbs.twimg.com/media/Ee56Q2bWkAA8vBE?format=jpg&name=... https://pbs.twimg.com/media/Ee56Q2ZX0AEqAWo?format=jpg&name=...

IBM: https://www.youtube.com/watch?v=xQ3oJlt4GrI


That was great, thanks for linking it.


Yeah, a lot of that probably doesn't matter for Tesla's current internal use case, but all of it matters when you're talking about commercializing it.


They said they have DRAM on the sides of the array, where it also connects to PCIe. I guess the chips in the middle won't be able to access it as easily, though.
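
A toy sketch of why the middle dies have it worse, assuming a 5x5 grid of dies per tile with DRAM reachable only through the edge dies. The grid size and the edge-only access are illustrative assumptions, not confirmed details of Dojo's topology:

    # Mesh hops from each die to the nearest edge (where DRAM sits
    # in this toy model).
    GRID = 5  # assumed dies per side of a tile

    def hops_to_nearest_edge(row: int, col: int) -> int:
        return min(row, col, GRID - 1 - row, GRID - 1 - col)

    for r in range(GRID):
        print(" ".join(str(hops_to_nearest_edge(r, c)) for c in range(GRID)))
    # 0 0 0 0 0
    # 0 1 1 1 0
    # 0 1 2 1 0
    # 0 1 1 1 0
    # 0 0 0 0 0
    # The center die pays extra hops each way, and its DRAM traffic
    # also has to share edge links with every other inner die.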



