
Jensen could write one of the clouds a license to use 4090s in a DC and make this crunch disappear overnight (it would be rough for gamers, though)



4090s have 24GB of 384-bit-wide GDDR6X with no way to interconnect that memory to other 4090s except over PCIe.

H100s have 80GB of HBM on a 5120-bit bus, and the SXM variants have NVLink connecting up to 8 GPUs at a time in a node.

HUGE difference in bandwidth for anything where inference has to be spread over multiple GPUs, which all large LLMs are. And even more of a difference when training is in play.
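To put rough numbers on that gap, here's a quick back-of-envelope sketch in Python (the link speeds are approximate public figures, and the 2 GB payload is just an illustrative assumption):

    # PCIe 4.0 x16 is roughly what 4090s have to talk to each other over,
    # while H100 SXM parts get NVLink. Numbers are approximate.
    PCIE4_X16_GBPS = 32     # ~32 GB/s per direction, PCIe 4.0 x16
    NVLINK_H100_GBPS = 450  # ~450 GB/s per direction (900 GB/s total), NVLink 4

    def transfer_ms(n_bytes: int, link_gbps: float) -> float:
        """Milliseconds to move n_bytes over a link of link_gbps GB/s."""
        return n_bytes / (link_gbps * 1e9) * 1e3

    payload = 2 * 1024**3  # e.g. 2 GB of activations/KV cache handed between GPUs
    print(f"PCIe 4.0 x16:  {transfer_ms(payload, PCIE4_X16_GBPS):.1f} ms")
    print(f"NVLink (H100): {transfer_ms(payload, NVLINK_H100_GBPS):.1f} ms")

Same payload, roughly 67 ms over PCIe versus under 5 ms over NVLink, and that transfer happens constantly when a model is sharded across GPUs.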


There's the A6000 Ada for that (you can rent servers with 4x A6000 at Lambda Labs). Moreover, the 4090 has only 24GB of memory, while the H100 has 80GB.


The 4090 (and all consumer chips of its class) has terrible efficiency and is not suitable for use in a DC.





