4090s have 24GB of 384-bit-wide GDDR6X, with no way to interconnect that memory to other 4090s except over PCIe.

H100s have 80GB of HBM on a 5120-bit interface, and the SXM variant has NVLink so eight of them can be tightly coupled in a single node.

HUGE difference in bandwidth for anything where the model being inferenced has to be spread across multiple GPUs, which all large LLMs do. And even more of a difference when training is in play.
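For a rough sense of the gap, here's a back-of-envelope sketch using approximate spec-sheet numbers (assuming PCIe 4.0 x16 between 4090s and NVLink 4 on H100 SXM; exact figures vary by generation and topology). The point is that the link between GPUs, not the local memory, is what the sharded layers have to cross on every forward pass.

    # Approximate published bandwidth figures in GB/s, not measurements
    gddr6x_4090  = 1008   # 24GB GDDR6X, 384-bit: local memory bandwidth
    pcie4_x16    = 32     # ~31.5 GB/s per direction: only link between 4090s
    hbm_h100_sxm = 3350   # 80GB HBM3, 5120-bit: local memory bandwidth
    nvlink4_h100 = 900    # aggregate NVLink bandwidth per H100 SXM

    print(pcie4_x16 / gddr6x_4090)     # ~0.03: inter-GPU link is ~3% of local bandwidth
    print(nvlink4_h100 / hbm_h100_sxm) # ~0.27: NVLink keeps up far better with HBM
    print(nvlink4_h100 / pcie4_x16)    # ~28x more inter-GPU bandwidth on H100 SXM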
