
Jensen could write one of the clouds a license to use 4090s in a DC and make this crunch disappear overnight (it would be rough for gamers, though)



4090s have 24GB of 384-bit-wide GDDR6X with no way to interconnect that memory to other 4090s except over PCIe.

H100s have 80GB of HBM on a 5120-bit bus, and the SXM variants have NVLink connecting up to 8 GPUs at a time in a node.

HUGE difference in bandwidth for anything where inference has to be spread over multiple GPUs, which all large LLMs are. And even more of a difference when training is in play.
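To put rough numbers on that gap, here's a quick back-of-envelope sketch in Python (the link speeds are approximate public figures, and the 2 GB payload is just an illustrative assumption):

    # PCIe 4.0 x16 is roughly what 4090s have to talk to each other over,
    # while H100 SXM parts get NVLink. Numbers are approximate.
    PCIE4_X16_GBPS = 32     # ~32 GB/s per direction, PCIe 4.0 x16
    NVLINK_H100_GBPS = 450  # ~450 GB/s per direction (900 GB/s total), NVLink 4

    def transfer_ms(n_bytes: int, link_gbps: float) -> float:
        """Milliseconds to move n_bytes over a link of link_gbps GB/s."""
        return n_bytes / (link_gbps * 1e9) * 1e3

    payload = 2 * 1024**3  # e.g. 2 GB of activations/KV cache handed between GPUs
    print(f"PCIe 4.0 x16:  {transfer_ms(payload, PCIE4_X16_GBPS):.1f} ms")
    print(f"NVLink (H100): {transfer_ms(payload, NVLINK_H100_GBPS):.1f} ms")

Same payload, roughly 67 ms over PCIe versus under 5 ms over NVLink, and that transfer happens constantly when a model is sharded across GPUs.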


There's the A6000 Ada for that (you can rent servers with 4x A6000 at Lambda Labs). Moreover, the 4090 has only 24GB of memory, while the H100 has 80GB.


The 4090 (and all consumer chips of its class) has terrible efficiency and is not suitable for use in a DC.





