> DDR5 DIMMS directly That's the problem. Good DDR5 RAM's memory speed is <100GB...

brucethemoose2 · on March 21, 2023

Not if the bus is wide enough :P. EPYC Genoa is already ~450GB/s, and the M2 Max is 400GB/s.
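For anyone who wants to check those figures, here is a rough back-of-the-envelope sketch of theoretical peak bandwidth as bus width times transfer rate. The configurations are my assumptions, not something stated above: 12 channels of 64-bit DDR5-4800 for Genoa, a 512-bit LPDDR5-6400 interface for the M2 Max, and dual-channel DDR5-6000 for a typical desktop. The function name `peak_bandwidth_gbs` is made up for illustration, and real sustained bandwidth will be lower than these peaks.

```python
def peak_bandwidth_gbs(bus_width_bits: int, mega_transfers_per_sec: int) -> float:
    """Theoretical peak bandwidth in GB/s.

    Each transfer moves bus_width_bits / 8 bytes; multiplying by MT/s gives MB/s,
    and dividing by 1000 gives GB/s (decimal units).
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_sec / 1000


# Assumed configurations (not taken from the thread):
desktop_ddr5 = peak_bandwidth_gbs(2 * 64, 6000)   # dual-channel DDR5-6000
epyc_genoa = peak_bandwidth_gbs(12 * 64, 4800)    # 12-channel DDR5-4800
m2_max = peak_bandwidth_gbs(512, 6400)            # 512-bit LPDDR5-6400

for name, gbs in [("desktop DDR5", desktop_ddr5),
                  ("EPYC Genoa", epyc_genoa),
                  ("M2 Max", m2_max)]:
    print(f"{name}: {gbs:.1f} GB/s")
```

Under those assumptions the desktop case lands just under 100GB/s, while the wide-bus parts come out in the 400 to 460GB/s range, which is the whole point about bus width.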

Anyway, what I was implying is that simply fitting a trillion-parameter model into a single pool is probably more efficient than splitting it up over a power-hungry interconnect. Bandwidth is much lower, but latency is also lower, and you are shuffling much less data around.