
To stay fair, their "17B" model sits at 964 GB on your disk while the 70B Llama 3 model sits at 141 GB; those are unquantized numbers for both.
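For anyone who wants to sanity-check those figures, a minimal sketch: they roughly fall out of the total parameter counts at 2 bytes per weight, assuming bf16/fp16 storage and the commonly cited ~480B total parameters for Arctic's MoE versus ~70B for Llama 3 70B (both counts are assumptions, not from the comment above):

    # rough on-disk size at 2 bytes per parameter (bf16/fp16)
    def unquantized_gb(params_billion, bytes_per_param=2):
        return params_billion * bytes_per_param  # decimal GB

    print(unquantized_gb(480))  # ~960 GB for Arctic (~480B total params, assumption)
    print(unquantized_gb(70))   # ~140 GB for Llama 3 70B

The gap mostly reflects total vs active parameters: Arctic only activates ~17B per token, but all of its experts still have to live somewhere.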



Sorry, it sounds like you know a lot more than I do about this, and I'd appreciate it if you'd connect the dots. Is your comment a dig at either Snowflake or Llama? Where are you finding the unquantized size of Llama 3 70B? Isn't it extremely rare to do inference with large unquantized models?


To stay fairer, the required extra disk space for snowflake-arctic is cheaper than the required extra RAM for llama3.


For decent performance, you need to keep all the parameters in memory for either model. Well, with a RAID-0 of two PCIe 5 SSDs (or four PCIe 4) you might get 1 t/s loading experts from disk on snowflake-arctic... but that is slooow.
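A minimal back-of-envelope for that 1 t/s figure, assuming every token has to pull all ~17B active parameters from disk at bf16 and the RAID-0 pair sustains roughly 13 GB/s per drive (both numbers are assumptions; in practice the shared dense weights can stay in RAM, which would push this up a bit):

    # tokens/s if the active weights are streamed from SSD for each token
    active_params   = 17e9        # Arctic's ~17B active parameters (assumption)
    bytes_per_param = 2           # bf16/fp16
    ssd_bandwidth   = 2 * 13e9    # two PCIe 5 SSDs in RAID-0, ~13 GB/s each (assumption)

    bytes_per_token = active_params * bytes_per_param   # ~34 GB read per token
    print(ssd_bandwidth / bytes_per_token)              # ~0.76 tokens/s, i.e. roughly 1 t/s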



