
To stay fair, their "17B" model sits at 964 GB on your disk while the 70B Llama 3 model sits at 141 GB; those are unquantized numbers for both.
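For anyone who wants to sanity-check those figures, a minimal sketch: they roughly fall out of the total parameter counts at 2 bytes per weight, assuming bf16/fp16 storage and the commonly cited ~480B total parameters for Arctic's MoE versus ~70B for Llama 3 70B (both counts are assumptions, not from the comment above):

    # rough on-disk size at 2 bytes per parameter (bf16/fp16)
    def unquantized_gb(params_billion, bytes_per_param=2):
        return params_billion * bytes_per_param  # decimal GB

    print(unquantized_gb(480))  # ~960 GB for Arctic (~480B total params, assumption)
    print(unquantized_gb(70))   # ~140 GB for Llama 3 70B

The gap mostly reflects total vs active parameters: Arctic only activates ~17B per token, but all of its experts still have to live somewhere.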



Sorry, it sounds like you know a lot more than I do about this, and I'd appreciate it if you'd connect the dots. Is your comment a dig at either Snowflake or Llama? Where are you finding the unquantized size of Llama 3 70B? Isn't it extremely rare to do inference with large unquantized models?


To stay fairer, the required extra disk space for snowflake-arctic is cheaper than the required extra RAM for llama3.


For decent performance, you need to keep all the parameters in memory for either model. Well, with a RAID-0 of two PCIe 5 SSDs (or four PCIe 4) you might get 1 t/s loading experts from disk on snowflake-arctic... but that is slooow.
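A minimal back-of-envelope for that 1 t/s figure, assuming every token has to pull all ~17B active parameters from disk at bf16 and the RAID-0 pair sustains roughly 13 GB/s per drive (both numbers are assumptions; in practice the shared dense weights can stay in RAM, which would push this up a bit):

    # tokens/s if the active weights are streamed from SSD for each token
    active_params   = 17e9        # Arctic's ~17B active parameters (assumption)
    bytes_per_param = 2           # bf16/fp16
    ssd_bandwidth   = 2 * 13e9    # two PCIe 5 SSDs in RAID-0, ~13 GB/s each (assumption)

    bytes_per_token = active_params * bytes_per_param   # ~34 GB read per token
    print(ssd_bandwidth / bytes_per_token)              # ~0.76 tokens/s, i.e. roughly 1 t/s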



