
Could it run on a 4x 3090 24GB rig?

These can be built for about $4500 or less all-in.

Inference FLOPS will be roughly equivalent to ~1.8x A100 performance.
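For what it's worth, the ~1.8x figure checks out if you compare aggregate memory bandwidth rather than raw FLOPS, and bandwidth is the usual bottleneck for single-batch inference. A quick sanity check with spec-sheet numbers (my assumption about which comparison is being made):

    rtx3090_bw_gbps = 936  # GB/s per RTX 3090 (spec sheet)
    a100_bw_gbps = 2039    # GB/s, A100 80GB SXM (spec sheet)
    print(f"{4 * rtx3090_bw_gbps / a100_bw_gbps:.2f}x")  # ~1.84x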




You could run it on a single high-end GPU. I can run Llama 2's models (except 70B) on my 4080.
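Rough napkin math on why 70B is the odd one out (fp16 weights assumed; quantized builds shrink this, but 70B still overflows a 16 GB card):

    # Weights-only VRAM estimate at fp16 (2 bytes/param); KV cache and
    # runtime overhead add more on top of this.
    for params_b in (7, 13, 70):
        print(f"{params_b}B ≈ {params_b * 2} GB at fp16")
    # 7B ≈ 14 GB squeezes into a 16 GB 4080; 70B ≈ 140 GB does not,
    # even quantized to 4-bit (~35 GB).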


This can run on a single 2060S with 8 GB.


With what degree of quantization?


None, just the default weights using ollama. It's fast, too. 13B is where things get slow.
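Worth noting that ollama's default model tags are typically 4-bit quantized (e.g. Q4_0), which is how 7B fits in 8 GB; "none" here likely means "no extra quantization beyond the default download". If you want to drive it programmatically, here is a minimal sketch against ollama's local REST API (default port 11434; the "llama2:7b" tag is an example and assumes the model is already pulled):

    import json
    import requests

    # Stream a completion from a locally running ollama server.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama2:7b", "prompt": "Why is the sky blue?"},
        stream=True,
    )
    # The endpoint streams newline-delimited JSON objects; each carries
    # a "response" fragment of the generated text.
    for line in resp.iter_lines():
        if line:
            print(json.loads(line).get("response", ""), end="", flush=True)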


Does a 4x 3090 rig need NVSwitch?



