4-bit is fine for models of all sizes, in my experience.

The only reason I don’t quantize tiny models much is that I don’t have to, not that the accuracy gains from running at 8-bit or fp16 are that great. I tried 4-bit Phi-3 yesterday, and it was just fine.
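
For a rough sense of why quantizing a tiny model is optional, here is a back-of-the-envelope sketch of weight storage at each precision. It assumes the 3.8B-parameter Phi-3-mini variant (the comment above doesn't say which Phi-3 was used) and ignores quantization metadata such as scales and zero points, as well as activations and the KV cache. The function name is made up for illustration.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1e9 bytes).

    Ignores per-group scales/zero points and any runtime overhead,
    so real files and real memory use will be somewhat larger.
    """
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: 3.8e9 parameters, i.e. Phi-3-mini.
N_PARAMS = 3.8e9

for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

Under those assumptions the weights fit on a lot of consumer hardware even unquantized, which is the "I don't have to" part; the "4-bit is fine" part is experience, not something this arithmetic shows.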



