'Int-4 LLaMA is not enough - Int-3 and beyond' [0] suggests 3-bit quantization is the better trade-off for models larger than ~10B parameters when binning is combined with GPTQ.
[0] https://nolanoorg.substack.com/p/int-4-llama-is-not-enough-i...
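For intuition, here is a toy sketch of the binning half of that recipe: uniformly bin a group of weights into 2^3 = 8 levels and dequantize. This is plain round-to-nearest, not the article's actual method; GPTQ would additionally propagate each rounding error onto the not-yet-quantized weights. The function name and group layout are illustrative, not from the post.

```python
def quantize_3bit(group):
    """Round-to-nearest 3-bit binning of one weight group (toy sketch).

    Maps each float to an integer code in 0..7 over the group's
    [min, max] range, then reconstructs the approximate weights.
    """
    lo, hi = min(group), max(group)
    scale = (hi - lo) / 7 or 1.0          # 8 bins -> 7 steps; guard flat groups
    codes = [round((w - lo) / scale) for w in group]   # integer codes 0..7
    dequant = [c * scale + lo for c in codes]          # reconstruction
    return codes, dequant

weights = [0.31, -0.12, 0.55, -0.4, 0.05, 0.9, -0.75, 0.2]
codes, approx = quantize_3bit(weights)
```

Each weight costs 3 bits plus the per-group `lo`/`scale` overhead, and the maximum reconstruction error is bounded by half a bin width, which is why smaller groups (finer binning) help at low bit widths.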