
What quantization were you using? I've been getting some weird results with 34b quantized to 4 bits -- glitching, dropped tokens, generating Java rather than Python as requested. But 7b, even at 4 bits, works OK. Posted about it earlier this evening: https://www.gilesthomas.com/2024/02/llm-quantisation-weirdne...
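For reference, a minimal sketch of how you might reproduce that 7b-vs-34b comparison at 4 bits, assuming llama-cpp-python and locally downloaded Q4_K_M GGUF quants (the model file names and the prompt below are placeholders, not from the original post):

  # Sketch: compare a 7b and a 34b model at 4-bit quantization on the same prompt.
  # Assumes llama-cpp-python is installed and the GGUF files exist locally.
  from llama_cpp import Llama

  prompt = "Write a Python function that reverses a string."

  for path in ("codellama-7b-instruct.Q4_K_M.gguf",      # placeholder file name
               "codellama-34b-instruct.Q4_K_M.gguf"):    # placeholder file name
      llm = Llama(model_path=path, n_ctx=2048, verbose=False)
      out = llm(prompt, max_tokens=256, temperature=0.2)
      print(path)
      print(out["choices"][0]["text"])

If the 34b output glitches (dropped tokens, wrong language) while the 7b output looks fine under the same settings, that would match the behaviour described above.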



