

regularfry 62 days ago | on: Llama 3.2: Revolutionizing edge AI and vision with...

Qwen2.5 has a 32B release, and quantised at q5_k_m it *just about* completely fills a 4090.

It's a good model, too.

kristianp 62 days ago

Do you also need space for context on the card to get decent speed though?

regularfry 61 days ago

Depends how much you need. Dropping to q4_k_m gives you 3GB back if that makes the difference.
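
For anyone wanting to sanity-check the numbers, here is a rough back-of-envelope sketch. Everything in it is an assumption rather than a measurement: the parameter count (~32.8B), the average bits per weight for each quant (about 5.7 for q5_k_m and 4.85 for q4_k_m), and the attention shape used for the KV cache (64 layers, 8 KV heads, head dimension 128, fp16 cache). Real GGUF files and runtimes add overhead on top, so treat the output as a ballpark only.

```python
# Rough VRAM arithmetic. All constants are assumptions for illustration,
# not measured values from any particular file or runtime.
GB = 1e9  # decimal gigabytes


def weights_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantised weights in GB."""
    return n_params * bits_per_weight / 8 / GB


def kv_cache_gb(n_tokens: int, n_layers: int, n_kv_heads: int,
                head_dim: int, bytes_per_elem: int = 2) -> float:
    """Approximate KV cache size: one K and one V vector per layer per token."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return per_token * n_tokens / GB


N_PARAMS = 32.8e9                        # assumed parameter count
BPW = {"q5_k_m": 5.7, "q4_k_m": 4.85}    # assumed average bits per weight

q5 = weights_gb(N_PARAMS, BPW["q5_k_m"])
q4 = weights_gb(N_PARAMS, BPW["q4_k_m"])
saved = q5 - q4

# Assumed attention shape: 64 layers, 8 KV heads, head_dim 128, fp16 cache.
kv_8k = kv_cache_gb(8192, n_layers=64, n_kv_heads=8, head_dim=128)

print(f"q5_k_m weights: ~{q5:.1f} GB")
print(f"q4_k_m weights: ~{q4:.1f} GB")
print(f"freed by dropping to q4_k_m: ~{saved:.1f} GB")
print(f"KV cache for 8192 tokens: ~{kv_8k:.1f} GB")
```

Under these assumptions the q5_k_m weights alone land close to the 24GB on a 4090, and the space freed by q4_k_m is in the same ballpark as the ~3GB mentioned above, which would cover a few thousand tokens of fp16 context.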
