regularfry | 62 days ago | on: Llama 3.2: Revolutionizing edge AI and vision with...
Qwen2.5 has a 32B release, and quantised at q5_k_m it *just about* completely fills a 4090.
It's a good model, too.
kristianp | 62 days ago
Do you also need space for context on the card to get decent speed though?
regularfry | 61 days ago
Depends how much you need. Dropping to q4_k_m gives you 3GB back if that makes the difference.
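For a rough sense of the numbers, here is a back-of-the-envelope sketch. It assumes roughly 32.8B parameters for Qwen2.5-32B and approximate llama.cpp averages of ~5.5 bits/weight for q5_k_m and ~4.85 bits/weight for q4_k_m; actual GGUF file sizes vary a bit from these figures.

    # Rough VRAM estimate for a quantised 32B model (approximate figures only).
    PARAMS = 32.8e9                          # Qwen2.5-32B parameter count, approx.
    BPW = {"q5_k_m": 5.5, "q4_k_m": 4.85}    # approximate average bits per weight

    for quant, bits in BPW.items():
        gib = PARAMS * bits / 8 / 2**30      # bytes -> GiB
        print(f"{quant}: ~{gib:.1f} GiB of weights")

    # q5_k_m: ~21.0 GiB, q4_k_m: ~18.5 GiB -- dropping a quant level frees
    # roughly 2.5-3 GiB for KV cache / context on a 24 GiB 4090.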