Hacker News
gliptic
on April 1, 2023
on:
Llama.cpp 30B runs with only 6GB of RAM now
The size mentioned is already the quantized size (quantized to integers, not floats). mmap obviously doesn't do any quantization.
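To make the point concrete, here is a rough sketch in Python. It assumes a toy symmetric int8 scheme with one scale per tensor, which is only an illustration and not llama.cpp's actual file format; the helper names `quantize_int8` and `mapped_size_and_bytes` are made up for this example. The quantization happens before the file is written, and mmap then exposes exactly the bytes that are on disk.

```python
import mmap
import os
import tempfile
from array import array


def quantize_int8(weights):
    """Toy symmetric quantization (an assumption, not llama.cpp's format):
    map floats to signed 8-bit integers plus a single float scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = array("b", (round(w / scale) for w in weights))
    return q, scale


def mapped_size_and_bytes(q):
    """Write the already-quantized weights to a file, mmap it read-only,
    and report the mapped size and whether the bytes match the file."""
    raw = q.tobytes()
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(raw)
        path = f.name
    try:
        with open(path, "rb") as f:
            # Length 0 maps the whole file; mmap rejects empty files.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
                return len(m), m[:] == raw
    finally:
        os.remove(path)


weights = [0.5, -1.0, 0.25, 1.0]
q, scale = quantize_int8(weights)   # size reduction happens here
size, same = mapped_size_and_bytes(q)
print(size, same)                   # one byte per weight, bytes unchanged
```

The file is small because of the step that produced it. The mapping only changes how the bytes get into memory, not what they are.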