
Try llama.cpp; you can download one of the quantized models directly from "TheBloke" on HF. I can't 100% vouch for it because I have no idea how it builds under Linux on Apple Silicon. I'd be very interested to know whether there are any issues and how well it uses the processor.

https://github.com/ggerganov/llama.cpp https://huggingface.co/TheBloke

You should be able to at least run the 7B and probably the 13B.

For reference, I can run the 7B just fine on my 2021 Lenovo laptop with 16 GB of RAM (and Ubuntu 20.04).




Thanks, yes, I've seen someone else mention trying llama.cpp. I'll see if I can set it up. I'm new to this, so I'll look for a guide on how to use llama.cpp and report back on whether it builds and runs well on Apple Silicon. I think it would make a nice write-up for the community, as there isn't too much out there about Linux on AS in general.




