Hacker News

That's a 10x speed increase. What's the secret behind Apple's M3? Faster-clocked RAM? Dedicated AI hardware?



Unified memory and optimizations in llama.cpp (which Ollama wraps).


Is that using the GPU?


It's configurable. There are details in the repo, but llama.cpp uses Metal for GPU acceleration.
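For what it's worth, a sketch of how that configuration typically looks on the command line (flag and parameter names taken from llama.cpp and Ollama; the model file name is a placeholder, and this is illustrative rather than a full invocation):

```shell
# llama.cpp: -ngl / --n-gpu-layers sets how many model layers are
# offloaded to the GPU; on Apple Silicon this goes through the Metal
# backend. "model.gguf" is a placeholder path.
./llama-cli -m model.gguf -ngl 99 -p "Hello"

# Ollama exposes the same knob as the num_gpu option, e.g. in a
# Modelfile:
#   PARAMETER num_gpu 99
```

Because the memory is unified, offloading layers to the GPU doesn't require copying weights into separate VRAM, which is part of why large models run well on these machines.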




