
None of the current iOS and macOS LLM apps use the Neural Engine. They run on the CPU and the GPU.

NB: I'm the author of a fairly popular app in that category.




How would you know that none of the Apple apps use the Neural Engine? Is the key word in the statement "LLM"?


Yes, I specifically meant autoregressive LLMs. BERT-style encoder-only models, ViTs, and CNNs already ran perfectly fine on it. Yesterday's coremltools update[1] changes that for autoregressive models.

[1]: https://github.com/apple/coremltools/pull/2232


Why do they not?


AFAIK there is no general-purpose "do this on the ANE" API. You have to use specific higher-level APIs like Core ML or VisionKit for the work to end up on the ANE.


This, plus Metal acceleration works quite well. 7-8B parameter models quantized to around 3 bpw (bits per weight) run with good tok/s on my iPhone 15 Pro.
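
For a rough sense of why roughly 3 bpw is the range that fits on a phone, here is a back-of-the-envelope sketch of the weight footprint. The helper name `weight_bytes` is made up for illustration, and the estimate assumes every weight is stored at the same bit width; it ignores the KV cache, activations, and any per-block quantization metadata, so real usage will be higher.

```python
def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in bytes (hypothetical helper).

    Assumes a uniform bit width across all weights; KV cache, activations,
    and quantization scales/zero-points are not counted.
    """
    return n_params * bits_per_weight / 8


if __name__ == "__main__":
    for n_params in (7e9, 8e9):
        gb = weight_bytes(n_params, 3.0) / 1e9  # decimal gigabytes
        print(f"{n_params / 1e9:.0f}B params at 3 bpw: about {gb:.3f} GB of weights")
```

The same models at 16 bits per weight would need 14 to 16 GB for the weights alone, which is why aggressive quantization matters here.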


It works quite well as long as you don't care about battery.



