As far as I understand, some inference is done on-device. LLMs and diffusion models only changed the field in the last year, and it takes time for hardware to catch up (plus work to shrink the models). So it's hard to run the latest models on-device, and you end up either doing it online (Google's preference) or shipping weaker models (Apple's preference).
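
As a rough back-of-envelope sketch of why model size is the bottleneck (the 7B parameter count and byte-per-weight figures below are illustrative assumptions, not numbers for any specific shipped model):

    # Rough memory footprint of LLM weights at different precisions.
    # Parameter count and precision choices are illustrative assumptions.
    params = 7e9  # e.g. a 7B-parameter model

    for label, bytes_per_param in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
        gb = params * bytes_per_param / 1e9
        print(f"{label}: ~{gb:.1f} GB of weights")

    # fp16: ~14.0 GB, int8: ~7.0 GB, int4: ~3.5 GB -- even aggressively
    # quantized, that's a big chunk of a phone's RAM, which is part of why
    # the strongest models tend to run server-side for now.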
