You can run a 7B model on CPU relatively quickly. If you want to go faster, the best value in public clouds may be a rented Mac mini.
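For what it's worth, single-user CPU inference is only a few lines through the llama-cpp-python bindings. A rough sketch (the GGUF path, model choice, and thread count are placeholders; any 4-bit quantized 7B model works):

    from llama_cpp import Llama  # pip install llama-cpp-python

    # Hypothetical path to a 4-bit quantized 7B GGUF file
    llm = Llama(
        model_path="./mistral-7b-instruct.Q4_K_M.gguf",
        n_ctx=2048,
        n_threads=8,  # roughly your physical core count
    )

    out = llm("Q: Name the planets in the solar system. A:",
              max_tokens=64, stop=["Q:"])
    print(out["choices"][0]["text"])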



Do you have any resources to read on how to host LLMs in general? I am looking for scalable ways to host our own models. Thanks.


Sorry, I haven’t followed the latest developments in running at scale since the summer. I don’t have concurrent users, so llama.cpp or diffusers are good enough for me.
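For the single-user case, diffusers is similarly short. A minimal sketch (the checkpoint ID is just an example; any diffusers-compatible model works):

    import torch
    from diffusers import StableDiffusionPipeline

    # Example checkpoint; swap in whatever model you actually use
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    )
    pipe.to("cuda")  # use "cpu" (and drop float16) if you have no GPU

    image = pipe("an astronaut riding a horse").images[0]
    image.save("astronaut.png")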



