*EDIT*: Never mind, llamafile hasn't been updated in a full month, and Gemma support was only added to llama.cpp on the 21st of this month. Disregard this post for now and come back when Mozilla updates llamafile.
---
llama.cpp has integrated Gemma support, so you can use llamafile for this. It is a standalone executable that is portable across most popular OSes.
https://github.com/Mozilla-Ocho/llamafile/releases
So, download the executable from the releases page under Assets. You want just the main, server, or llava binary; don't get the huge ones with a model embedded in the file. The bare executable is about 30 MB in size.
https://github.com/Mozilla-Ocho/llamafile/releases/download/...
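The steps above can be sketched as a short script. Note the exact release filename and the model path are assumptions here; check the releases page for the current version, and point `-m` at whatever GGUF file you downloaded separately:

```shell
#!/bin/sh
# Download a llamafile binary from the releases page.
# NOTE: the filename below is a placeholder; substitute the actual
# asset name from https://github.com/Mozilla-Ocho/llamafile/releases
curl -L -o llamafile "https://github.com/Mozilla-Ocho/llamafile/releases/latest"

# Mark it executable so the shell can run it.
chmod +x llamafile

# Run it against a separately downloaded GGUF model file
# (the model filename here is an assumption, not a real asset):
./llamafile -m gemma-7b-it.gguf -p "Hello"
```

On Windows you would rename the file to end in `.exe` instead; on some Unix systems you may need to register binfmt support as described in the llamafile README.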