How does it compare to a 7B LLaMA quantized to run on a Raspberry Pi?

regularfry · 2024-01-06T13:02:26 1704546146

Probably similar token rates out of the box, although I haven't done a straight comparison. Where they'll differ is in the sorts of questions they're good at. Llama 2 was trained (broadly speaking) for knowledge, Phi-2 for reasoning. And bear in mind that you can quantise Phi-2 down too. The starting point is f16.
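For a rough sense of what quantising buys you, here is a back-of-the-envelope sketch of weight memory at different bit widths. The parameter counts (about 2.7B for Phi-2, about 7B for the LLaMA-class model) are assumptions for illustration, and the estimate covers weights only: real quantised files carry per-block scales and metadata, and inference also needs memory for the KV cache and activations.

```python
# Rough weight-only memory estimate; not a measurement of any real model file.

def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Bytes needed to store n_params weights at the given bit width."""
    return n_params * bits_per_weight / 8


def to_gib(n_bytes: float) -> float:
    """Convert bytes to GiB for easier reading."""
    return n_bytes / 2**30


# Assumed parameter counts (illustrative, not taken from the thread).
MODELS = {
    "Phi-2 (~2.7B)": 2.7e9,
    "7B model": 7.0e9,
}

if __name__ == "__main__":
    for name, n_params in MODELS.items():
        for bits in (16, 8, 4):
            size = to_gib(weight_bytes(n_params, bits))
            print(f"{name:>14} @ {bits:>2}-bit: {size:5.2f} GiB")
```

On these assumptions an f16 Phi-2 and a 4-bit 7B model land in the same ballpark for weight memory, which is roughly why the two are comparable on small boards.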

jasonjmcghee · 2024-01-07T04:19:43 1704601183

If you can run a quantized 7B, nothing beats Mistral and its fine-tunes, like OpenHermes 2.5.