I am talking about the three larger models described in the technical report: PaLM 2-S, PaLM 2-M, and PaLM 2-L.
At I/O, I think they were referencing the scaling-law experiments: there are four of them, matching the four PaLM 2 codenames cited there (Gecko, Otter, Bison, and Unicorn). The largest of those smaller-scale models is 14.7B, which is also too big for a phone. The smallest is 1B, which can fit in roughly 512MB of RAM with GPTQ-style 4-bit quantization.
Either that, or Gecko is the smallest of the scaling-law experiments and Otter is PaLM 2-S.
There's no way it's 120B parameters. It's probably not even 12B.
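For what it's worth, here's the rough arithmetic behind the fits-in-512MB claim. This is a back-of-envelope sketch with my own assumed bits-per-weight figure, not anything from the report, and GPTQ group scales/zero-points would add a few percent on top:

```python
# Back-of-envelope weight memory for a model quantized to 4 bits per parameter.
# Illustrative arithmetic only; ignores the small overhead of GPTQ group
# scales/zero-points (typically a few percent).

def weight_memory_mb(n_params: float, bits_per_weight: float = 4.0) -> float:
    """Approximate weight storage in megabytes."""
    return n_params * bits_per_weight / 8 / 1e6

print(f"1B model   : {weight_memory_mb(1e9):.0f} MB")    # ~500 MB, fits in ~512MB of RAM
print(f"14.7B model: {weight_memory_mb(14.7e9):.0f} MB") # ~7350 MB, not phone-sized
```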