Seems model sizing, compression, and quantization are still an art form; see also https://www.unum.cloud/blog/2023-02-20-efficient-multimodali...