If I had a nickel for every outrageous "matches/beats GPT-x" claim, I'd have more money than the capital these projects raise from VC.
This absolutely is not the first Llama 3 vision model. They even quote its performance against LLaVA. It's hard to take anything they say seriously when they make such obviously false claims.
Llama 3 outputs text and can only see text; this is a vision model.
>that would make it Llama-2-based.
It's based on Llama 3; Llama 2 has nothing to do with it. They took Llama 3 Instruct and CLIP-ViT-Large-patch14-336, trained the projection layer first, and then finetuned the Llama 3 checkpoint while training a LoRA for the ViT.
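For anyone unfamiliar with how these LLaVA-style models wire a frozen ViT into an LLM: the projector is just a small MLP mapping patch features into the LLM's embedding space. A minimal numpy sketch (dimensions are my assumptions from the standard CLIP-ViT-L/336 and Llama 3 8B configs, not from this post; real implementations use GELU and learned weights):

```python
import numpy as np

# Assumed dimensions, for illustration only:
VIT_DIM = 1024   # CLIP-ViT-Large hidden size
LLM_DIM = 4096   # Llama 3 8B hidden size

rng = np.random.default_rng(0)

def projector(vision_feats, w1, w2):
    """Two-layer MLP projecting ViT patch features into the LLM
    embedding space (LLaVA-1.5-style design). ReLU stands in for
    the GELU used in practice."""
    h = np.maximum(vision_feats @ w1, 0.0)
    return h @ w2

# A 336x336 image at patch size 14 yields (336/14)^2 = 576 patch tokens.
patches = rng.standard_normal((576, VIT_DIM))
w1 = rng.standard_normal((VIT_DIM, LLM_DIM)) * 0.01
w2 = rng.standard_normal((LLM_DIM, LLM_DIM)) * 0.01

tokens = projector(patches, w1, w2)
print(tokens.shape)  # (576, 4096): 576 "visual tokens" fed to the LLM
```

Training the projector first (with both backbones frozen), then unfreezing the LLM, is exactly the two-stage recipe described above.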