If I had a nickel for every outrageous "matches/beats GPT-x" claim, I'd have more money than these projects raise from VCs.

This absolutely is not the first Llama3 vision model. They even quote its performance compared to Llava. It's hard to take anything they say seriously with such obviously false claims.


> This absolutely is not the first Llama3 vision model. They even quote its performance compared to Llava.

Although this is true (there have been earlier Llama3-based vision releases), none of the latest Llava releases are Llama3-based.


That is someone else who has just reused the Llava name.

It is not by the original group that published the series of models under that name.


This appears to be a Llava model which was then fine-tuned using outputs from Llama 3. If I understand correctly, that would make it Llama-2-based.


>fine-tuned using outputs from Llama 3.

Llama 3 outputs text and can only see text; this is a vision model.

>that would make it Llama-2-based.

It's based on Llama 3; Llama 2 has nothing to do with it. They took Llama 3 Instruct and CLIP-ViT-Large-patch14-336, trained the projection layer first, and then finetuned the Llama 3 checkpoint and trained a LoRA for the ViT.
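
For anyone curious, here is a rough PyTorch sketch of that two-stage recipe. The dimensions, the single-linear projector, and the stand-in modules are my assumptions for illustration, not the actual training code, and the LoRA on the ViT is elided:

  import torch
  import torch.nn as nn

  VIT_DIM = 1024   # hidden size of CLIP-ViT-Large-patch14-336
  LLM_DIM = 4096   # hidden size of Llama 3 8B Instruct

  class VisionProjector(nn.Module):
      # Maps ViT patch embeddings into the LLM's token-embedding space.
      def __init__(self):
          super().__init__()
          self.proj = nn.Linear(VIT_DIM, LLM_DIM)

      def forward(self, patch_embeds):    # (batch, num_patches, VIT_DIM)
          return self.proj(patch_embeds)  # (batch, num_patches, LLM_DIM)

  def set_trainable(stage, vit, projector, llm):
      # Stage 1: only the projector learns. Stage 2: the LLM is also
      # finetuned; the ViT itself stays frozen in both stages (the LoRA
      # the comment mentions is left out of this sketch).
      for p in vit.parameters():
          p.requires_grad = False
      for p in projector.parameters():
          p.requires_grad = True
      for p in llm.parameters():
          p.requires_grad = (stage == 2)

  # Stand-ins so the sketch runs; in practice these are the pretrained
  # CLIP ViT and the Llama 3 Instruct checkpoint.
  vit = nn.Linear(VIT_DIM, VIT_DIM)
  llm = nn.Linear(LLM_DIM, LLM_DIM)
  projector = VisionProjector()

  set_trainable(stage=1, vit=vit, projector=projector, llm=llm)
  patches = torch.randn(2, 576, VIT_DIM)  # 24x24 = 576 patches at 336px, patch size 14
  print(projector(patches).shape)         # torch.Size([2, 576, 4096])

The point of staging it this way: stage 1 cheaply aligns image features with the LLM's embedding space while everything else is frozen, and only then does stage 2 unfreeze the LLM so it learns to actually use those features.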


All models surely write 'its performance'.