
Chameleon also matches or exceeds the performance of much larger models, including Gemini Pro and GPT-4V, according to human judgments on a new long-form mixed-modal generation evaluation, in which either the prompt or the outputs contain mixed sequences of images and text. Chameleon marks a significant step forward in the unified modeling of full multimodal documents.


