
Shouldn't CogAgent be in this comparison?



CogVLM should be, not sure how CogAgent plays into this. This isn't an agent.


You would use CogAgent in VQA mode. Why would someone downvote a suggestion to test one of the most powerful multimodal LLMs? Because it doesn't have "V" in its name? CogAgent improves on CogVLM on many tasks.


I didn't downvote, only replied.

CogAgent is also CogVLM modified to handle documents and larger images. CogVLM is better for VQA.



