GPT-4v is provided with OCR



That's a common misconception.

Sometimes if you upload an image to ChatGPT and ask for OCR it will run Python code that executes Tesseract, but that's effectively a bug: GPT-4 vision works much better than that, and it will use GPT-4 vision if you tell it "don't use Python" or similar.


No reason to believe that. Open source VLMs can do OCR.[1]

[1] https://huggingface.co/spaces/opencompass/open_vlm_leaderboa...



