It's not that you need an LLM for OCR but the fact that an LLM can do OCR (and h...

acheong08 · 2024-07-11T01:32:44.000000Z

GPT-4V is provided with OCR.

simonw · 2024-07-11T02:33:25.000000Z

That's a common misconception.

Sometimes, if you upload an image to ChatGPT and ask for OCR, it will run Python code that executes Tesseract, but that's effectively a bug. GPT-4 vision works much better than that, and ChatGPT will use GPT-4 vision if you tell it "don't use Python" or similar.

letmevoteplease · 2024-07-11T01:47:25.000000Z

No reason to believe that. Open-source VLMs can do OCR. [1]

[1] https://huggingface.co/spaces/opencompass/open_vlm_leaderboa...