iplaw on Oct 13, 2016 | on: Show HN: Tesseract.js – Pure JavaScript OCR for 60...

Apologies. The PDFs that we deal with are digital-native, but do not have embedded text and are not searchable. I simply want to OCR the PDF and spit the text into a Word/text file.

I don't even care about perfect formatting, that's easy to fix. I do care about perfect OCR. That's crucial.