
The Internet Archive's OCR is built around Tesseract nowadays, but you're right about piggybacking off their pipeline. Upload a text to archive.org and you get hOCR for free.
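For what it's worth, once you have an hOCR file in hand, pulling the words and their boxes out of it needs nothing beyond the Python standard library. A rough sketch, assuming the usual hOCR conventions (each word in a span with class `ocrx_word`, and a `bbox x0 y0 x1 y1` entry in its `title` attribute). The sample markup and the names `HocrWords`, `parse_bbox`, and `extract_words` are made up for illustration and are not taken from archive.org's actual output.

```python
from html.parser import HTMLParser


def parse_bbox(title):
    """Return (x0, y0, x1, y1) from an hOCR title string, or None if absent."""
    # hOCR packs properties into title, separated by ';',
    # e.g. "bbox 36 92 96 116; x_wconf 95".
    for prop in title.split(";"):
        parts = prop.split()
        if len(parts) == 5 and parts[0] == "bbox":
            return tuple(int(p) for p in parts[1:])
    return None


class HocrWords(HTMLParser):
    """Collect (text, bbox) pairs from spans whose class includes ocrx_word."""

    def __init__(self):
        super().__init__()
        self.words = []        # list of (text, bbox-or-None)
        self._in_word = False  # True while inside a word span
        self._bbox = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            self._in_word = True
            self._bbox = parse_bbox(attrs.get("title") or "")
            self._text = []

    def handle_data(self, data):
        if self._in_word:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Assumption: word spans contain no nested <span> elements.
        if tag == "span" and self._in_word:
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._in_word = False


def extract_words(hocr_text):
    """Parse hOCR markup and return its words with bounding boxes."""
    parser = HocrWords()
    parser.feed(hocr_text)
    parser.close()
    return parser.words


# Hypothetical two-word sample, not real archive.org output.
SAMPLE = """
<div class='ocr_page' title='bbox 0 0 600 800'>
  <span class='ocr_line' title='bbox 36 92 300 116'>
    <span class='ocrx_word' title='bbox 36 92 96 116; x_wconf 95'>Hello</span>
    <span class='ocrx_word' title='bbox 110 92 180 116; x_wconf 91'>world</span>
  </span>
</div>
"""

if __name__ == "__main__":
    for text, bbox in extract_words(SAMPLE):
        print(text, bbox)
```

The boxes are pixel coordinates on the page image, so they can be used directly to highlight or crop words. A real file may nest extra markup inside word spans, which this sketch does not handle.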




