

blackcat201 on Aug 11, 2021 | on: The British Library puts 1M newspaper pages online...

Does anyone know of any existing effort to convert these scanned images into a text corpus (a new OCR model would probably need to be developed for these old texts)? I think they would be more usable in text form for search and research purposes.

bnj on Aug 12, 2021

Well, when Apple releases the next OS it will automatically OCR all images, so one possibility is just downloading them all onto an Apple device.

AaronNewcomer on Aug 12, 2021

It already is. I do text searches here all the time and have paid for a subscription for a while now.
