
Uh, why is Google releasing this? Wouldn't this code give hackers a good head start to create an OCR system capable of trivially defeating CAPTCHA everywhere?

Or maybe they've realized that any human-vs-computer test based on text recognition is flawed, and so what better way to force the web to upgrade than to make OCR trivial? I rather like this shotgun approach to AI.

I think the benefits of having high-quality free OCR tools available to developers outweigh the CAPTCHA abuse risk. Information organization is a huge problem and area of opportunity, and being able to extract text, content, and context out of scans, photos, and the like is key.

1. OCR engines don't work well on text distorted in typical captcha ways. They fail especially with colours.

2. Captchas have limited output sets and special characteristics, which make using OCR on them both costly and ineffective compared to dedicated solutions. Specifically, you can generate as many perfectly labelled sample outputs from a captcha system as you want, and then analyse them in ways that go beyond standard character recognition.
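To make point 2 concrete, here is a toy sketch of the idea, not a description of any real captcha or OCR system. It assumes a made-up generator with a three-character alphabet on a 3x5 pixel grid, where noise is just random pixel flips. The names GLYPHS, render, make_samples, train and classify are all hypothetical. Because we control the generator, every sample comes with a known label, so a very simple dedicated classifier (one averaged prototype per character) can be fitted directly to that generator's output.

```python
import random

# Hypothetical 3x5 bitmap "font": a deliberately tiny, limited output set.
GLYPHS = {
    "0": "111101101101111",
    "1": "010110010010111",
    "7": "111001010010010",
}


def render(char, rng, flip_prob=0.1):
    """Render one glyph, flipping each pixel with probability flip_prob."""
    return [int(bit) ^ (rng.random() < flip_prob) for bit in GLYPHS[char]]


def make_samples(n, rng):
    """Generate n (pixels, label) pairs; labels are known by construction."""
    samples = []
    for _ in range(n):
        label = rng.choice(sorted(GLYPHS))
        samples.append((render(label, rng), label))
    return samples


def train(samples):
    """Average the pixels seen for each label into one prototype per label."""
    sums, counts = {}, {}
    for pixels, label in samples:
        acc = sums.setdefault(label, [0] * len(pixels))
        for i, pixel in enumerate(pixels):
            acc[i] += pixel
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def classify(pixels, prototypes):
    """Return the label whose prototype is nearest in squared distance."""
    def distance(label):
        return sum((p - q) ** 2 for p, q in zip(pixels, prototypes[label]))
    return min(prototypes, key=distance)


def accuracy(prototypes, samples):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(classify(pixels, prototypes) == label for pixels, label in samples)
    return hits / len(samples)


if __name__ == "__main__":
    rng = random.Random(0)
    prototypes = train(make_samples(300, rng))
    print("held-out accuracy:", accuracy(prototypes, make_samples(200, rng)))
```

The point is only the workflow: unlimited labelled samples plus a small, known alphabet let you fit something narrow to one generator, which is a different problem from general-purpose document OCR.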

As far as I know, no. reCAPTCHA specifically focuses on challenges that are likely to be incorrectly processed by existing document recognition systems.

