
In some use cases, like OCR, the accuracy of these guesses can be established in a scientific way. And it tends to be very good.



I agree; I'd say two things in response, though:

1. However good the guess is, it's still just that: a guess. Taking the standard of "evidence in a murder case", the OCR can and probably should be used to point investigators in the right direction so they can go and collect more data, but it should not be considered sufficient as evidence itself.

2. OCR is a relatively constrained solution space - success in those conditions doesn't mean the same level of accuracy can or will be reached outside of that constrained space.

To be clear, though - I'm making a primarily epistemic argument, not one based on utility. There are a lot of areas where these kinds of machine guessing systems are of enormous utility; we just shouldn't confuse what they're doing with actual data collection.



Did you read that article? That wasn't an OCR issue; it was an image compression issue.


I did, and I’m aware it wasn’t OCR that was the underlying problem.

But the issue manifests as characters being incorrectly identified because of an algo t


Same thing in a way. OCR does lossy compression from pixels to text. Both could make similar mistakes for pretty similar reasons.


I'm not sure about the OCR example, but there are information / sampling theory limits on what can be discerned in an image, based on sampling rate (pixels basically) and optics. Any extrapolation outside these limits is provably guessing.
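To make the sampling limit concrete, here is a minimal 1-D sketch (the frequencies and sample rate are arbitrary values picked for illustration, not taken from the article). A 7 Hz cosine sampled at 10 Hz yields the same samples as a 3 Hz cosine, so nothing in the samples can tell you which one was there:

```python
import math

def sample(freq_hz, rate_hz, n):
    """Return n samples of a unit cosine at freq_hz, sampled at rate_hz."""
    return [math.cos(2 * math.pi * freq_hz * k / rate_hz) for k in range(n)]

RATE_HZ = 10.0  # assumed sample rate, so the Nyquist limit is 5 Hz

below_limit = sample(3.0, RATE_HZ, 20)  # 3 Hz, representable
above_limit = sample(7.0, RATE_HZ, 20)  # 7 Hz, aliases onto 3 Hz

# Largest difference between the two sample sequences.
# Only floating-point rounding separates them.
largest_gap = max(abs(a - b) for a, b in zip(below_limit, above_limit))
print(largest_gap)
```

Any method that claims to recover the 7 Hz signal from those samples is choosing between two equally consistent answers, which is the "guessing" part.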

Edit - re OCR, do you mean e.g. that from a picture of a blurred license plate we could rule in or out a subset of possible characters, depending on how blurred it is, like a B could be an 8 but not an L? (And sorry if your example is unrelated.) This is valid, and unrelated to super resolution; you can do this analysis with Nyquist and point spread functions.
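A toy version of that rule-in / rule-out analysis, as a rough sketch. Everything here is an assumption for illustration: the 5x7 glyph bitmaps are made up, a 3x3 box blur stands in for a real point spread function, and `NOISE_FLOOR` is a hypothetical threshold below which two blurred glyphs count as indistinguishable.

```python
# Hypothetical 5x7 bitmaps; "1" is ink, "." is background.
GLYPHS = {
    "B": ["1111.", "1...1", "1...1", "1111.", "1...1", "1...1", "1111."],
    "8": [".111.", "1...1", "1...1", ".111.", "1...1", "1...1", ".111."],
    "L": ["1....", "1....", "1....", "1....", "1....", "1....", "11111"],
}

def to_grid(rows):
    """Convert a list of strings into a grid of 0.0 / 1.0 floats."""
    return [[1.0 if ch == "1" else 0.0 for ch in row] for row in rows]

def box_blur(grid):
    """3x3 box blur with zero padding; a crude stand-in for a PSF."""
    height, width = len(grid), len(grid[0])
    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < height and 0 <= cc < width:
                        total += grid[rr][cc]
            out[r][c] = total / 9.0
    return out

def max_gap(a, b):
    """Largest per-pixel absolute difference between two grids."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

NOISE_FLOOR = 0.2  # hypothetical: smaller differences are lost in noise

blurred = {name: box_blur(to_grid(rows)) for name, rows in GLYPHS.items()}
gap_b8 = max_gap(blurred["B"], blurred["8"])
gap_bl = max_gap(blurred["B"], blurred["L"])

for other, gap in (("8", gap_b8), ("L", gap_bl)):
    verdict = "cannot rule out" if gap < NOISE_FLOOR else "can rule out"
    print(f"observed B vs candidate {other}: gap {gap:.3f}, {verdict}")
```

With these made-up numbers the blurred B stays within the noise floor of an 8 but not of an L, so the blur narrows the candidates without pinning down a single character. A real analysis would use the measured point spread function and noise level of the actual camera.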



