Hacker News

They didn't test that claim at all though. Vision isn't some sort of 1D sliding scale with every vision condition lying along one axis.

First of all, myopia isn't "seeing fine details as blurry"; it's nearsightedness, and whatever else this post tested, it definitely didn't test vision at a distance.

And second, an inability to see fine details is distinct from an inability to count intersections and do the other tasks tested here. That hypothesis, if valid, would imply that increasing the resolution of the image the model can process would improve its performance on these tasks even with reasoning abilities held constant. That does not hold up: plenty of the details these models are tripping up on are perfectly distinguishable at low resolutions. Counting the rows and columns of a blank grid is not going to improve with more resolution.
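The point that grid counting survives aggressive downsampling can be sketched with a toy experiment (a hypothetical illustration of mine, not something from the article): render a blank 8x8 grid, box-filter it down by 8x in each dimension, and count grid lines as runs of dark columns. The count is identical at both resolutions.

```python
import numpy as np

def make_grid(size=512, cells=8):
    """White image with 2-px black grid lines every size // cells pixels."""
    img = np.ones((size, size))
    step = size // cells
    for i in range(cells + 1):
        p = min(i * step, size)
        img[:, max(0, p - 1):p + 1] = 0.0  # vertical line
        img[max(0, p - 1):p + 1, :] = 0.0  # horizontal line
    return img

def downsample(img, factor=8):
    """Box-filter downsampling: average each factor x factor block."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def count_vertical_lines(img, thresh=0.95):
    """Count runs of dark columns; one line may span adjacent columns."""
    dark = img.mean(axis=0) < thresh
    # a run starts where a dark column follows a light one (or at column 0)
    return int(dark[0]) + int(np.sum(dark[1:] & ~dark[:-1]))

grid = make_grid()               # 512 x 512
small = downsample(grid, 8)      # 64 x 64
print(count_vertical_lines(grid), count_vertical_lines(small))  # 9 9
```

Both resolutions yield 9 vertical lines (8 cells): the information needed to count the grid is still present after throwing away 98% of the pixels, so a "the image is too blurry" explanation does not account for failures on this task.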

I mean, I'd argue that the phrasing of the hypothesis ("At best, like that of a person with myopia") doesn't make sense at all. I don't think a person with myopia would have any trouble with these tasks if you zoomed into the relevant area, or held the image close. I have a very strong feeling that these models would continue to suffer on these tasks if you zoomed in. Nearsighted != unable to count squares.




It seems to me they brought up myopia only to give people an approachable sense of how blurry something is, implying they believe the models work with a blurry image, just as a nearsighted person sees blurry images at a distance.

While myopia is common, it's not the best choice of analogy and "blurry vision" is probably clear enough.

Still, I'd see it only as a bad choice of analogy (I can't imagine anyone mistaking an optical focus problem for a static image-processing problem), so, per the usual HN guideline, I'd read their example in the most favourable sense.



