I've found it isn't about "perfection". It is about selecting the same tiles as an "average" person would. I might stare hard at an image, think that one of the tiles contains a tiny fragment of a traffic light, and select it. That isn't what most other people have already done, so the captcha thinks I'm a bot and gives me tougher and tougher challenges. Ever since I stopped pixel-peeping and started quickly selecting the tiles that obviously had a bus in them, the share of challenges I pass on the first try has gone way up.
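To make the "match the average person" idea concrete, here is a minimal sketch, assuming the captcha keeps a per-tile selection rate from earlier users. The function name `agreement_score`, the `crowd_rates` mapping, and the 0.5 cutoff are all hypothetical; nothing here is known to be how any real captcha works.

```python
def agreement_score(selected, crowd_rates, threshold=0.5):
    """Fraction of tiles where the user's choice matches the crowd majority.

    selected    -- set of tile indices the user clicked
    crowd_rates -- dict of tile index -> fraction of earlier users who clicked it
                   (hypothetical data; an assumption for this sketch)
    """
    matches = 0
    for tile, rate in crowd_rates.items():
        crowd_selected = rate >= threshold  # what "most people" did for this tile
        if (tile in selected) == crowd_selected:
            matches += 1
    return matches / len(crowd_rates)


# A 3x3 grid: tiles 0-2 obviously show the bus, tile 3 has only a sliver of it.
rates = {0: 0.95, 1: 0.95, 2: 0.95, 3: 0.08, 4: 0.01,
         5: 0.01, 6: 0.01, 7: 0.01, 8: 0.01}

quick_user = {0, 1, 2}        # clicks only the obvious tiles
careful_user = {0, 1, 2, 3}   # also clicks the sliver

print(agreement_score(quick_user, rates))
print(agreement_score(careful_user, rates))
```

Under these assumptions the pixel-peeper scores lower than the quick clicker, even though the extra tile really does contain part of the object.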
I wonder if this means self-driving vehicles' detection of important traffic features will be at the level of an irritated and uninterested web user who is just trying to do the minimum work to please an algorithm.
Yeah, it's not like they will label something a train just because a single person says so. But if you have 10k responses with 95% confidence saying it's a train, it's very likely to be the case.
For unambiguous images almost all humans will label them the same way. For ambiguous ones humans will differ. Presumably they'll accumulate stats on each image and will be able to detect cases like this.
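The "accumulate stats on each image" idea could look roughly like the sketch below. It assumes each image simply collects a list of label strings from users; the function name `consensus_label` and both thresholds are illustrative assumptions, not a description of any real labelling pipeline.

```python
from collections import Counter


def consensus_label(responses, min_responses=10_000, min_agreement=0.95):
    """Return (label, agreement) when the crowd agrees strongly, else (None, agreement).

    responses -- list of label strings submitted for one image (assumed format)
    """
    if not responses:
        return None, 0.0
    label, votes = Counter(responses).most_common(1)[0]
    agreement = votes / len(responses)
    # Accept only with enough responses AND enough agreement among them.
    if len(responses) >= min_responses and agreement >= min_agreement:
        return label, agreement
    return None, agreement  # too few votes or too ambiguous: leave unlabelled


clear = ["train"] * 9_600 + ["bus"] * 400
ambiguous = ["train"] * 6_000 + ["bus"] * 4_000

print(consensus_label(clear))
print(consensus_label(ambiguous))
```

An image that comes back as `None` is exactly the ambiguous case: it can be flagged for more responses rather than trusted.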
Unless a properly obfuscated botnet has seeded the data set with -everything is a train- responses to the tune of >>10k responses with 95% confidence saying it's a train<<
I was always tempted to knock on people's driver-side windows when I saw them looking at their phone. Never did - figured they'd probably startle, with a non-zero chance they'd accidentally fling the car into me.
I yell at them. Loudly! Loud enough that people a block away turn to look.
But then, when you're staring at your phone while driving your car out of a parking lot and across the sidewalk, and you only miss hitting me (before driving into oncoming traffic!) because I stopped, well, you deserve that minor inconvenience of being embarrassed.
Sometimes the bus/boat/truck has motorbikes, sometimes bicycles. Is that a petrol-powered bicycle, or a motorbike to the [USAmerican?] person who wrote the rules!? Are all large yellow vehicles buses in the USA, or do you have minibuses? Oh wait, are minibuses buses?
I've worked out that fire engines are trucks for captchas; not sure about Transit-type vehicles. Lorries are trucks, apparently, but goods trucks on railways are not trucks!
Is a traffic light only the lens/LED array or the black light-holder too? Do pedestrian lights count as traffic lights? Are those weird lights hanging in the middle of junctions 'traffic lights'?
Wish they'd just tell you what counts.
I have noticed that when I realise after clicking that I missed a square, it tends to go through, whilst many times I get repeated captchas when I know I got it right. Success, as a user, seems impossible to predict.
A computer "hemming and hawing", as in that one accident where it couldn't decide whether it was a bicycle or a person, has nothing to do with the training. It's what the developers decided to do with input that had a low confidence score. There will ALWAYS be low-confidence ratings on real-world data, regardless of how good your training is.
Instead of saying "oh crap there's SOMETHING there we should stop" they said "huh, no let's loop on testing it until we figure it out or run it over... whichever comes first."
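A rough sketch of the "there's SOMETHING there, stop" policy, as opposed to looping on classification. Everything here is hypothetical: the `Detection` fields, the `HAZARD_LABELS` set, and the 0.8 threshold are assumptions for illustration, not how any real vehicle stack is written.

```python
from dataclasses import dataclass

# Hypothetical set of classes that always warrant braking when confidently detected.
HAZARD_LABELS = {"person", "bicycle", "vehicle"}


@dataclass
class Detection:
    label: str         # best-guess class, e.g. "bicycle" or "person"
    confidence: float  # classifier confidence in [0, 1]
    in_path: bool      # whether the object lies in the planned path


def plan_action(detection, min_confidence=0.8):
    """Pick an action without waiting for the classifier to make up its mind."""
    if not detection.in_path:
        return "continue"
    if detection.confidence < min_confidence:
        # Low confidence is treated as "unknown obstacle": brake now,
        # rather than re-classifying frame after frame.
        return "brake"
    return "brake" if detection.label in HAZARD_LABELS else "continue"


print(plan_action(Detection("bicycle", 0.4, True)))
print(plan_action(Detection("person", 0.3, False)))
```

The point of the design is that the low-confidence branch never depends on which label wins; "bicycle or person?" does not need an answer before the brakes are applied.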
I kinda go the other way -- could an FFT heuristic mistake this feature for a crosswalk? Then I'll select it, whether or not it's actually a crosswalk. Most of the time, this works. It's a stick in the eye of our prenatal robot overlord.
Replying to an_ko's sibling comment: just like the data behind YouTube music recommendations, populated by data carefully analysed from legions of bored toddler clicks vs. Spotify's obsessive teenager music curation.
Same experience. Once I observed that most (about 9/10) times there are only 3 tiles to select, I stopped looking for a 4th and selected only the 3 most obvious.
I've noticed similar. Often with stop lights, where a tiny sliver of one does not neatly fit in the frame, spilling over ever so slightly to the next square which has no stop light otherwise. There's a none too subtle irony in that one is being punished for accuracy when the context is ultimately public safety.
This. Captcha wants me to choose crosswalks and you can see there’s that sliver of a crosswalk in a few pixels off in another tile. You’re not wrong! But you’re not right. Regression to the mean.
A hexagon would be better as a frame instead of a square.