How Google Cracked House Number Identification in Street View (technologyreview.com)
59 points by nkurz on Dec 28, 2014 | 24 comments



This is a tangent, but:

"That's particularly useful in places where street numbers are otherwise unavailable or places such as Japan and South Korea where streets are rarely numbered in chronological order but in other ways such as the order in which they were constructed, a system that makes many buildings impossibly hard to find, even for locals."

South Korea finished renumbering streets in 2011 and after two and a half years of trial completely switched to the new system in January 2014.


The sentence doesn't make sense. Japan and (apparently formerly) South Korea use chronological rather than geographical ordering.


Yeah, obviously "chronological order" in the original sentence is a mistake for "geographical order".


> order in which they were constructed

Seems very impractical. What's the benefit?


It avoids having to renumber existing buildings when an area is built up heavily over time. It's kind of like an unsorted array vs. a sorted array: the unsorted one is a pain to search, but you can just slap things onto the end without a worry.
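
To make the array analogy concrete, here is a minimal Python sketch. The function names and the idea of storing positions in meters are illustrative assumptions, not a description of any real addressing system.

```python
import bisect

def add_chronological(buildings, position_m):
    """Append-only: the new building gets the next number; nothing is renumbered."""
    number = len(buildings) + 1
    buildings.append((number, position_m))
    return number

def add_geographical(positions, position_m):
    """Keep positions sorted; the house number is the rank along the street."""
    index = bisect.bisect_left(positions, position_m)
    positions.insert(index, position_m)  # every later building shifts up by one
    return index + 1

chrono = []
for pos in (500, 20, 260):
    add_chronological(chrono, pos)
# chrono is [(1, 500), (2, 20), (3, 260)]: numbers 1 and 2 are 480 m apart

geo = []
for pos in (500, 20, 260):
    add_geographical(geo, pos)
# geo is [20, 260, 500]: the building at 500 m went from number 1 to number 3
```

Appending is cheap and stable, but finding number 2 from number 1 means scanning the whole list; the sorted version is easy to search and forces renumbering on every insert.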


The sane solution for that is to just use "number of meters since start of street" as the building number. That way you can always fit new buildings between existing ones and get the bonus that if you are at building 100 and want to get to 200 you know that's 100 meters away. Usually you'd also do even numbers on one side of the street and odd on the other to make it easy to know which side to look at. This means you can't have doors closer than 2 meters apart which is usually ok. If you really need more doors then 100A and 100B are the usual hack.
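
A rough Python sketch of that scheme, assuming even numbers go on the "left" side and odd on the "right" (the side names and function names are made up for illustration):

```python
def house_number(distance_m, side):
    """Derive a house number from the distance in meters along the street.

    Assumption: 'left' gets even numbers, 'right' gets odd numbers.
    """
    base = int(distance_m) // 2 * 2  # round down to an even number of meters
    if side == "left":
        return base
    if side == "right":
        return base + 1
    raise ValueError("side must be 'left' or 'right'")

def approx_distance_m(number_a, number_b):
    """Two house numbers on the same street are roughly this many meters apart."""
    return abs(number_a - number_b)
```

Note that a door in the first 2 meters on the even side would get number 0 here, so a real scheme would need a rule for that case.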


In Florence, Italy, in the historical city center, we have a unique numbering system: each street has two independent series of building numbers, differentiated by color. Red numbers are for businesses, black numbers for houses. So for instance a restaurant could be located at number "23r" (r=red), while the standard "23" (black) can be hundreds of meters away on the same street.

I think there is currently no mapping system that handles this madness. Google Maps still does a decent job if you're looking for a specific place, because people have reported the exact gps positions of most businesses through user-reporting, but if you enter an address with a red number, you're unlikely to be correctly directed.

I guess the neural network knows nothing of colors...



Venice has some odd numbering scheme, too: http://en.wikipedia.org/wiki/House_numbering


I haven't read the paper yet, but I don't think this is even the biggest CNN inside Google. The NIPS 2014 Hinton/Dean paper talks about a single network trained for image classification for six months on a large number of cores.


This article is almost a year old (January 6, 2014).


I had thought they just used Captcha to "turk" it out to unwitting users. After all, they've been using house numbers in Captcha for a while.


It would have been good if the article explained the link between this work and house numbers appearing in reCaptcha. Training the network perhaps?


Personal experience suggests that they are also using CAPTCHAs for the same purpose. I wonder how that figures into the project.


Perhaps it doesn't. They might have just used that as a source for "things we know are too hard for everybody but us". They weren't presenting house numbers alongside dummy words like they did for the book solves, so it seems unlikely they were using it to train on unwitting human input.


I didn't appreciate doing this work for Google so I gave the wrong answer every time. It passed me through the captcha every time.


If it's a choice between doing work for Google and solving those meaningless random ones that contribute to nothing, I'll help Google any day.


That's a false choice.

The real choice is between doing Captcha work for Google and not doing Captchas.

In this respect, you're welcome.


Strange, I also tried giving a close but wrong answer a few times and it never accepted it.


Of course it lets you through; it can't check that answer (at that time).

But that doesn't mean you're the only one who got that sample, so they pick the most "popular" answer.
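
Picking the most "popular" answer could look something like the sketch below. This is only a guess at the general idea; the thresholds `min_votes` and `min_share` are invented for illustration, and nothing here is known about how reCAPTCHA actually aggregates answers.

```python
from collections import Counter

def consensus_answer(answers, min_votes=3, min_share=0.6):
    """Return the most common answer if enough users agree, else None.

    min_votes and min_share are hypothetical thresholds.
    """
    if not answers:
        return None
    counts = Counter(answer.strip() for answer in answers)
    top, votes = counts.most_common(1)[0]
    if votes >= min_votes and votes / len(answers) >= min_share:
        return top
    return None  # not enough agreement yet; show the sample to more users
```

With this, one user deliberately typing a wrong number is simply outvoted once a few other people have seen the same image.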


It is likely this was used to form the training set of 200,000 observations that was used to train the network.


The article just skims over this part:

"To start off with, Goodfellow and co place some limits on the task at hand to keep it as simple as possible. For example, they assume that the building number has already been spotted and the image cropped so that the number is at least one-third the width of the resulting frame. They also assume that the number is no more than five digits long, a reasonable assumption in most parts of the world."

This seems like a huge task. Someone has to go through all the thousands of images and first crop them? During that time, it would seem like they could just input the number into a database.

Maybe I'm missing something, but I read the "cracked" part to be a totally automated system that scans all the pictures and pulls the numbers with no human manipulation.
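
For reference, the two limits in the quoted passage are simple enough to write down as a check. This is just a sketch of the stated assumptions, with hypothetical names and pixel widths as inputs; it is not code from the paper.

```python
MAX_DIGITS = 5  # "no more than five digits long"

def meets_assumptions(number_width_px, frame_width_px, label):
    """True if a cropped sample satisfies the two limits quoted from the article."""
    # "the number is at least one-third the width of the resulting frame"
    wide_enough = 3 * number_width_px >= frame_width_px
    short_enough = label.isdigit() and len(label) <= MAX_DIGITS
    return wide_enough and short_enough
```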


Of course cropping is also automated, but using a different algorithm.

Text detection and text recognition are different problems. Text detection is usually solved by the stroke width transform. The article focuses on text recognition using the neural network.


[deleted]


Perhaps the neural net that Google uses to choose research areas decided to learn how to locate human dwellings?



