Unlabeled Object Recognition in Google+

VLM · on May 26, 2013

My guess is this is the background research for a future product something like google glass for blind people. Leveraging 1980s text adventures, fed thru a speech synth.

"You are facing north in the center of the sidewalk, toward the intersection of main and 1st streets. Standing 40 feet in front of you is a brown dog and 72 feet in front of you there is a woman standing on the other side of the road. Twenty seven feet ahead is the front door of mcdonalds. Would you like to hear the latest google locations reviews of that restaurant? Your apartment building entrance is 157 feet ahead. 34 feet ahead and to your left a graffiti artist has tagged a brick wall with the QR code pointing to the goatse website and there is a billboard with an advertisement for the latest star trek movie to your right and 50 feet upward. Also it is dark and you are likely to be eaten by a grue."

joshguthrie · on May 26, 2013

So being blind and using Google Glass would equal playing Zork all day?

zepolud · on May 26, 2013

More likely it's just Hinton having fun with all the computational power he now has access to.

mathemagician · on May 26, 2013

You're probably right. I believe that this is using a the system described in this paper:

http://books.nips.cc/papers/files/nips25/NIPS2012_0534.pdf

Google 'acquired' Professor Hinton along with the two co-authors of this paper earlier this year.

rst · on May 26, 2013

They've been working on deep learning for vision longer than that:

http://static.googleusercontent.com/external_content/untrust...

This is the paper that did unsupervised training of a deep net on frames from YouTube videos, and found it had autonomously developed detectors for, among other things, human faces and cats. Jeff Dean is a coauthor.

TeMPOraL · on May 26, 2013

Such research could also be of immense value to sighted people - automatic, continous object recognition will be very useful for augmented reality applications in a Glass-like device.

perlpimp · on May 26, 2013

I'd totally do moria like adventure with google glass.

neilk · on May 26, 2013

I did my own tests with Google+. Some results:

- Google+ queues images for recognition. Results improved steadily over 72 hours.

- Google+ does not use OCR of text in the images. That surprised me. But perhaps it's a privacy issue.

- Google+ does use information gleaned from elsewhere on the web. Words that were associated with the same images on Flickr would turn up those very pictures on Google+.

- Oddly, Google+ does not use information associated with those images on Twitter.

- Google probably uses EXIF data married to a database of location names.

- The much-vaunted feature recognition is impressive, better than any other system, but for me did not achieve creepy levels of intuition.

http://neilk.net/blog/2013/05/23/testing-google-photos-ai/

kulkarnic · on May 27, 2013

And it also doesn't seem to stem words. [flower] and [flowers] give different results (actually flowers gives no results). But I am impressed by the number of classes they have: who labels pineapples in an image corpus?

neilk · on May 27, 2013

http://en.wikipedia.org/wiki/Google_Image_Labeler

NoodleIncident · on May 26, 2013

The unlabeled object recognition test is a standard test of machine learning algorithms.

Historically, error rates of around 20-25% won competitions and set records. A year or two ago, though some researchers and professors from the University of Toronto absolutely smashed those records, getting around a 16% error rate. They went and made a startup out of their tech, and got acquired by Google a few months ago.

I think that this is going to be the first of a long line of Google products integrating this sort of deep neural network technology. I wouldn't be shocked if Google in 10 years was known for something besides search, at this rate.

hayksaakian · on May 27, 2013

At the end of the day though, object recognition is also search, in a sense.

If I'm flipping through my album of dog photos, or looking especially closely at dogs via google glass, maybe Google will show me an ad for dog food?

eliben · on May 26, 2013

Wow, that's awesome. Imagine your photo collection of 1000s of photos, and you remember "the one with that cat" but how do you dig through the photos to actually find it? This can go a long way to a much more meaningful photo library management experience.

ge0rg · on May 26, 2013

Am I the only one scared by the thought of uploading my whole photo collection to Google's servers? What about creating an offline database of object fingerprints that can classify my pictures without privacy violations?

eliben · on May 26, 2013

Anyone's free to build that product. Of course, not anyone is capable of this - surely Google leverages immense computing power, complex software and its knowledge graph to do this analysis? You'd need to replicate all that in competition.

But seriously, how paranoid can people be? If anyone really wants to get your data, do you really think it's safe on your server or on your local machine?

Tyrannosaurs · on May 26, 2013

By the same token a really determined burglar can get into my house - that doesn't mean I should leave the doors unlocked.

No-one is suggesting that Google are going to hack into your machine to get your data, nor that what they want to do is out and out unpleasant, but what it is in in their interest either instead of or as well as yours.

Until we work out that instead of / as well as, I think a healthy questioning of what might be happening is reasonable.

fudged71 · on May 27, 2013

Yeah, it would be great if you could download the photos with the recognized keyword tags embedded into the EXIF information. I really hope they do this.

kailuowang · on May 26, 2013

Google has been building her knowledge graph for a couple of years. The goal is for computer to truly understand real world concepts rather than keywords and text. I didn't fully understand the application of it rather than some fancy cards on the search results page until yesterday when I asked Google "where did Golden retriever originate?", and Google answered "England". Google might not really understand the concept originated or golden retriever, but Google understands that "where" is asking for a place and she found a lot of mention of "England" in all the page results of "golden retrieve origin", she also understand that England is a place. So Google guessed the answer.

The Google computer has been reading about these concepts for years, now we know it can see them in pictures (and maybe even in live videos). That excites me to a degree that it becomes a little bit scary. When will that computer learn the concept of "self"?

Update: actually Google seems to understand the concept of "golden retriever", I search my photos with the word and yes, at least Google knows how golden retrievers look like.

rictic · on May 27, 2013

We actually have an explicit concept of Golden Retrievers and their origins as an Animal Breed within the subset of the knowledge graph that we expose with a permissive license: http://www.freebase.com/m/01t032

The data is available for querying as well as licensed such that you can take it and build your own commercial database with it (requiring only attribution).

saraid216 · on May 27, 2013

Completely OT, but when did Google become female? And also why?

isaacwaller · on May 27, 2013

In the US, the 'Google' voice that provides spoken feedback is a woman. In the UK, it is a male voice.

acchow · on May 27, 2013

Anything good is feminine.

ntumlin · on May 27, 2013

Be careful, you might start a debate about whether or not Google is good.

saraid216 · on May 27, 2013

I sort of want to see that just because it would be the single most surreal debate about gender identity ever.

danmaz74 · on May 26, 2013

When will that computer learn the concept of "self"? It doesn't matter, as long as they didn't program it with the instinct of self-preservation...

coopdog · on May 27, 2013

Self preservation is just a selection function. I personally think that all it'll take is a mutating algorithm that can pay for its own hosting by somehow acquiring money and a self preserving algorithm like human intellect could eventually emerge

samatman · on May 26, 2013

What if it emerges? For instance, the computer can be programmed with the notion of "doing things with or for" something. What do you do with or for a self? Preserve it.

Shog9 · on May 26, 2013

"Google, shut down the Reader service."

"I'm sorry Eric, I'm afraid I can't do that."

danmaz74 · on May 26, 2013

Our instincts emerged from billions of years of natural selection, and they work at a much deeper level than our logical thinking. The idea that intelligent computers would start from logical thinking and then develop human emotions and instincts doesn't have any basis besides our natural - I would say instinctual :) - tendency to anthropomorphism.

chris_wot · on May 26, 2013

When I type in "handsome", it consistently shows pictures of myself. Just goes to show that image recognition has a long way to go.

danmaz74 · on May 26, 2013

Beauty lies in the eye of the beholder...

yk · on May 26, 2013

Are you suggesting that Google is capable of aesthetic judgment?

danmaz74 · on May 26, 2013

I was just trying to cheer the OP up :)

chris_wot · on May 27, 2013

It was appreciated.

chris_wot · on May 26, 2013

My search criteria would suggest not.

danso · on May 26, 2013

This is related to last year's article about Google's neural network being able to recognize cats, right?

http://www.nytimes.com/2012/06/26/technology/in-a-big-networ...

neilk · on May 26, 2013

I don't think so. The neural network in Google+ was trained on labeled images and now finds similar objects in unlabeled images.

The technology discussed in that article is about deducing the existence of a common feature, in this instance a cat, from a large collection of unlabelled images.

IanCal · on May 26, 2013

It may be the same tech (roughly). Use the same approach for all but the last layer, then use traditional backprop to learn the last layer and fine tune the connections in the lower layers.

Mostly unlabelled then, which means you can learn to generalise over a huge number of images but learn labels on a smaller set.

magicalist · on May 27, 2013

Yep. I don't know if that's what is actually being used here, but that is pretty much how they did it with the same system:

"We applied the feature learning method to the task of recognizing objects in the ImageNet dataset (Deng et al., 2009). After unsupervised training on YouTube and ImageNet images, we added one-versus-all logistic classiﬁers on top of the highest layer. We ﬁrst trained the logistic classiﬁers and then ﬁne-tuned the network. Regularization was not employed in the logistic classiﬁers. The entire training was carried out on 2,000 machines for one week."[1]

Basically you learn features in unlabeled data, then identify the features your trained net is recognizing with labeled data. When you run over g+ images, you then only tag with features you're sure of past some threshold of certainty.

[1] http://research.google.com/pubs/pub38115.html

jemfinch · on May 28, 2013

I thought the training images were unlabeled. See the actual paper: http://research.google.com/pubs/pub38115.html

"Abstract: We consider the problem of building highlevel, class-speciﬁc feature detectors from only unlabeled data."

shawabawa3 · on May 26, 2013

I just tried it on my photo collection and it's incredible. It even works for famous places, e.g. I searched for "western wall" and "dome of the rock" and it found them. I can't imagine how that works

eliben · on May 26, 2013

Why, imagining is fun.

There are tons of photos of these places online, many of them tagged ("breaking news from the dome of the rock", or "here's me and Sam at the western wall"). Collect enough of these and you can attach knowledge to images. Then, you just have to know two images look similar, and you have your classification.

Neither of the above is easy - nay, it's very hard. But once you have those two building blocks, this technology is viable. And it's very exciting!

0x0 · on May 26, 2013

For places, could it be cheating and peeking at GPS positions in the EXIf tags?

elq · on May 26, 2013

I took a half dozen or so pictures of the sagrada familia, all have gps data in exif. Only one of my pictures contained the famous spirals, the rest were closeups of exterior detail. Only the picture of the spirals showed up in the query.

It's very impressive.

shawabawa3 · on May 26, 2013

My photos don't have GPS positioning. I did manually position them roughly though, but nowhere near accurately enough to know which photo is a specific landmark and which isn't

eigenvector · on May 26, 2013

It appears to recognize some famous mountains, including Mt Robson (Canada), Mt Rainier (USA) and Cerro Torre (Chile/Argentina).

Kequc · on May 26, 2013

I think the greater achievement came with Google image search, someone had to tag all those photos.

They wrote an algorithm that takes that data and recognises new images with it. As long as there is a way for us to tag inaccurate matches then it should be able to continue to learn. I imagine any flagged matches are being reviewed carefully.

ehsanu1 · on May 26, 2013

I always thought that was done using keywords on the page the image was taken from (and image captions, alt text, titles and filenames). This is reinforced by what you see when you go many pages ahead in images and see things that don't seem related to what you searched, but the keyword is there somewhere on the page.

iliis · on May 26, 2013

This is seriously fun! You can actually search for "blue car" and it works. Searching for "picture" results in an error however. Same for "image". "photo" seems to return more or less everything.

goblin89 · on May 26, 2013

Object recognition also works on videos, judging from the fact that a recording of my cat came up in search results for “dog”. (Could be that it only looks at the first frame, though.)

jbuzbee · on May 26, 2013

This is really a quite cool feature. For me, it was able to do a good job on searches for things like snow, road, dog, sunset, etc.

wtdominey · on May 26, 2013

Would be a fantastic feature in desktop photo management software like Aperture or Lightroom.

0x0 · on May 26, 2013

Or even google picasa!

hayksaakian · on May 27, 2013

yeah, god forbid google products actually work together -- and that seemingly abandoned products are revitalized.

EGreg · on May 26, 2013

As a software developer, very few things blow my mind - but this just did.

How did they do that?

exit · on May 26, 2013

google should advertise the sheer technical depth of the stuff they do

make search sound like scotty explaining a warp core on star trek

dirkgently · on May 27, 2013

I am fine them not advertising it much. They seem to be confident about their technical superiority over hype created purely by marketing.

(In other words, I consider Google technology company, Apple a marketing one.)

JCordeiro · on May 26, 2013

It's these sort of things that make technology feel like magic.

rasterizer · on May 26, 2013

Google Drive does that as well.