I hate to be negative, but I found the announcement post suspiciously devoid of any acknowledgement of the huge amount of work that has been done in this field over the last few years. Wolfram's post reads as if he's suddenly discovered a way to do this, when it's old hat. No mention of the results of the ILSVRC (ImageNet Large Scale Visual Recognition Challenge) contests held every year, or of ImageNet and all of the associated projects.
TL;DR
[quote]
Although it is clear that Wolfram is no crank, not someone skeptics would label a pseudoscientist, skeptics will notice that, despite his flawless credentials, staggering intelligence, and depth of knowledge, Wolfram possesses many attributes of a pseudoscientist: (1) he makes grandiose claims, (2) works in isolation, (3) did not go through the normal peer-review process, (4) published his own book, (5) does not adequately acknowledge his predecessors, and (6) rejects a well-established theory of at least one famous scientist.
[/quote]
Just tried it out with the same pics I used for Wolfram. For one of the previously wrongly identified pics, Clarifai produced better results. For another, it miscategorized it with the same tags as Wolfram.
Impressed to see that it doesn't necessarily get all the sample images provided on the site right - obviously would be tempting in a tool like this to pick images there that are guaranteed to be correct, but for example the image of a bicycle wheel disc brake gets classified as a 'bicycle chain', and the typewriter as a 'computer keyboard'. Good to demonstrate the failure modes, rather than attempt to project complete infallibility.
I also like the secondary classification it does of people, looking for a 'notable person' match.
R, Python and open software applications are going to be very fierce contenders in machine learning, image recognition, data mining and statistics. I wonder if closed source can compete with the 5000 packages available in R; in the future, more and more people are going to be able to build on the shoulders of giants.
Well, the philosophy for the Wolfram language is different. Basically, their approach seems to be "put everything imaginable in the standard library", which is kinda awesome, actually.
Discoverability of functionality in the system, and common ways to put things together, is an important topic. We (I'm an employee) are constantly improving our documentation system to make it easier to see what's there. With over a hundred thousand examples there's a lot of ground covered but more initiatives in this area are rolling out.
I gave it a pic of me, and it returned "person" - good.
I gave it a pic of my wife at a White House dinner, and it returned "construction" - a definite "What the heck?". I suppose this could have been a subtle political comment, but whoosh, I don't get it...
I gave it a pic I took of an F22 in a 9g turn (in afterburner) and it returned "afterburner" - surprised me that it ID'ed the aircraft's propulsion regime rather than calling it an aircraft, but nonetheless impressive. How do they get it to key on Mach diamonds?
Bookmarked. I'll check in occasionally to see how it's doing in its ascent to sentience.
[quote]
I'll check in occasionally to see how it's doing in its ascent to sentience.
[/quote]
Turing Test: Pair it up with a chat-bot algorithm on Skype and see if it can start conversations with random users based on their home decor. "Nice red leather couch you've got there."
Interesting. Tried out a few pics from my desktop and it got some right and a couple wrong but I could see why in one case (ottoman cushions identified as "containers", probably based on shape).
It would be really cool to see a refined use case for artifacts, e.g. detailed scanning and identification of sculpture by style and origin.
It seems like it doesn't work with illustrations (e.g. cartoons, logos, etc.). Also, lights seem to confuse the model; for instance, a picture of my son with a candle on a birthday cake was identified as "instrumentation" and a picture of me with a light source in the back was tagged as "light bulb".
I tried it with two images: one of a book in Spanish entitled Lo mejor de "Fantasy & Science Fiction", the other of my watch. It classified the first as a machine or something like that, and the second as a device. Nothing fancy, and very far from what they say in the blog.
Really cool. I tried it with a few photos and it did surprisingly well. I provided training info on the ones it got wrong; hopefully it keeps getting better.
Very impressive and I don't mind helping train it.