I wish he provided more technical detail as to what differentiates this from every other attempt at exposing a general learning algorithm + NLP to the web.
Besides the Wolfram brand name, there's little reason to elevate this above "curiosity I'll watch out of the corner of my eye" status.
It sounded to me like "This problem requires both brute force and cleverness, and I have supplied both":
"But with a mixture of Mathematica and NKS automation, and a lot of human experts, I’m happy to say that we’ve gotten a very long way.... But I’m happy to say that with a mixture of many clever algorithms and heuristics, lots of linguistic discovery and linguistic curation, and what probably amount to some serious theoretical breakthroughs, we’re actually managing to make it work."
(The ellipsis covers several paragraphs in this case.)
The big stopper, in some sense, has been not the cleverness but the brute force required, because without the brute-force collection and translation there's nothing to be clever with. I, too, am in "wait and see" mode, but this could be valuable.
There is no mention of "learning algorithms" or anything of that nature in the post. It's a different approach:
"But if one’s already made knowledge computable, one doesn’t need to do that kind of natural language understanding.
All one needs to be able to do is to take questions people ask in natural language, and represent them in a precise form that fits into the computations one can do."
The NLP involved is quite different from the mainstream direction.
Actually, I'm not sure if the Wolfram brand name is a reason to pay attention to this, or to deliberately ignore it. It's being described as a cross between Mathematica and <i>A New Kind of Science</i>; the first is wonderful, the second is a new kind of stupid, and I'm not sure which one will win.
Good idea, but not likely. If ambition were a weak attractive force, like gravity, then ego would be a localized, strong repulsive force, like the electrostatic repulsion between two positive charges.
Estimating the ego of Wolfram and Lenat, it would take somewhere on the order of 10^9 ambitious geniuses for the net force of ambition to overcome the local force of ego.
I wonder if this is actually going to work, or if it's going to be like every other failed attempt at this kind of thing, i.e., amusing at first but far too limited to be genuinely useful.
The difference in approach is outlined in the following paragraphs:
"But what about all the actual knowledge that we as humans have accumulated?
A lot of it is now on the web—in billions of pages of text. And with search engines, we can very efficiently search for specific terms and phrases in that text.
But we can’t compute from that. And in effect, we can only answer questions that have been literally asked before. We can look things up, but we can’t figure anything new out.
So how can we deal with that? Well, some people have thought the way forward must be to somehow automatically understand the natural language that exists on the web. Perhaps getting the web semantically tagged to make that easier.
But armed with Mathematica and NKS I realized there’s another way: explicitly implement methods and models, as algorithms, and explicitly curate all data so that it is immediately computable."
(E.g., the crucial formula is: algorithms that compute further information from curated, trusted data.)
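To make that formula concrete, here is a toy sketch (nothing like Wolfram|Alpha's actual implementation; the `CURATED` table, `population_density`, and `answer` names are all made up for illustration) of computing *new* information from curated data instead of looking up text that already answers the question:

```python
# Hypothetical curated facts: country -> population and area in km^2.
# In the real system this curation step is done by human experts at scale.
CURATED = {
    "france":  {"population": 67_000_000, "area_km2": 643_801},
    "germany": {"population": 83_000_000, "area_km2": 357_022},
}

def population_density(country):
    """An explicit algorithm deriving a fact that is stored nowhere directly."""
    d = CURATED[country]
    return d["population"] / d["area_km2"]

def answer(question):
    """Map a (very constrained) natural-language question onto a computation,
    rather than searching documents for a pre-written answer."""
    q = question.lower()
    for country in CURATED:
        if country in q:
            if "density" in q:
                return f"about {population_density(country):.0f} people per km^2"
            if "population" in q:
                return f"{CURATED[country]['population']:,}"
    return "Sorry, I don't understand."

print(answer("What is the population density of France?"))
```

The point of the sketch: no web page needs to state France's population density anywhere; once the data is curated and computable, the answer is derived on demand.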
You do realise that industry has tons and tons of research that has never been published. Most academics I know have never searched the patent database, which is only the tip of the iceberg.
I'm reminded of the hype surrounding Cuil. So, does this system do anything for my mom? She's not really the math-y sort. From Wolfram's blog post, I wasn't able to determine what's in it for me. Or, are we way too early for that?
It seems to me like a cross between Wikipedia with curated data, Mathematica, and Google Calculator. You can ask it a wide variety of things, in natural language, and it is supposed to get the answer right fairly often. We'll see.
Why? Mathematica does not have access to magical algorithms that none of the rest of us know about, nor does it have any particular performance benefits when it comes to implementing specific algorithms. It is a great exploratory tool and an exceptional "generalist" in computer mathematics, but beyond that it does not offer much that can't be done better/faster by a couple of smart people writing some tight application-specific code.
So it seems to me that Alpha is going to try to create mathematical models from unstructured data found on the web and match them against similar models generated from the natural-language queries users enter. I'm sure probabilistic NLP methods are used extensively, and I can see how Mathematica might compute these quickly. But what the bleep is the role of NKS? From what little I know, I can speculate that he might have modeled statistical inference using CAs and can thus constrain the probability space of matches? Jebus, I need to either learn more or forget what I already know; anything in between is frustrating.
Exciting and interesting approach. Adding more structured data should help such a system improve over time. It also seems to have the right kind of mad scientists.