Well, I think it’s really hard to compare one search engine to another based on what happens behind the curtain. Search engine technology is a really complex matter. Any speculation done from the surface is probably a guessing game.
I will agree, though, that Hakia seems to be much closer to what they promise to deliver than Powerset. Hakia may have a bit of semantic flavor but remains overall a poor search engine. I always wonder what Powerset is doing with all the money they have raised. I would have felt terribly disappointed if I had given them my money. Building a search engine for Wikipedia (not even a good one) with all that money falls a little short.
I will take the opportunity here to express my reservations about semantic search. If semantic search is defined as a search engine that answers questions, here are two reasons why I think it is not a very promising direction for search:
1. It is hard for people in general to type an entire question. Users are generally lazy, and anything that makes them think is unlikely to catch on.
2. The language factor. Though the web is mostly written in English, it will be a challenge for these companies to implement semantic search in every language. From English to French there’s a whole new world.
I've never really paid attention to this field, but I just tried the sample queries from http://20bits.com/2008/05/12/powerset-launches-verdict-meh/ in Hakia. As far as I can tell, Hakia only answers the last question correctly without having to dig through the results.
To clarify, I didn't mean to say that the results returned by Hakia don't contain the correct answer. For Powerset and Google, the answer to many of the questions seemed to be right in the title or description of the results, without having to click through to them, and that wasn't the case for Hakia.
Don't be quick to dismiss Powerset's indexing speed. It is relatively easy to improve the efficiency of a known algorithm. Likewise, it is easier to develop an algorithm if efficiency constraints are deferred.
It is relatively easy to improve efficiency within one module by 10% and occasionally by a factor of four or more. If the average system performance improves by 10% with each change then a team could eventually improve efficiency by a factor of 2^10.
Yes, many small changes. Furthermore, this is a pessimistic case. If your average improvement was 12% then you'd require significantly fewer improvements.
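To put rough numbers on the compounding argument, here is a small Python sketch. It assumes every change multiplies overall efficiency by the same factor and that the gains compound cleanly, which real systems rarely manage. The function name `changes_needed` is made up for illustration.

```python
import math


def changes_needed(target_factor: float, gain_per_change: float) -> int:
    """Smallest number of changes whose compounded gain reaches target_factor.

    Assumes each change multiplies efficiency by (1 + gain_per_change),
    so n changes give (1 + gain_per_change) ** n overall.
    """
    return math.ceil(math.log(target_factor) / math.log(1.0 + gain_per_change))


target = 2 ** 10  # the factor of 1024 mentioned above

print(changes_needed(target, 0.10))  # 73 changes at 10% each
print(changes_needed(target, 0.12))  # 62 changes at 12% each
```

So under these assumptions, a factor of 2^10 takes about 73 changes at 10% each, and about 62 at 12% each.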
Hakia seems quite excellent, actually. I suspect the defaults are better than Google's -- words are much more 'closely bound'. When I ask a question, I seem to get back results in the form of an answer.
But we can't figure out just how useful Powerset's tech is, because it indexes so little of the web. Were I looking for an answer on Wikipedia, I'd have probably looked there anyway.
Indeed, even the best search engines are only going to be able to answer that question if the answer is stated somewhere. Hakia can do that. However, no search engine is going to be able to figure out the answer by, for example, adding up all the times he appeared in a given year. So if someone wrote somewhere "John McCain was a guest on The Daily Show seven times in 2008", Hakia should be able to give you an answer.
There's a bigger problem there. For the current year, the quantity could change. Old, incorrect values would outweigh new, correct values. So, you still wouldn't get the right answer.
"Who, what, when and how do" queries can be answered with current AI techniques but "how many" queries are much harder.