Media Darling Powerset vs. Non-Media Darling Hakia (whydoeseverythingsuck.com)
18 points by prakash on May 12, 2008 | 15 comments



Well, I think it’s really hard to compare one search engine to another based on what happens behind the curtain. Search engine technology is a really complex matter. Any speculation from the surface is probably a guessing game.

I will agree, though, that Hakia seems to be much closer to what they promise to deliver than Powerset. Hakia may have a bit of semantic flavor, but it remains a poor search engine overall. I always wonder what Powerset is doing with all the money they have raised. I would have felt terribly disappointed if I had given them my money. Building a search engine for Wikipedia (not even a good one) with all that money falls a little short.

I will take the opportunity here to express my reservations about semantic search. If semantic search is defined as a search engine that answers questions, here are two reasons why I think it is not a very promising direction for search:

1. It is hard for people in general to type an entire question. Users are generally lazy, and anything that makes them think is unlikely to work well.
2. The language factor. Though the web is mostly written in English, it will be a challenge for these companies to implement semantic search in every language. From English to French there’s a whole new world.


I've never really paid attention to this field, but I just tried the sample queries from http://20bits.com/2008/05/12/powerset-launches-verdict-meh/ in Hakia. As far as I can tell, Hakia only answers the last question correctly without having to dig through the results.


With all due respect, this is just not true. I just did the first few tests, and it seemed to nail all of them in the first answer or two in the list.


To clarify, I didn't mean to say that the results returned by hakia don't contain the correct answer. For powerset/google the answer to many of the questions seemed to be right in the title or description of the results, without having to click on the answers, and that wasn't the case for hakia.


I love how the first result for "2 + 2" on Hakia says that 2 + 2 = 5. http://www.hakia.com/search.aspx?q=2+%2B+2

Seriously, Hakia seems useful (if slow), and Powerset seems useless. It'll be a shame if marketing wins this game.


Don't be quick to dismiss Powerset's indexing speed. It is relatively easy to improve the efficiency of a known algorithm. Likewise, it is easier to develop an algorithm if efficiency constraints are deferred.

It is relatively easy to improve efficiency within one module by 10% and occasionally by a factor of four or more. If the average system performance improves by 10% with each change then a team could eventually improve efficiency by a factor of 2^10.


Uh, maybe I'm missing something, but how exactly do you go from 10% increases and end up with a 1000-fold efficiency increase?

Unless you're saying they can make 75 such changes, each giving a 10% increase...


Yes, many small changes. Furthermore, this is a pessimistic case. If your average improvement were 12%, you'd require significantly fewer improvements.
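To make the compounding arithmetic concrete, here is a minimal sketch. It assumes each change multiplies overall efficiency by the same fixed factor, which is a simplification; the function name `changes_needed` is made up for illustration.

```python
def changes_needed(target_factor: float, gain_per_change: float) -> int:
    """Count how many compounding changes are needed to reach target_factor.

    Assumes every change multiplies efficiency by (1 + gain_per_change),
    e.g. gain_per_change=0.10 for a 10% improvement per change.
    """
    if gain_per_change <= 0:
        raise ValueError("gain_per_change must be positive")
    total = 1.0
    n = 0
    # Keep applying improvements until the cumulative factor reaches the target.
    while total < target_factor:
        total *= 1.0 + gain_per_change
        n += 1
    return n


if __name__ == "__main__":
    print(changes_needed(2 ** 10, 0.10))  # changes needed at 10% each
    print(changes_needed(2 ** 10, 0.12))  # changes needed at 12% each
```

Under that assumption, a factor of 2^10 takes 73 changes at 10% each (close to the 75 estimated above) and 62 changes at 12% each.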


Hakia seems quite excellent, actually. I suspect the defaults are better than google -- words are much more 'closely bound'. When I ask a question, I seem to get back results in the form of an answer.

But we can't figure out just how useful Powerset's tech is, because it indexes so little of the web. Were I looking for an answer on Wikipedia, I'd have probably looked there anyway.


If we subject Hakia to the same (very limited) tests that the other article about Powerset was using, Hakia often performs even worse.

For example, here are Hakia's results for "Who is the President of France?":

http://www.hakia.com/search.aspx?q=who+is+the+president+of+f...


When I can get an answer (bonus if it is correct) to queries like 'How many times has John McCain appeared on the Daily Show?', I'll be impressed.

One weak spot in Google's armor is taking time into account, e.g., 'How many times did John McCain appear on the Daily Show in 2006?'


Indeed, even the best search engines are only going to be able to answer that question if it is stated somewhere. Hakia can do that. However, no search engine is going to be able to figure out the answer by, for example, adding up all the times he appeared in a given year. So if someone wrote somewhere "John McCain was a guest on the Daily Show seven times in 2008", Hakia should be able to give you an answer.


There's a bigger problem there. For the current year, the quantity could change. Old, incorrect values would outweigh new, correct values. So, you still wouldn't get the right answer.

"Who, what, when and how do" queries can be answered with current AI techniques but "how many" queries are much harder.


I have not looked at Freebase in a while, but theoretically they could, assuming all the info is populated in Freebase.


It'd be interesting if search engines allowed a rudimentary query language for stuff like that.
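As a rough sketch of what a rudimentary "how many" query could look like over structured facts: everything here is hypothetical, including the `Appearance` record, the `count_where` helper, and the placeholder rows, and no search engine is known to expose this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Appearance:
    guest: str
    show: str
    year: int


# Hypothetical placeholder records for illustration only, not real data.
FACTS = [
    Appearance("Guest A", "Show X", 2006),
    Appearance("Guest A", "Show X", 2006),
    Appearance("Guest A", "Show X", 2007),
    Appearance("Guest B", "Show X", 2006),
]


def count_where(facts, **conditions):
    """Count the records whose fields equal every given condition."""
    return sum(
        all(getattr(fact, field) == value for field, value in conditions.items())
        for fact in facts
    )


if __name__ == "__main__":
    # "How many times did Guest A appear on Show X in 2006?"
    print(count_where(FACTS, guest="Guest A", show="Show X", year=2006))
```

The point is that the count is computed by adding up individual records, so it does not depend on someone having written the total down somewhere.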



