
Part of the problem is that Google's PageRank algorithm is patented and the patent won't expire until 2017.

Google insists that its current search mix is based on a lot more than just PageRank, but PageRank probably still forms the foundation of their business. I don't see a way of competing with Google's results unless we are allowed to use something like PageRank, which we can't do unless we pay royalties or wait until the patent expires.

I'm not saying that PageRank is the be-all and end-all of search algorithms, and certainly someone somewhere could come up with a ranking method superior to Google's. But ranking pages by how many other pages cite them seems like a pretty fundamental insight, and it is where I would start with any new search engine.
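To make the "how many other pages cite them" idea concrete, here is a minimal sketch that ranks pages purely by inbound link count. It is not PageRank and not anything Google is known to run; the function name `rank_by_inbound_links` and the toy link graph are made up for illustration.

```python
from collections import Counter

def rank_by_inbound_links(links):
    """Order pages by how many distinct other pages link to them.

    `links` maps each page to the list of pages it links to.
    Ties are broken alphabetically so the result is deterministic.
    """
    inbound = Counter()
    for source, targets in links.items():
        inbound[source] += 0  # make sure pages with no inbound links still appear
        for target in set(targets):  # count each citing page at most once
            if target != source:     # ignore self-links
                inbound[target] += 1
    return sorted(inbound, key=lambda page: (-inbound[page], page))

# Hypothetical four-page web: "c" is cited by three pages, "d" by none.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
print(rank_by_inbound_links(toy_web))
```

A raw count like this treats every citing page as equally important, which is the weakness that link-weighted schemes such as PageRank try to address.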




PageRank is almost certainly only one signal used in a learned model. An important signal, but probably one feature among hundreds, if not thousands. It was critical in helping them overtake Yahoo in the '90s, but I doubt it is as essential these days.

What is the most important signal? Click-throughs. This is why any new search engine is at a massive disadvantage.


As you said, PageRank is just one of the many features used by the ranking algorithm. I highly doubt that there is any patent issue here (it's "just" an eigenvector computation), and there is a ton of literature and practical evidence that you can build very good web-search ranking technologies without PageRank. It would seem very surprising to me that one could patent a feature used in an algorithm for commercial purposes...
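For anyone wondering what "just an eigenvector computation" looks like, here is a rough sketch of the textbook power-iteration form of PageRank. This follows the commonly published formulation, not whatever Google runs in production; the function name `pagerank`, the damping factor of 0.85, the iteration count, and the toy graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Rank held by pages with no outbound links is spread evenly over all pages.
        dangling = sum(rank[page] for page in pages if not links.get(page))
        new_rank = {page: (1.0 - damping) / n + damping * dangling / n for page in pages}
        for page, targets in links.items():
            if targets:
                # Each page splits its damped rank equally among its outbound links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical four-page web.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

The repeated update converges toward the dominant eigenvector of the damped link matrix, which is why the whole thing can be described as an eigenvector computation. The hard part at web scale is the size of the graph, not the math.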


Two facts:

a) PageRank is just one of several hundred factors used for determining ranking. Search engine ranking is a lot more complicated than even most information-retrieval programmers tend to think.

b) There may have been a window of a few weeks or months in 99/00 where not every major search engine used some form of link-based ranking, but it was, as noted, a very brief period.

It always amazes me that people for almost 15 years now have believed in the myth of PageRank's uniqueness and power. I would like to challenge people to think about two things:

- how useful do you think a pure static ranking of tens of billions of web pages is? Think about what it represents. What does it mean to assign one page higher rank than another?

- do you really believe that other search engines would not have implemented PageRank or something similar? Do you really think that search engine designers do not read papers and apply every trick in the book that they can manage to implement in a scalable manner?

There are lots of hard problems you need to solve if you want to build a web scale search engine. If ranking becomes your biggest problem: that would be a luxury. The biggest hurdle today is money to buy computing power and storage. The time when you could build a competitive web scale search engine for regular startup-money is over. It has been over for close to a decade.

It saddened me greatly when Yahoo threw in the towel because it meant that search in the western world was effectively a two-horse race. And once you get off that horse there is no getting back on it again without some seriously heavy lifting.


It's not really a static ranking, though, is it? The PageRank is constantly recalculated when incoming links change.

I don't know what other search engines have implemented because none of them will let me see their code :-) I do believe if they could implement a link-based ranking system without fear of being sued by Stanford they would. I don't know how different a link-based ranking system would have to be from PageRank to avoid getting sued by Stanford, or whether Stanford litigates this when they suspect unlicensed use. I'm guessing they do sue to defend the patent, because Google's royalties for using the algorithm number in the hundreds of millions, a significant amount towards Stanford's endowment.


The PageRank patent is owned by Stanford; anyone can buy a license. And BTW, it isn't necessary or even that useful these days to compute it.


I'd thought that Google had an exclusive license, but I just checked and the exclusive license expired in 2011:

http://www.seroundtable.com/pagerank-patent-12731.html

So yes, anyone can license PageRank now.


Right. But that is the prohibitive factor for many startups. They can neither afford nor are willing to pay Stanford for PageRank.


The scientific literature shows that PageRank is not as good as it is cracked up to be, particularly with the TREC style of evaluation.

The issue is that PageRank is a general factor which doesn't have much to do with the question "is page A relevant for topic B?" If PageRank causes a popular but irrelevant page to rank above an unpopular but relevant page, it is part of the problem, not the solution.


I didn't realise this. Is this really enforceable? One of the tools that Moz sells is their PR proxy, opensiteexplorer. I wonder how they're able to replicate PR without running afoul of this?

Also, is PR really the best, only way to do this? Seems like there are all kinds of better (more modern) signals we could use other than links, which were kind of the only game in town 10ish years ago.


Moz uses the original PageRank research paper, published before Google incorporated. It is the original seed formula, which Google later modified and now claims it hardly uses.


Google pays Mozilla hundreds of millions a year to keep its default status in the search bar. Perhaps being able to use PR was part of the deal, or perhaps Mozilla is rich enough to pay royalties to Stanford. Or perhaps Google doesn't mind because Mozilla doesn't seek to directly compete with Google on text search.

Edit: Whoops. So none of this is true.


Moz (formerly: SeoMoz) != Mozilla


Whoops. Even so, Google might not mind because opensiteexplorer doesn't really compete with its core business.


Google hasn't used PageRank in its original form in many years. They have moved on from PR.



