Karma-rank (Pagerank for social sites)
25 points by ntoshev on June 30, 2008 | 22 comments
There is a direct application of Pagerank to social sites like news.YC that seems to have a number of advantages over the typical "one user, one vote" scheme.

I won't explain how Pagerank works; there is a good explanation here:

http://www.ams.org/featurecolumn/archive/pagerank.html

Let's drop the "every vote counts equally" scheme. Say you want a global measure of how important each person's votes are to the site. This means assigning a number to every person, analogous to the pagerank number assigned to a web page. The votes a person has received on all his comments and submissions correspond to the inbound links of a webpage. The votes a person gives correspond to outbound links from a webpage. You want the authority each user gives to equal the authority each user receives (both via upvotes, let's ignore downvotes for now). Every user's importance gets divided equally among the submissions/comments that he has upmoded. These are exactly the rules Pagerank operates on, except that one user may have upmoded another several times and you need to account for downmods too.
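To make the analogy concrete, here is a minimal sketch of the upvote-only case. The function name `karma_rank`, the `(voter, author)` input format, the damping factor of 0.85, and the even redistribution of weight from users who never vote are all assumptions for illustration, not anything news.YC actually does. Repeated upvotes from one user to another are handled by counting them as a heavier edge.

```python
from collections import defaultdict


def karma_rank(upvotes, damping=0.85, iterations=50):
    """Pagerank-style weights for users (illustrative sketch).

    upvotes: list of (voter, author) pairs, one pair per upvote.
    Returns {user: weight}; the weights sum to 1.
    """
    # Count repeated upvotes so that upvoting someone several times gives
    # them a proportionally larger share of the voter's weight.
    out = defaultdict(lambda: defaultdict(int))
    users = set()
    for voter, author in upvotes:
        users.update((voter, author))
        if voter != author:  # assumption: self-votes are ignored
            out[voter][author] += 1
    if not users:
        return {}

    n = len(users)
    rank = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        # Users who have never upvoted anyone have no outbound "links";
        # their weight is spread evenly over everybody.
        dangling = sum(rank[u] for u in users if u not in out)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {u: base for u in users}
        for voter, targets in out.items():
            total = sum(targets.values())
            for author, count in targets.items():
                new_rank[author] += damping * rank[voter] * count / total
        rank = new_rank
    return rank


if __name__ == "__main__":
    votes = [("a", "pg"), ("b", "pg"), ("c", "pg"), ("pg", "alice")]
    for user, weight in sorted(karma_rank(votes).items()):
        print(user, round(weight, 3))
```

In the small example, a single upvote from pg lifts alice well above users a, b and c, who have received no votes at all. Downvotes are left out entirely here.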

What are the advantages this would bring to sites like news.YC? Such an algorithm would be pretty resistant to voting circles in the same way pagerank is resistant to spam (one user with lots of karma will weigh more than a lot of sockpuppets with no karma). It gives a global estimate of people's importance to the site and it should allow a site to preserve its culture better as it scales.

Drawbacks? Some people may object that it is unfair not to count votes equally. It is harder to implement (especially as you need to modify the karma-rank on the fly, although if this turns out to be hard, you can use an approximation and recalculate karma-rank daily, for example). Last but not least, it puts serious performance requirements on the site implementing it that Arc is probably not ready to handle.




I've thought of doing something like this. The computational load wouldn't be a problem, because weights could be calculated asynchronously and wouldn't have to be especially up to date. Maybe I'll try calculating but not using it, and see for myself if it produces better frontpage rankings than we'd have otherwise...


I'm thinking out loud here, and this has been mentioned before (although not really discussed), but how about allowing people to spend their karma?

I.e., if I think something is worth 20 karma I give it 20 of my karma. If I think something should lose 10 karma I have to pay 10 karma to remove it.

It sounds scary as it could introduce a lot of volatility. The main benefit however would be that people are a lot more judicious when voting as it comes at a price.

I'm not saying it would work but it would be bloody interesting. Obviously there would have to be a source of new karma or else we would experience deflation. It could be trialled for a month or two with an agreement to go back to the previous system and to restore previous karma levels should things go tits up.

Thoughts?

Edit: forgot to say that removes the need for any potentially large amounts of computation to calculate weightings.
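A rough sketch of the bookkeeping for the spend-your-karma idea above. The `KarmaLedger` class and its method names are made up for illustration, and it assumes karma given to a submission accrues directly to its author. It also shows where the deflation comes from: an upvote only moves karma around, while a downvote destroys karma on both sides.

```python
class KarmaLedger:
    """Toy ledger where votes are paid for out of the voter's own karma."""

    def __init__(self, balances):
        self.balances = dict(balances)

    def total(self):
        return sum(self.balances.values())

    def spend(self, voter, author, amount):
        """Positive amount is an upvote, negative is a downvote.

        Either way the voter pays abs(amount) of their own karma.
        """
        cost = abs(amount)
        if cost == 0 or voter == author:
            raise ValueError("vote must be non-zero and for someone else")
        if self.balances.get(voter, 0) < cost:
            raise ValueError("not enough karma to cast this vote")
        self.balances[voter] -= cost
        self.balances[author] = self.balances.get(author, 0) + amount

    def mint(self, user, amount):
        """The 'source of new karma' needed to offset deflation (policy left open)."""
        self.balances[user] = self.balances.get(user, 0) + amount


if __name__ == "__main__":
    ledger = KarmaLedger({"me": 50, "you": 30})
    ledger.spend("me", "you", 20)    # upvote: karma moves, total unchanged
    print(ledger.balances, ledger.total())
    ledger.spend("me", "you", -10)   # downvote: both sides lose 10
    print(ledger.balances, ledger.total())
```

How new karma gets minted (per day, per submission, per login) is the real design question and is deliberately not answered here.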


Allow karma transfer here and we have the beginnings of a reputational economy...


Exactly. Do you think it would work? I do. I'm happy for the users here with the strongest reputation to have the largest influence.


What you would need to commit to is a frontpage that is personalized, which might not be exactly what you are really looking for. There is some benefit to having a "shared" sense of community. The most recent design struggle I have wrestled with related to this is preventing an echo chamber of sorts from developing. A few quick observations related to how HN works now are that you would probably need to make the downvote more readily accessible to every user and might gain some benefit from "seeding" the network a la Advogato's initial trust sources.


Pagerank sets global weights; you won't get a personalized homepage if you implement just that.


Strict pagerank does set global weights (necessitating the "seed" values that I mentioned in another comment), but this is not a requirement of flow-based weighting systems. If you implement a canonical pagerank algorithm then you will get a global view based upon what certain people feel is important/worthy, and as you expand the set of seed users you will end up with complete personalization. There is a continuum here and deciding where to reside within this range has some pretty large effects on the nature of the site.

Google determines global pagerank because they lack the computational horsepower to do otherwise and because web pages and users are not similar entities, but on a site like this where the links between users and articles are effectively a single measurement this sort of individual pagerank is possible.
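One way to picture the continuum between a global and a personal ranking is the standard "personalized pagerank" trick: instead of restarting the random walk at any user, restart it only at a chosen seed set. This is a sketch under that assumption, not necessarily the flow-based scheme the comment has in mind; `seeded_rank` and its parameters are hypothetical names.

```python
def seeded_rank(upvotes, seeds, damping=0.85, iterations=50):
    """Pagerank over the upvote graph, restarting only at `seeds`.

    upvotes: list of (voter, author) pairs, one pair per upvote.
    seeds:   users whose point of view the ranking should reflect.
    Returns {user: weight}; the weights sum to 1.
    """
    seeds = set(seeds)
    if not seeds:
        raise ValueError("need at least one seed user")
    users = {u for pair in upvotes for u in pair} | seeds

    # weights[voter][author] = number of upvotes from voter to author
    weights = {}
    for voter, author in upvotes:
        if voter != author:
            targets = weights.setdefault(voter, {})
            targets[author] = targets.get(author, 0) + 1

    # All restart ("teleport") probability goes to the seed users.
    teleport = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in users}
    rank = dict(teleport)
    for _ in range(iterations):
        # Weight held by users who never vote flows back to the seeds.
        dangling = sum(rank[u] for u in users if u not in weights)
        restart = 1.0 - damping + damping * dangling
        new_rank = {u: restart * teleport[u] for u in users}
        for voter, targets in weights.items():
            total = sum(targets.values())
            for author, count in targets.items():
                new_rank[author] += damping * rank[voter] * count / total
        rank = new_rank
    return rank


if __name__ == "__main__":
    votes = [("pg", "alice"), ("bob", "carol")]
    print(seeded_rank(votes, seeds={"pg"}))   # carol is invisible from pg's view
    print(seeded_rank(votes, seeds={"bob"}))  # alice is invisible from bob's view
```

Seeding with every user gives a single global ranking; seeding with one user gives that user's own view of the site, and a handful of trusted seeds sits in between.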


Ideally it should be a front page personalized for PG, and we all should share his experience :)


I ran some experiments with this. The results are here:

http://news.ycombinator.com/item?id=232882


This would create a feedback loop. Users that exhibited a certain type of behavior would quickly amass a large amount of karma.

Unfortunately Pagerank is not inherently resistant to spam. To extend your metaphor, this would be good if we had a lot of "Ms. Wikipedia's" but bad if we had more than a few "Mr. Linkspam's."

It would discourage new users, as they would likely never catch up to early adopters. The A-List blogger phenomenon is a good example of this. Big, popular blogs stay big and popular. While possible, it is extremely difficult to break into the upper echelons of blogging.

It would also reward those who post and vote on topics that are mainstream, at the expense of diversity in the community.

The normalizing effect might be particularly well suited for some specific communities - financial advisers for example, or those dealing specifically with popular culture.


The social sites are much more dynamic than the web. This means, for example, that people high in the ranking like pg would upvote newcomers and they would instantly get a karma boost.

Pagerank is usually log-scaled because of these feedback loop effects.


>>It would discourage new users, as they would likely never catch up to early adopters.

Adding a time-based decay to the rank could address that. I.e. my vote for you 1 month ago is worth less than the vote 1 min ago...
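A minimal sketch of what such a decay could look like, assuming an exponential decay with a one-week half-life (both the shape and the half-life are arbitrary choices for illustration). The decayed weight could be used as the edge weight of a vote instead of counting every vote as 1.

```python
WEEK = 7 * 24 * 3600  # seconds


def decayed_weight(age_seconds, half_life_seconds=WEEK):
    """Weight of a vote cast `age_seconds` ago; halves every half-life."""
    return 0.5 ** (age_seconds / half_life_seconds)


def decayed_score(vote_times, now, half_life_seconds=WEEK):
    """Sum of decayed weights for votes cast at the given timestamps."""
    return sum(decayed_weight(now - t, half_life_seconds) for t in vote_times)


if __name__ == "__main__":
    minute, month = 60, 30 * 24 * 3600
    print(decayed_weight(minute))  # very close to 1
    print(decayed_weight(month))   # a small fraction of a fresh vote
```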


You can go one step further and build a recommendation system that tracks not only trust but individual preferences. Performance requirements are pretty steep for a real-time recommender, but it is totally worth it.

Oh, by the way, we have it implemented already on jaanix. For an open source implementation you can take a look at http://www.advogato.org/


Have you implemented the described system, a recommendation system, or both? If you have implemented a similar global estimation of how much every vote counts, I'd be interested in hearing how it actually affected your site.


It is both. Trust and recommendations are mixed together in an SVD-like algorithm as described in http://sifter.org/~simon/journal/20061211.html . On top of that jaanix lets you adjust all your personal preferences in real-time.

The biggest effect is that it is discouraging trolls and spammers, as there's no front page they can spam or troll. The downside is that the critical mass required to make the site truly social is a lot higher, and unfortunately we have not figured out how to gain enough users for it to really shine.
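For readers who have not seen the "SVD-like" approach, here is a rough sketch of the general idea: learn a small vector per user and per item by stochastic gradient descent so that their dot product predicts the vote. This is a common textbook variant, not jaanix's actual code; all names, the learning rate, the regularization constant and the +1/-1 encoding of votes are assumptions.

```python
import random


def predict(P, Q, user, item):
    """Predicted vote: dot product of the user and item factor vectors."""
    return sum(pu * qi for pu, qi in zip(P[user], Q[item]))


def mean_squared_error(P, Q, ratings):
    return sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings) / len(ratings)


def factorize(ratings, n_factors=2, lr=0.05, reg=0.02, epochs=200, seed=0):
    """ratings: list of (user, item, value), value +1 for upvote, -1 for downvote.

    Returns (P, Q): factor vectors keyed by user and by item.
    """
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    # Small random starting vectors; sorted keys keep runs reproducible.
    P = {u: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for u in users}
    Q = {i: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for i in items}
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


if __name__ == "__main__":
    votes = [("a", "x", 1), ("a", "y", 1), ("a", "z", -1),
             ("b", "x", 1), ("b", "z", -1),
             ("c", "x", -1), ("c", "z", 1)]
    P, Q = factorize(votes)
    print("training error:", mean_squared_error(P, Q, votes))
    print("b on y (never voted):", predict(P, Q, "b", "y"))
```

The prediction for a pair that never voted (b on y) is what a recommender would use to rank unseen items for that user.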


I agree recommendations are a way to take trust into account, but it is a different way with its own drawbacks. You wouldn't need a higher critical mass if you had just implemented a pagerank equivalent as opposed to recommendations.

You can make a page without recommendations the default, at least until you acquire critical mass. You can just have a "personalized" slider defaulting to no personalization.


I kid you not: "a website where users could point which users they trust, in a variety of different websites (like reddit, news.YC, OpenID handles, etc) , with the intent of creating a TrustRank, for people", was my proposal for YC this summer.

The idea is on the shelf. If someone wants to work on that, I'd be glad to join and exchange some ideas.


You would need voting data to make anything with the nice properties of pagerank.


As gaika mentioned, the mechanics of this are pretty easy if you want to do a simple pagerank equivalent. There are a few caveats though: pagerank is a positive weighting only so you would need to google up "yaprank" (which I believe is the system jaanix uses) to see an example of how to deal with downvotes, you need to deal with rank sinks and other ephemera of the pagerank system, and you need a network that is large enough to let the ranking system bring real benefits.

As someone who is working on a system like this that can also do realtime rank adjustments I can assure you that the mechanics are simple, the hidden details are subtle, and until you have a community that is active enough for this to make a difference it is hard to say that the effort is worthwhile.


The yaprank page is long gone. If somebody still needs this info, please contact me.


Pagerank has had to change many times to combat black hat SEO and other undesirable outcomes. It is especially vulnerable to gaming the system. It has other problems people have mentioned here.

Do you want to wage a constant arms race with the exploiters? Is that really the best use of your time at a startup? Are you gonna expend the amount of resources Google expends on it, when even Google still doesn't get it perfect?


Wow, I had a completely different idea based on the title:

Incorporating rankings from news-voting sites (digg, yc, reddit) directly into pagerank for search.



