Hacker News
Distributed karma: an idea for fixing recommendation systems (lkozma.net)
13 points by lkozma on Aug 11, 2007 | 18 comments



My cofounder and I used the same idea of users as nodes of a graph for http://oknotizie.alice.it (a system similar to reddit for Italian-speaking users) in order to identify groups of spammers and users with very strange behaviour. This information is used to decrease the weight of these bad users' votes in the system.

Our experience is that while this works very well against spam, it does not stop the quality degradation that happens every time the community gets larger, because the most active users tend to become friends and stop voting for stories purely on their quality.
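
A rough sketch of what the vote down-weighting might look like, in Python (hypothetical; the suspicion scores stand in for whatever the graph analysis flags):

    # Hypothetical sketch: scale each vote by how much we trust the voter.
    # suspicion maps user_id -> score in [0.0, 1.0], assumed to come from an
    # offline analysis of the vote graph (e.g. clusters of mutual upvoters).
    def effective_score(votes, suspicion):
        """Sum votes, weighting each one by (1 - suspicion of the voter)."""
        total = 0.0
        for user_id, value in votes:        # value is +1 or -1
            weight = 1.0 - suspicion.get(user_id, 0.0)
            total += weight * value
        return total

    votes = [("alice", +1), ("bob", +1), ("spammer42", +1)]
    suspicion = {"spammer42": 0.9}
    print(effective_score(votes, suspicion))  # 2.1 instead of 3.0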


The problem with giving someone a default high score based on what they've done in the past is that it devolves into a type of "appeal to authority" fallacy, where deference is given because of who someone is, as opposed to what someone is now saying. Trivial, offhand remarks by an authority figure are given greater weight than insightful, useful posts by an unknown. Even worse, out-and-out mistakes by the highly karmic come with an official stamp of karmic approval -- the whole system is prejudicial by design.

If your goal is to create a system that reflects people's typical judgement, then this works, because people make all sorts of logical errors. If, however, you are aiming for a meritocracy, judging each post on its own worth without regard to who said it (except when identity is actually applicable), the correct approach is to have a swarm of AIs reading everything and assigning points based on content. [1]

[1] Implementing the correct approach is left as an exercise for the reader.


What you say is true for comments; I agree that every contribution should be judged on its own merit. For filtering/ordering submissions though, such as on reddit's recommendation page, a web of trust is quite natural. Since we have limited time, we can't possibly read everything, so we might as well prefer content from people we trust. It's the same in real life, when you read the next book by your favorite author instead of one by a complete stranger. If the scores are distributed, the authority figures aren't necessarily the same for everyone.
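
As a sketch of what "distributed" could mean here (made-up graph and decay factor): each viewer ranks submissions by how close the submitter sits in their own trust graph.

    from collections import deque

    def trust_from(viewer, trust_graph, decay=0.5):
        """BFS over who-trusts-whom; trust halves with each hop away."""
        trust = {viewer: 1.0}
        queue = deque([viewer])
        while queue:
            node = queue.popleft()
            for neighbor in trust_graph.get(node, []):
                if neighbor not in trust:
                    trust[neighbor] = trust[node] * decay
                    queue.append(neighbor)
        return trust

    trust_graph = {"me": ["alice"], "alice": ["bob"]}
    submissions = [("story-1", "bob"), ("story-2", "stranger")]
    trust = trust_from("me", trust_graph)
    ranked = sorted(submissions, key=lambda s: trust.get(s[1], 0.0),
                    reverse=True)
    print(ranked)  # story-1 (bob, trust 0.25) ahead of story-2 (unknown, 0.0)

Since every viewer runs this from their own node, the "authority figures" naturally differ per user.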


Articles should also be read and ranked based on their content instead of their submitter. With books, you miss out on a lot by sticking with the same author, as there are so many other worthy books you never become aware of; people's usual habits can be improved upon.


Your issue regarding the appeal to authority is present in real life to a large degree, and not limited to websites.

What you're suggesting is to remove the human altogether, and I'm not sure how AI would adjust when the average quality drops, which is usually the issue when a site scales.


> Your issue regarding the appeal to authority is present in real life to a large degree, and not limited to websites.

Correct, which is why I mentioned the current situation works if you want a model that reflects typical human judgement, including bias and errors.

Removing humans is just a labor-saving feature. If you don't mind spending the time, then human editors can do a good job of it, just like with peer-reviewed journals. Make the score independent of the site's userbase size: e.g. from 1 to 10. No -47s or +581s like on reddit. Comments can also be kept in a useful range, instead of something blindingly obvious getting 80 points and something controversial getting 0 (+20,-20).
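
For instance, a made-up mapping that stays in a fixed 1-10 band no matter how many people vote, by scoring the fraction of positive votes rather than the raw total:

    def bounded_score(upvotes, downvotes):
        """Map the fraction of positive votes onto a fixed 1-10 scale."""
        total = upvotes + downvotes
        if total == 0:
            return 5.5               # midpoint before anyone has voted
        return 1 + 9 * (upvotes / total)

    print(bounded_score(20, 20))   # 5.5 -- controversial, not 0
    print(bounded_score(80, 0))    # 10.0
    print(bounded_score(8000, 0))  # still 10.0, regardless of userbase size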

The key is always the editors themselves. Quality reflects the average of the editors, which is why you see a regression toward the general population's mean when everyone is imbued with editorial ability. I would be a very good editor for a few subjects and not good at all for most others; multiply that by a million and the overlap gets you the current situation with various social news sites. Pick editors randomly and you get Slashdot. Pick editors carefully and you get Science or Nature.


Very good comment. I would give it a 9.5 if I could. ;)


Just out of curiosity, how much of a literature search are YC startups expected to do? In particular, how much of a literature search did reddit do?


There's no rule about it. It's good to know what you're talking about.


Two points:

First, it seems a question of scope is in order here: what exactly is the purpose of a recommendation system?

Is it a system that forwards to users what they want to see, or is it a system that surfaces a variety of high-quality opinions to users? Often, I think, we're trying to construct the latter by designing the former.

Secondly, Reddit's system works perfectly at forwarding to the user what they would like, based on what's been submitted - the issue being that the average quality of submissions has dropped over time. Even if the system gives you the best POS, you're still stuck with a POS.

The solution: Scaling is the problem, so stop/limit scaling.

We're not seeing a degradation of quality; we're seeing a better reflection of the average opinion - the larger the crowd, the lower the average. We're trying to impose on the many an expectation of quality held by a few; this is impossible! The many don't hold the same standards or opinions as the few.

You can tweak things a little, perhaps come up with systems that use more CPU power than spacecraft navigation does, but the end result will be the same: average opinion wins - exactly what you should expect.

Average opinion isn't what we want though, is it?


This is of relevance:

Haifeng Yu, Michael Kaminsky, Phillip B. Gibbons, and Abraham Flaxman, "SybilGuard: Defending Against Sybil Attacks via Social Networks." Proceedings of the ACM SIGCOMM Conference, September 2006.

http://www.cs.cmu.edu/~yhf/sybilguard-sigcomm06.pdf

In effect, this paper provides a way of "scaling" a trusted social network and minimizing the influence of sock-puppet accounts.


There is plenty of research into this; try a search on the ACM portal [http://portal.acm.org/] or Google Scholar for 'trust reputation recommendation network'.

Reddit could/should have been using this kind of approach for ages (I don't know if they have or not).


Academic work is hard to translate into a good implementation, though. For example, if you trust one user more than another user, you would see totally different karma point values from what any other user sees. That sounds really expensive to compute.
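
To make the cost concrete, here is a toy per-viewer trust propagation (a simplified personalized-PageRank-style iteration with made-up parameters, not anything reddit actually runs); the whole loop has to be rerun for every single viewer:

    def personalized_trust(viewer, out_edges, users, damping=0.85, iters=20):
        """Trust ranks computed from one viewer's point of view."""
        rank = {u: 0.0 for u in users}
        rank[viewer] = 1.0
        for _ in range(iters):
            new_rank = {u: 0.0 for u in users}
            new_rank[viewer] = 1.0 - damping   # mass teleports back to viewer
            for u in users:
                targets = out_edges.get(u, [])
                if targets:
                    share = damping * rank[u] / len(targets)
                    for t in targets:
                        new_rank[t] += share
            rank = new_rank
        return rank

    users = ["me", "alice", "bob"]
    out_edges = {"me": ["alice"], "alice": ["bob"]}
    print(personalized_trust("me", out_edges, users))
    # With N users this runs N separate times (once per viewer), so the naive
    # cost is O(N * edges * iters) -- and no two users see the same numbers.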


I imagine there are loads of variations on this theme (i.e. trust networks) that could produce useful recommendations without the need for unwieldy computation. Sure, there'll be a lot of number crunching; it's just a matter of choosing the right numbers to crunch, and when and how to crunch them.

Also, I'm just talking about delivering a good set of recommendations; anything else is gravy.


Right, but this sounds like a "beat your head through a concrete wall" sort of problem. Sure, you might be able to do it... but why? Is the value-add really that big?


It depends on the context you are using it in! For social news (e.g. reddit et al.), this sort of thing might be a great help in providing personalized recommendations.

Here is a fun recent paper from Google about 'Scalable Online Collaborative Filtering' for their personalised news service [http://www2007.org/paper570.php], which uses MapReduce to run Expectation Maximization. Via [http://www.datawrangling.com/google-paper-on-parallel-em-alg...]
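
Minus the MapReduce machinery, one ingredient of that kind of system, the co-visitation signal ("users who read X also read Y"), fits in a few lines (made-up click data):

    from collections import defaultdict
    from itertools import combinations

    def covisitation(click_histories):
        """Count, for each pair of stories, how many users clicked both."""
        counts = defaultdict(int)
        for stories in click_histories:
            for a, b in combinations(sorted(set(stories)), 2):
                counts[(a, b)] += 1
        return counts

    def recommend(story, counts, top_n=3):
        """Stories most often co-clicked with the given one."""
        scores = {}
        for (a, b), c in counts.items():
            if a == story:
                scores[b] = c
            elif b == story:
                scores[a] = c
        return sorted(scores, key=scores.get, reverse=True)[:top_n]

    histories = [["x", "y"], ["x", "y", "z"], ["x", "z"]]
    print(recommend("x", covisitation(histories)))  # ['y', 'z']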


This has been done a lot of times before... one of the simplest examples is the Advogato trust metric.

http://www.advogato.org/trust-metric.html
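
Advogato computes maximum network flow from a seed of trusted accounts; here is a toy version in the same spirit using networkx (the graph and capacities are made up, and the real metric decreases node capacities with distance from the seed):

    import networkx as nx

    def advogato_like(certs, seed, node_cap=3):
        """Accounts that receive a unit of flow from the seed are accepted."""
        G = nx.DiGraph()
        nodes = {seed} | {u for edge in certs for u in edge}
        for n in nodes:
            # splitting n into (n, "in") -> (n, "out") caps the trust that
            # can flow *through* one account, limiting sock-puppet cliques
            G.add_edge((n, "in"), (n, "out"), capacity=node_cap)
            G.add_edge((n, "out"), "SINK", capacity=1)  # each account claims 1
        for src, dst in certs:
            G.add_edge((src, "out"), (dst, "in"), capacity=node_cap)
        _, flow = nx.maximum_flow(G, (seed, "in"), "SINK")
        return {n for n in nodes if flow[(n, "out")].get("SINK", 0) >= 1}

    certs = [("seed", "alice"), ("alice", "bob"),
             ("mallory", "sock1"), ("mallory", "sock2")]
    print(advogato_like(certs, "seed"))  # {'seed', 'alice', 'bob'}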


Excellent. It's unfortunate that it's computationally impractical.



