Improving the Hacker News Ranking Algorithm (felx.me)
122 points by manx on Sept 3, 2021 | 54 comments



I'd be happy to see some experiments with ranking algorithms, maybe on a separate site, or as options within HN, as someone else suggested. There are downsides to both, though: potentially low exposure for the first, technical complexity for the second.

What worries me is the definition of quality you use. We look for submissions that we find valuable to us, not necessarily high quality. Interests are varied, and we all get value from different things. Quality may be highly correlated with value, but it's not quite the same. And here comes the big issue: maybe more than 50% of the value we derive from HN comes from the comments. I feel that many times we upvote submissions out of sheer relevance, so we can have valuable discussions about a relevant topic. I don't think the given metrics are capturing this. I like the analysis and the proposal, but it's easy to see that some very important perspective is missing.


> [I]magine two submissions with the same number of upvotes and a different number of views. The article with more views probably has a lower quality than the one with fewer views, because more people have viewed it without giving an upvote. So the ratio of upvotes and views is a signal of quality.

I believe this creates a bias against longer articles: if a submission links to a longer piece, it could take a user a long time to come back to HN and upvote it.

That's not to say this would be any worse than the current algorithm. By definition, a time-ranked frontpage and moving discussion will always favor shorter articles, or, if the content is long, produce plenty of comments that are only superficially related to the article.

As a submission ages, its rank wanes, and discussion gets more sparse: a piece that takes time to read, understand and digest will probably perform worse than a short one, since the discussion and votes will take place further into the future.


Absolutely true. We thought about it, but don't have a solution yet. Maybe there's also a balancing feedback-loop approach to it: something like a longer click-to-upvote delay leading to a higher rank.


My probability classes from college finally see some use!

What I would try to do is calculate the expected value of the number of upvotes from the people who have seen it and use that as the metric.

So for any post there should be some number of views and some number of votes. The amount of time from when a user clicks the link to when a user clicks the upvote forms a distribution, because each user is different.

Then you treat that distribution as the probability distribution for users who have clicked but haven't upvoted yet, and calculate the expected number of upvotes using it.

This tries to mitigate the fact that there's a bit more delay in upvotes when reading longer posts.

This doesn't solve the problem of having too few upvotes to get a reliable distribution, or, relatedly, that it gives no advantage to longer posts right after they are posted. You can address this with something called a prior distribution, and maybe by scraping submitted links to determine their length.

Another small issue is users upvoting an article after taking a break from reading, or users who are already familiar with the link and don't even need to read it before upvoting. Remember, we're trying to estimate how long it takes users to read the article, and in those cases the delay doesn't really measure that. So you could try some filtering or transforming before putting data points of that type into the distribution.
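
To make this concrete, here's a minimal sketch of the idea. Everything here is made up for illustration (the delay data, the click log, and the assumption that every click has the same underlying upvote probability):

    import bisect

    def delay_cdf(sorted_delays, t):
        # Empirical CDF of click-to-upvote delays (in seconds),
        # built from users who did end up upvoting.
        return bisect.bisect_right(sorted_delays, t) / len(sorted_delays)

    def expected_upvotes(click_times, votes_so_far, now, sorted_delays):
        # For each click, the chance that a would-be upvoter has already
        # voted by `now` is F(now - click_time).
        exposure = sum(delay_cdf(sorted_delays, now - t) for t in click_times)
        if exposure == 0:
            return 0.0  # too early to tell; this is where a prior would help
        rate = votes_so_far / exposure      # delay-adjusted upvotes per click
        return rate * len(click_times)      # expected eventual upvotes

    # Toy data: 100 clicks over 10 minutes, 5 votes so far, and site-wide
    # delays showing most people vote within ~20 minutes of clicking.
    delays = sorted([60, 120, 300, 600, 900, 1200] * 50)
    clicks = [i * 6 for i in range(100)]    # click times, seconds after posting
    print(expected_upvotes(clicks, votes_so_far=5, now=600, sorted_delays=delays))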


I'd be happy to code up a demo if there's any interest!


+1! I've been working for some time on a dynamic ranking model for a crowdsourcing feed. Would love to see other examples.


Super interesting! I'd love to see what you come up with.


I would be OK if the algorithm ranked submissions according to value per time spent reading rather than just total value.


Excellent analysis and I think the proposed algorithm would work well against clickbait titles. HN could trial several different algorithms over a period of a week each and watch the metrics. Thinking out loud:

Part of HN’s charm is that niche content can surface to the top, and many here are experts in their field, contributing great insights unavailable in normal media (example: pilots discussing possible causes after an airplane crash). That means the rest of us get to learn something new, with perspectives from the best, and many newcomers get drawn into the topic. That’s what top university classrooms feel like. I’m not sure what metric would quantify that quality (a plain comment count as a metric encourages rage posting; maybe the number of comments with high upvotes?). The current system, although game-able and full of false negatives, also has entropy built in: randomness that is more likely to bring a wider range of content to the front page. Randomness like that helps genetic algorithms, so it might be a benefit here too, one that would be missed if not quantified. And if gaming the system counts, the most persistent also win some extra.

Another option to consider is duration of stay on articles. From testing, it seems Google factors duration of stay into its search ranking as well, which skews rankings towards longer pages. Those also tend to be ad-supported and likely within Google’s own advertising network. Within HN, more comments increase duration of stay, which would likely promote provocative content, so it may not be a good idea, but testing would surface that and more.


Have you tested the behaviour of your algorithm with pagination and the dynamics of link position over a day?

I assume the good thing about the snowball effect is that "making it to the frontpage" gives you guaranteed and pronounced exposure over a day or two - even though it's not maximizing the number of good links shown.


Right, and that "pronounced exposure" is actually super important! It takes time for interesting discussions to form in the comments, as new people read the story and jump in to share their thoughts. That doesn’t happen when stories disappear too quickly.


Our simple simulation had a 1-second time resolution, so it evaluated the formula for all link positions over the submission's lifetime. But there was no pagination.

We're working on a more accurate simulation that will have many features of real Hacker News, like pagination, the new-page and so on.


I'm thinking again about your question. I think you're talking about the time span a submission is visible on the frontpage, right? This can be controlled by the age exponent in the formula and (in our proposal) by the upvote threshold between the new-page and the front-page. If only a few very-high-quality posts make it to the frontpage (because of an upvote threshold), they will get more exposure.
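
For intuition, here's a minimal sketch of how such an age exponent behaves. The formula shape and all constants below are illustrative guesses, not the actual numbers from our proposal:

    def rank_score(upvotes, views, age_hours, view_penalty=0.8, gravity=1.8):
        # Quality estimate: upvotes relative to (penalized) views.
        quality = upvotes / (max(views, 1) ** view_penalty)
        # A larger `gravity` makes stories sink off the frontpage faster;
        # a smaller one keeps them visible longer.
        return quality / ((age_hours + 2) ** gravity)

    # The same story at different ages: gravity controls how fast it sinks.
    for age in (1, 6, 24):
        print(age, rank_score(upvotes=50, views=400, age_hours=age))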


Sounds good to me. I've had very little engagement (good or bad) on things I've submitted in the past that I thought should have been interesting to at least a few people. It seems a lot of content falls through simply because it isn't timed right for somebody to push the upvote button. I had one upvote on a link I submitted last weekend; I suspect nobody actually saw or clicked the link, and that the outcome wasn't based on any merit. I've also had links go without any action, only to see the same link submitted by somebody else make it to the front page much later.

Duplicate links are actually interesting, since they are so easy to detect (identical URL). Why not simply aggregate metrics across them? The important thing is for the link to make it to the front page without cluttering the site with duplicates; somebody submitting a duplicate would effectively count as an upvote for the already-submitted link. Views and duplicate submissions are possibly more significant signals than upvotes.

In search, precision and recall are the two metrics people use to judge quality. It's important to realize that Hacker News is effectively a ranking algorithm, and therefore a search engine, even though it delegates actual search to Algolia. It sacrifices recall for precision: everything on the front page should be relatively high quality, but at the price of potentially high-quality things never reaching it (recall).
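
In code, read in HN terms (this mapping is my own framing, not from the article):

    # "relevant"  = genuinely high-quality submission
    # "retrieved" = made it to the front page
    def precision(tp, fp):
        return tp / (tp + fp)   # share of frontpage stories that are high quality

    def recall(tp, fn):
        return tp / (tp + fn)   # share of high-quality submissions reaching the frontpage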

Of course, with only 30 slots on the front page, there's only so much that can be on it, especially when you consider that many users only drop by once or twice a day, so those slots stay occupied for quite a long time as well. Days, in some cases. The choice of what is right is highly subjective (i.e. the moderators decide) and biased towards the intentions of the site, and that's intentional. But that doesn't mean it can't be improved.


How about letting HN users choose which algorithm they want, the same way they choose their header color?


One really cool thing about Usenet readers (like slrn) back in the day was that you could write your own scoring rules for comments to sort and filter conversations.

I would probably go as far as saying that no web-based forum has come even close to producing an online group-discussion UX as good as what Usenet readers used to have.


Bring your own ranking with an SQL input in the profile! /s


Hmmm, maybe no "/s"? That sounds really cool, especially if users can explore each other's custom feeds. HN admins would get some useful stats on popular feeds. Hackers get to hack...

Would be an interesting experiment, maybe they've tried it before?


Well, the /s is for the security implications as well as performance.

You can imagine someone writing a clever query to extract hashed passwords or something from other tables. Also, with the amount of traffic HN gets, the front page is probably cached; running a query for every single request sounds expensive.

It would be very cool if possible tho.


I wonder if one could express a complex ranking algorithm in, say, YAML.

Or maybe send the entire list of links of the day to the client and sort it using JavaScript provided by the user


Doing it client-side could actually work.


> The low scores are a bit surprising because all submissions got enough votes to make it to the front-page.

The entire premise of this section is incorrect. There is no such thing as "enough votes to make it to the front-page". There's no single threshold, nor is the value of a single vote constant. You're also going to see a lot of other effects, like specific domains having score penalties, votes being ignored because they appear fraudulent or non-organic, or votes arriving too diffusely over a long time period. (Duplicates of the same URL count as a vote for something like 12 hours after submission, so it's quite easy for a link to pick up a long tail of votes even after dropping off the first page of /new.)

I would bet that most of what they claimed made it to the front-page never did. There used to be some HN stat-tracking sites around that had the full ranking history of submissions. Joining against that data would be a lot more credible.

> To achieve this goal, the new-page should expose every submission to a certain amount of views, to estimate its eligibility for the front-page.

That is obviously unworkable. /new is a slush pile, 90% of the submissions do not deserve even a single view based on the title. The proposal ensures that people visiting /new will only see the obvious garbage that nobody else has clicked on either.


You raised some important points.

The threshold of 2 upvotes is a necessary condition for being considered for the front-page (all pages, not just ranks 1-30), not a sufficient one. I have this information from dang himself.

Garbage on the new-page will only be viewed, but never upvoted, so it drops quickly because of the view- and age-penalties in the formula and makes room for other submissions. But of course, this needs real-world experiments.


The author's definition of "view" is "click-through". Posts that are obviously not worth clicking through based on the title / domain will get no views, and thus keep getting seeded back into the rotation.


Now I understand. You're right, that's a real disadvantage. We'll take that into account in our coming research. Thank you!


Are 'views' defined anywhere in this writeup? I couldn't quite figure out what the added ranking parameter 'views' exactly is and how it's measured.


I'm sorry that wasn't clear enough. Our current working definition can be described as:

We define a view as a click by a registered user on a submission, so the user sees the submitted content. For Ask HN, the content would be the comments page.

I'm adding it to the article right now.


I guess I don't entirely understand how this works. It sounds like you're introducing a parameter you don't actually have: you don't know what the clickthroughs are, how they're distributed, etc., so this is entirely synthesized. Wouldn't you be able to simulate your way to more or less anything this way?

Seemingly worse for practical purposes, I don't think this data actually exists - adding clickthrough tracking to HN would be a huge change to the privacy profile of the site.


You are right. Our assumption for now was that clickthroughs are distributed like upvotes, but we don't have any data backing this up. That's why we need to run real experiments. And this touches on your second point: even on the real HN site, we don't have this data and don't want to track our users more than we already do. We're not entirely sure about it yet, but we could use an approximation instead, for example how long a story was shown at different ranks.
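
A minimal sketch of that approximation (the traffic and click-through numbers below are invented; the real ones would have to be measured):

    # Approximate views without click tracking:
    # views ~= sum over time of (site traffic) * (click-through rate at rank)
    def estimated_views(rank_history, pageviews_per_min, ctr_by_rank):
        # rank_history: list of (minutes_spent_at_rank, rank) pairs
        return sum(minutes * pageviews_per_min * ctr_by_rank.get(rank, 0.0)
                   for minutes, rank in rank_history)

    ctr = {1: 0.10, 3: 0.05, 10: 0.02, 30: 0.005}   # guessed CTR falloff by rank
    history = [(30, 1), (60, 3), (120, 10), (240, 30)]
    print(estimated_views(history, pageviews_per_min=50, ctr_by_rank=ctr))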

All in all, we got some very good feedback from HN and are incorporating it into our research. So thank you, this is valuable.


I noticed that later submissions often had higher scores. Could this be a function of HN getting more popular over time?


Which data or table are you referring to? "Later" in the list of scores when submitted multiple times (first table)? Note that this is a 30-day window.

Growth of the whole community can be seen in the kaggle notebook: https://www.kaggle.com/felixdietze/hacker-news-score-analysi...

If you're referring to the front-page on HN, it could have been submissions of the second chance pool: https://news.ycombinator.com/item?id=26998308


I have been thinking it would be interesting to do the following experiment: pick two random new posts and show them directly on the home page for a fixed amount of time. Feed the data from those posts (views, votes, content, comments) into some neural network that can magically learn what kinds of posts generate the most interest.

Showing random new posts on the home page gives the experiment set equal viewership while removing the age bias. Many candidate posts will likely be low quality, but we can expect people to filter them out and participate less in them. High-quality posts will organically attract people's attention, and the algorithm can learn over time which factors separate lower-quality from higher-quality posts.
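
A minimal sketch of the data-collection side (the slot count and exposure window are arbitrary choices):

    import random

    EXPERIMENT_SLOTS = 2           # frontpage slots reserved for random new posts
    EXPOSURE_SECONDS = 30 * 60     # fixed, equal exposure per experimental post

    def pick_experiment_posts(new_posts):
        # Uniform sampling, so early votes give no post a head start.
        return random.sample(new_posts, min(EXPERIMENT_SLOTS, len(new_posts)))

    def record_outcome(post_id, views, votes, comments):
        # One training example for whatever model later scores new posts.
        return {"post": post_id, "views": views, "votes": votes,
                "comments": comments, "exposure_s": EXPOSURE_SECONDS}

    print(pick_experiment_posts(["a", "b", "c", "d"]))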


That definition of interest is basically what Reddit uses, and in the end I believe the kind of manual curation the HN mods do ends up fostering a healthier and more interesting community. Neural networks fed votes, views and comments tend to promote outrage-inducing content.


I believe Reddit, and currently even HN, do not account for age bias. Upvotes have a cascading effect, and posts that happen to get a few votes early tend to outcompete all other posts. What I'm suggesting is to have a small quota on the front page purely for experimentation. These experimental posts do not need to compete with other posts for visibility, and the outcomes can be used to calculate interest. Since the experimental posts only compete with posts within the experimental set, there is fairness with respect to age and viewership.


AFAIK things like this have been tried, but I can't find dang's comment about it.

Maybe he can report about these experiments himself.


Does the assumption of a homogeneous pool of content and a corresponding homogeneous pool of readers affect this? Segmenting by topic and applying the algorithm per pool might be necessary if one wants a "fair" assessment of quality.


That's a good question. Unfortunately, we don't have any data about the heterogeneity of the user pool, so we have to guess. In the simplified simulation, the agents have random expectation levels and compare them to the submission's quality. They only upvote if the quality exceeds their expectation.

We thought about modeling quality and agent preference with higher-dimensional vectors, like in a recommender system: to see whether a user will upvote a specific piece of content, you calculate the dot product. The simplified 1-dimensional user expectation and 1-dimensional quality can be seen as the scalar special case of that dot product.
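
In code, the idea looks roughly like this (the dimensionality and the distributions are illustrative, not what we simulated):

    import numpy as np

    rng = np.random.default_rng(0)
    dims = 8                                  # topic dimensions
    users = rng.normal(size=(1000, dims))     # agent preference vectors
    quality = rng.normal(size=dims)           # one submission's quality vector
    expectation = rng.normal(size=1000)       # per-agent expectation level

    # An agent upvotes if the match (dot product) exceeds its expectation.
    upvotes = (users @ quality > expectation).sum()
    print(upvotes, "of", len(users), "agents upvote")
    # With dims=1 this reduces to the scalar model from our simulation.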

There was a Master's thesis that ran some simulations with high-dimensional vectors to find out how many dimensions make sense. The result was that it basically doesn't matter.

In the end, we have to make a real-world experiment with the HN community to see if it works as expected.


Why would that make a difference?


For example, the ratio of upvotes to views as a quality signal may have a different average level depending on the segment.


If you want the staff to see this, email them at the contact link below.


I did. dang said I should post it and let the community discuss it.


An obligatory link to an article by Evan Miller about ranking: https://www.evanmiller.org/how-not-to-sort-by-average-rating...

However, to get the following requirement from the article:

> The algorithm should not produce false negatives, the community should find all high-quality content.

it might be better to estimate an upper confidence bound, as in the Upper Confidence Bound (UCB) bandit algorithm, rather than the lower confidence bound.
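
A minimal sketch of both bounds, using the Wilson score interval from that article (z = 1.96 is the usual 95% choice):

    from math import sqrt

    def wilson_bound(upvotes, views, z=1.96, upper=False):
        # Lower bound is pessimistic (Evan Miller's "how not to sort");
        # the upper bound is optimistic, like a UCB bandit, so unproven
        # posts get the benefit of the doubt instead of false negatives.
        if views == 0:
            return 1.0 if upper else 0.0
        p = upvotes / views
        denom = 1 + z * z / views
        center = p + z * z / (2 * views)
        margin = z * sqrt((p * (1 - p) + z * z / (4 * views)) / views)
        return (center + margin) / denom if upper else (center - margin) / denom

    print(wilson_bound(50, 500))             # established post: lower bound
    print(wilson_bound(1, 3, upper=True))    # fresh post: optimistic bound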


One of these days, I am going to write a custom front end.

The biggest problem I have with the front page these days is that it gets clogged with junk science and low-effort news posts that produce entirely predictable flame wars that just aren't very interesting. There are about two dozen domains that account for 95% of such stories. I go through and flag/hide them now, and that works okay, but doing it automatically would be better.


Also, the ranking algorithm should take into account the rank of the content when the upvote happened: 10 upvotes for content on the 5th page is much more impressive than 10 upvotes for content on the 1st page (keeping everything else constant).
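
A minimal sketch of what that could look like (the log weighting is a guess; any function that grows with how deep the post was buried would do):

    from math import log2

    def vote_weight(rank_at_vote):
        # An upvote cast from a less visible rank counts for more.
        return log2(1 + rank_at_vote)

    def weighted_score(vote_ranks):
        # vote_ranks: the rank the story held when each vote was cast.
        return sum(vote_weight(r) for r in vote_ranks)

    print(weighted_score([125] * 10))   # 10 votes from page 5: ~69.8
    print(weighted_score([3] * 10))     # 10 votes from rank 3:  20.0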


I would love to test-drive this via a webapp. Any plans for something like that?


That would be great. Maybe there's a hacker news clone we could adapt. In any case, our resources are limited, so any help is appreciated :)

Another option would be an experiment on HN itself for a week or so.


There are situations where posts hit the front page hours after dropping off the new page.

It has happened to me a couple of times as well. I can't explain it.


It happened to you = you noticed it? That could be related to the second chance pool, see

https://news.ycombinator.com/item?id=26998308


I typically browse via my rss reader / feed, so even though I’m here daily, I never see the front page (I do upvote on comment pages however).


How do you manage to cope with the overwhelming amount of uninteresting submissions? I tried following via RSS but found it generates too much noise; for it to be useful I would need filtering capabilities in my RSS reader and automatic highlighting for lists of terms on different topics. Any suggestions for a more advanced RSS reader?


I think the feed I use is the front page, but I don’t visit the front page (much) via a browser. When I click a link in my feed reader (Newsify iOS) it brings me to the comments page of the post.

Newsify is great, but it’s been crashing a LOT with the iOS 15 beta (not complaining, I know it’s a beta). It’s a great reader for iOS devices.


That isn't necessary; that's what hnrss.org is for. It will do searches and filtering by points.


Subscribe to less? You don't say what exactly you subscribed to, but there are several services that provide a programmatic feed of submissions with X votes or Y comments.


Although you could use rss.app to filter the feeds as well.



