
Date also plays a huge part. Most Stack Overflow posts get very few votes in the month they are posted but slowly accumulate them as people search for a problem and find an answer. The oldest posts have the highest scores, so something like "How do I change branch in git" will have 4000 points, but posting the same or a similar question today will get you a negative score.

Generally, very short questions score badly because they tend to be things like "How do I make a video sharing website", but a superficially similar question like "How do I copy the current line in vim" will score well as long as it's not a duplicate.

Both of those questions look similar when you just look at the words and sentence structure. To tell them apart you have to understand what a video sharing website is and what kind of task copying a line in vim is, so you can judge which one is a reasonable question and which one is likely to help more future readers.




I did try to rule out question age as a major source of error in the model by training on a sample of ~1 million questions all from ~2012. But it didn’t do any better on those.

And (theoretically) duplicates are deleted by the moderators.
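
For anyone who wants to try the same age control, here is a rough sketch of restricting the training sample to a single year. It is not the actual pipeline used above: the `creation_date` and `score` field names, the `sample_by_year` helper, and the ISO date format are all assumptions made for illustration.

```python
import random
from datetime import datetime


def sample_by_year(questions, year, k, seed=0):
    """Return up to k questions created in the given year.

    `questions` is assumed to be a list of dicts with a hypothetical
    "creation_date" field holding an ISO 8601 timestamp string.
    """
    same_year = [
        q for q in questions
        if datetime.fromisoformat(q["creation_date"]).year == year
    ]
    # If there are fewer than k matches, keep them all.
    if len(same_year) <= k:
        return same_year
    # Seeded RNG so the sample is reproducible between runs.
    return random.Random(seed).sample(same_year, k)


# Toy data standing in for a real question dump.
questions = [
    {"creation_date": "2009-05-01T10:00:00", "score": 4000},
    {"creation_date": "2012-03-14T08:30:00", "score": 3},
    {"creation_date": "2012-11-02T17:45:00", "score": -1},
    {"creation_date": "2020-01-20T12:00:00", "score": 0},
]
subset = sample_by_year(questions, 2012, k=1_000_000)
```

Holding the year fixed means every question in the subset has had roughly the same amount of time to collect votes, so whatever error remains is less likely to come from age alone.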



