Show HN: Chrome extension to avoid noisy domains in the Hacker News feed (github.com/mathiasrw)
27 points by mathiasrw on Aug 3, 2016 | 16 comments



From the source code:

Noise list generate with data from https://bigquery.cloud.google.com/dataset/bigquery-public-da... per november 2015

All domains with more than 2500 stories with more than 60% of stories having 3 or less votes

Sites with usergenerated content was manually filtered out.

[...]

Sites with usergenerated content:

  - blogspot.com
  - youtube.com
  - wordpress.com
  - tumblr.com
  - google.com
  - wikipedia.org
  - github.com
  - bit.ly
  - typepad.com
  - reddit.com
  - stackoverflow.com
  - quora.com
  - youtu.be
  - stackexchange.com

Result:

  - techcrunch.com
  - nytimes.com
  - arstechnica.com
  - wired.com
  - bbc.co.uk
  - wsj.com
  - businessinsider.com
  - forbes.com
  - cnn.com
  - venturebeat.com
  - mashable.com
  - theverge.com
  - thenextweb.com
  - cnet.com
  - washingtonpost.com
  - theatlantic.com
  - readwriteweb.com
  - gigaom.com
  - theguardian.com
  - economist.com
  - reuters.com
  - bloomberg.com
  - yahoo.com
  - guardian.co.uk
  - zdnet.com
  - engadget.com
  - slate.com
  - technologyreview.com
  - theregister.co.uk
  - posterous.com
  - bbc.com
  - gizmodo.com
  - npr.org
  - businessweek.com
  - itworld.com
  - fastcompany.com
  - huffingtonpost.com
  - telegraph.co.uk
  - networkworld.com
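
The selection rule quoted above (more than 2500 stories, more than 60% of them with 3 or fewer votes) can be sketched roughly as follows. This is only an illustration, not the extension's actual code: the function name `noisy_domains`, the `(domain, score)` input shape and the parameter names are all assumptions, and the real list was built from the BigQuery dataset with user-generated-content sites removed by hand afterwards.

```python
from collections import defaultdict


def noisy_domains(stories, min_stories=2500, max_score=3, min_share=0.60):
    """Return the domains that match the quoted rule.

    `stories` is an iterable of (domain, score) pairs (assumed shape).
    A domain qualifies when it has more than `min_stories` stories and
    more than `min_share` of them scored `max_score` votes or fewer.
    """
    total = defaultdict(int)  # stories per domain
    low = defaultdict(int)    # low-scoring stories per domain
    for domain, score in stories:
        total[domain] += 1
        if score <= max_score:
            low[domain] += 1
    return sorted(
        domain
        for domain, count in total.items()
        if count > min_stories and low[domain] / count > min_share
    )
```

With the default thresholds this mirrors the numbers in the quote; passing a smaller `min_stories` makes it easy to try on a toy sample.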


Hmm, filtering out wikipedia, blogspot and wordpress would eliminate some of the more interesting content.

Surprised medium isn't on here. I'm sort of done with them, signal-to-noise is just not very good.


  > Sites with usergenerated content was manually filtered out.
I read that as Wikipedia, blogspot and wordpress are not being included in "noisy domains" - which you can verify by looking at: https://github.com/mathiasrw/no-noise-hacker-news/blob/maste...

I agree with you about Medium though, I might go through the last several months and see if there were any legitimately interesting articles I read from Medium.

The extension is interesting but ultimately I think too set in stone to be generally interesting. I think it would have broader appeal if each user could edit the filtered domains by default.
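
A user-editable list would not need much. Here is a minimal sketch, written in Python for brevity rather than the extension's JavaScript; `is_blocked` and the idea of passing the list as a plain set are assumptions for illustration, not anything from the repository.

```python
from urllib.parse import urlparse


def is_blocked(url, blocklist):
    """Return True if the URL's host is a blocked domain or a subdomain of one.

    `blocklist` is any collection of bare domains such as "techcrunch.com";
    the user could add or remove entries freely.
    """
    host = (urlparse(url).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain) for domain in blocklist)
```

Matching on the dot-prefixed suffix means `www.techcrunch.com` is caught while an unrelated host like `nottechcrunch.com` is not.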


The author does say he's aware of this, and PRs are welcome... I like the idea of being able to flag certain domains as well... although I'm rarely looking at newest anyway.


Medium posts are almost invariably opinions. They're like the editorial pages for Hacker News. If you prefer hard news or you think that opinion pieces on Hacker News are susceptible to groupthink, filtering out medium posts makes sense.

OTOH, some people come to hacker news for opinion & debate, so medium posts keep getting voted up.


It's actually fascinating that medium didn't make it to either list. Anecdotally, a lot of new, user-generated content originates from there, and in my mind has overtaken other blogging sites like wordpress, tumblr, etc. as the preferred blog host for newer blogs.


I have a handful of these sites in custom uBlock filters that hide the links to the sites. The sites either require ad blockers to be disabled or require a login to use, so I'd rather not ever see them. I use a similar list to filter my QuiteRSS feeds. If anyone is interested: http://pastebin.com/T1gxeWtj


Hmm, I feel like a lot of these websites are good for the news feed. If something big happens in the news and HN wants to talk about it, I'd rather the featured article be from nytimes or wsj than some random person's blog.

I also think there is a total lack of consensus about what constitutes "noise" on HN or how bad the problem is. I find the political discussions here to be interesting, but I know a lot of people don't (they just want to talk about business and tech). I like talking about "hip" JS frameworks, but a lot of people here aren't web developers or aren't interested in this area of web development. I'm not a super "businessy" person, so a lot of "A merged with B and now their preferred equity is underwater" articles/discussions go over my head. I often find Medium articles thought-provoking, but some of the stuff posted there is garbage, and many HN readers have only been exposed to that side.

Perhaps a better way to attack this problem is to look at the topics commonly posted about on HN and find a way to filter by topic. There are precise/difficult and greedy/easy ways to go about this, but that might be the best way to satisfy people who have different opinions about what is "noise".
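
The "greedy/easy" end of that spectrum could be as simple as keyword matching on story titles. A rough sketch, where the topic names, the keyword sets and both function names are made up for illustration:

```python
import re

# Hypothetical topics and keywords; a real list would need far more care.
TOPIC_KEYWORDS = {
    "javascript": {"javascript", "react", "node"},
    "politics": {"election", "senate", "brexit"},
}


def topics_for(title, topic_keywords=TOPIC_KEYWORDS):
    """Return the sorted topics whose keywords appear as words in the title."""
    words = set(re.findall(r"[a-z0-9+#]+", title.lower()))
    return sorted(topic for topic, keywords in topic_keywords.items() if words & keywords)


def hide_story(title, muted_topics):
    """Return True if the title matches any topic the reader has muted."""
    return any(topic in muted_topics for topic in topics_for(title))
```

This will misfire on ambiguous words and miss anything not in the keyword sets, which is the trade-off against the "precise/difficult" approaches.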


> I am talking about the noise from big websites trying to push their content to HN hoping for a bigger audience.

Does this really happen? If so, it's hard to imagine even 10% of the sites listed actually doing this. It doesn't seem like there would be any overlap between "big websites" and sites where traffic from HN would be noticeable.


I am somewhat skeptical The Economist is pimping their content on HN hoping for a bigger audience.


We're thought leaders, dammit!


would that be ... uneconomic?


I'm not sure why "noisy" domains rated X would be any worse than other domains with the same rating X.

(rating does not equal # of points; hacker news takes a lot of other factors into account, resulting in a position on the front page and amount of time on the front page)

I'd trust the continuously tweaked hacker news algorithms. We do know that certain domains are penalized.

If you want to avoid the cruft, just use some sort of service that only shows you the best posts on hacker news. hckrnews.com is one that I like; use the filtering in the top right.


I think this is for browsing /newest where noisiness is a more significant factor


Typo and/or non-native speaker?

> See src/script.js for more details on how the domains where found.

Probably s/where found/were selected/.


Thanks - fixed...



