Hacker News
13ft – A site similar to 12ft.io but self-hosted (github.com/wasi-master)
618 points by darknavi 83 days ago | hide | past | favorite | 263 comments



Hello everyone, it's the author here. I initially created 13ft as a proof of concept, simply to test whether the idea would work. I never anticipated it would gain this much traction or become as popular as it has. I'm thrilled that so many of you have found it useful, and I'm truly grateful for all the support.

Regarding the limitations of this approach, I'm fully aware that it isn't perfect, and it was never intended to be. It was just a quick experiment to see if the concept was feasible—and it seems that, at least sometimes, it is. Thank you all for the continued support.


Apologies for submitting it here if it felt overwhelming. Hopefully the FOSS community here is supportive rather than overwhelming.

Thanks for sharing the project with the internet!


Running a server just to set the user agent header to the googlebot one for some requests feels a bit heavyweight.

But perhaps it’s necessary, as it seems Firefox no longer has an about:config option to override the user agent…am I missing it somewhere?

Edit: The about:config option general.useragent.override can be created and will be used for all requests (I just tested). I was confused because that config key doesn’t exist in a fresh install of Firefox. The user agent header string from this repo is: "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
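That header-swap trick is small enough to sketch with just the standard library. This is my own hedged approximation of the idea, not code from the 13ft repo:

```python
# Build a request that presents Googlebot's User-Agent string.
import urllib.request

GOOGLEBOT_UA = (
    "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z "
    "Mobile Safari/537.36 (compatible; Googlebot/2.1; "
    "+http://www.google.com/bot.html)"
)

def googlebot_request(url: str) -> urllib.request.Request:
    """Return a Request object claiming to be Googlebot."""
    return urllib.request.Request(url, headers={"User-Agent": GOOGLEBOT_UA})

# To actually fetch a page:
#   html = urllib.request.urlopen(googlebot_request(url)).read()
```

Whether a given site honors the spoofed header is another matter, as the comments below discuss.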


> set the user agent header to the googlebot one

Also, how effective is this really? Don’t the big news sites check the IP address of the user agents that claim to be GoogleBot?


This. 12ft has never ever worked for me.


I know one website it works well on, so I still use it, but yes, most others fail.


If you hosted that server on Google Cloud, you would already make it a lot harder to detect.


https://developers.google.com/search/docs/crawling-indexing/...

They provide ways to verify Googlebot IPs specifically; anyone who cares to check wouldn't be fooled by running a fake Googlebot on Google's cloud.

Likewise with Bingbot: https://www.bing.com/webmasters/help/how-to-verify-bingbot-3...


yes, where "cares" means "the lost revenue is greater than the cost of development, QA, and computational/network/storage overhead, and the impact of increased complexity, of a function that figures out whether people are faking their user agent."

It's probably orders of magnitude greater than the revenue loss from the tiny minority of people doing such things, especially given not everyone who uses tools like these will become a subscriber if blocked, so that cuts the "lost" revenue down even further.


Even if it's not worth an actual site operator's time to implement such a system themselves, WAFs like Cloudflare could easily check the IP address of clients claiming to be Googlebot/Bingbot and send them to CAPTCHA hell on the site's behalf if they're lying. That's pretty low-hanging fruit for a WAF; I would be surprised if they don't do that.

edit: Indeed, I just tried curling cloudflare.com with Googlebot's user agent and they immediately gave me the finger (403) on the very first request.


I sincerely hope the antitrust suit ends this practice soon. This is so obviously anticompetitive.


How?

I also think the antitrust suit (and many more) need to happen for more obvious things like buying out competitors. However, how does publishing a list of valid IPs for their web crawlers constitute anticompetitive behavior? Anyone can publish a similar list, and any company can choose to reference those lists.


It allows Google to access data that is denied to competitors. It’s a clear example of Google using its market power to suppress competition.


Hmm, the robots.txt, IP blocking, and user agent blocking are all policies chosen by the web server hosting the data. If web admins choose to block Google competitors, I'm not sure that's on Google. Can you clarify?


A nice example is the recent Reddit-Google deal, which gives Google's crawler exclusive access to Reddit's data. This just serves to give Google a competitive advantage over other search engines.


Well yes, the Reddit-Google deal might be found to violate antitrust. Probably will, because it is so blatantly anticompetitive. But if a publication decides to give special access to search engines so they can enforce their paywall but still be findable by search, I don't think the regulators would worry about that, provided that there's a way for competing search engines to get the same access.


Which is it? Regulators shouldn’t worry, or we need regulations to ensure equal access to the market?


regulators wouldn't worry if all search engines had equal access, even if you didn't because you're not a search engine


And if I had wheels, I would be a car. There's no equal access without regulation.


Nope. That deal was for AI not search.


This is false, the deal cuts all other search engines off from accessing Reddit. Go to Bing and search for "news site:reddit.com" and filter results by date from the past week - 0 results.

https://www.404media.co/google-is-the-only-search-engine-tha...


Antitrust kicks in exactly in cases like this: using your moat in one market (search) to win another market (AI)


What do you think search is


It's not anticompetitive behavior by Google for a website to restrict their content.

Whether by IP, user account, user agent, whatever


It kind of is. If Google divested search and the new company provided utility style access to that data feed, I would agree with you. Webmasters allow a limited number of crawlers based on who had market share in a specific window of time, which serves to lock in the dominance of a small number of competitors.

It may not be the kind of explicit anticompetitive behavior we normally see, but it needs to be regulated on the same grounds.


Google's action is to declare its identity.

The website operator can do with that identity as they wish.

They could block it, accept it, accept it but only on Tuesday afternoon.

---

"Anticompetitive" would be some action by Google to suppress competitors. Offering identification is not that.


Regardless of whether Google has broken the law, the arrangement is clearly anticompetitive. It is not dissimilar to owning the telephone or power wires 100 years ago. Building operators were not willing to install redundant connections for the same service for each operator, and webmasters are not willing to allow unlimited numbers of crawlers on their sites. If we continue to believe in competitive and robust markets, we can't allow a monopolistic corporation to act as a private regulator of a key service that powers the modern economy.

The law may need more time to catch up, but search indexing will eventually be made a utility.


Google is paying the website to restrict their content.


In a specific case (Reddit) yes.

And that has an argument.

But in the general case no.


Which sites that allow Google to index their content block Bing and other search engines (not other bots just scraping for other purposes)?


If you can prove a deal made by Google and the website then you may have a case. Otherwise it is difficult to prove anything.


It's clearly meant to starve out competitors. Why else would they want website operators to definitively "know" if it's a Googlebot IP, other than so that they can differentiate it and treat it differently?

It's all under the guise of feel-good language like "make sure it's Google, and not some abusive scraper". But the end result is pretty clear. Just because they have a parallel construction of a valid reason for doing something doesn't mean they don't enjoy the convenient benefits it brings.


If this is all it's doing then you could also just use this extension: https://requestly.com/

Create a rule to replace user agent with "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

I just tried it and seems to work.


I tried "User Agent Switcher" since it doesn't require a login. Washingtonpost.com blocked me, and NYT won't load article contents.


It used to be a good extension. Now it is crapware tied to web services. I don't want any web services, I don't want to see ads for paid features; I want a free extension that works entirely locally and doesn't phone home.

This piece of crap is, unfortunately, unfit.


I use this extension which has a decent UI: https://webextension.org/listing/useragent-switcher.html



It says it blocks ads and other things, too. I imagine the use case is someone wanting this for multiple devices/people so they don't have to set up an extension on every platform/device individually. I have no idea how effective it is.


I always do DevTools -> Network Conditions to set UA, at least in Chrome.


Personally I find it nice for sending articles to friends.


You can make a search-related function in FF by right-clicking on that box and choosing 'Add a keyword for this search': https://i.imgur.com/AkMxqIj.png

and then in your browser just type the letter you assigned it to: for example, I have 'i' == the search box for IMDB, so I type 'i [movie]' in my URL bar and it brings up the IMDB search for that movie. https://i.imgur.com/dXdwsbA.png

So you can just assign 'a' to that search box and type 'a [URL]' in your address bar and it will submit it to your little thing.


That would mean that your self-hosted install is exposed to the internet. I don't think I want to run a publicly accessible global relay.


Eh, pretty minimal risk unless you use a guessable hostname and/or the URL gets published somewhere.

If the install is under "eixk3.somedomain.com/ladderforfriends" and it sits behind a reverse proxy, it might as well be invisible to the internet, unless your DNS provider is an idiot and allows zone transfers, or you are on a network where someone is snarfing up DNS requests and distributing that info to third parties. If you additionally use Encrypted Client Hello on top of TLS 1.3, even someone sniffing traffic from one of your friends won't be able to glean anything useful, because the requested hostname is never sent in plaintext (plain TLS 1.3 still sends the SNI in the clear).

Rotate the hostname if/when it becomes necessary...


Your certificate will however show up in public certificate transparency lists.

You could mitigate that with a wildcard cert, but still..


As soon as you get a cert for that domain, you will start getting requests to it because of certificate transparency reports. Everyone will know immediately the site exists.


Those are all very plausible


It seems to me that google should not allow a site to serve different content to their bot than they serve to their users. If the content is unavailable to me, it should not be in the search results.

It obviously doesn't seem that way to Google, or to the sites providing the content.

They are doing what works for them without ethical constraints (Google definitely, many content providers, e.g. NYT). Is it fair game to do what works for you (e.g. 13ft)?!


> It seems to me that google should not allow a site to serve different content to their bot than they serve to their users.

That would be the fair thing to do and was Google's policy for many years, and still is for all I know. But modern Google stopped caring about fairness and similar concerns many years ago.


The policy was that if a user lands on the page from the Google search results page, then they should be shown the full content, same as Googlebot (“First Click Free”). But that policy was abandoned in 2017:

https://www.theguardian.com/technology/2017/oct/02/google-to...


This is called cloaking[0] and has been against Google's policies for many years. But they don't care.

[0] https://en.wikipedia.org/wiki/Cloaking


"organizing the world's information"


I disagree, I think the current approach actually makes for a better and more open web long term. The fact is that either you pay for content, or the content has to pay for itself (which means it's either sponsored content or full of ads). Real journalism costs money, there's no way around that. So we're left with a few options:

Option a) NYT and other news sites makes their news open to everyone without paywall. To finance itself it will become full of ads and disgusting.

Option b) NYT and other news sites become fully walled gardens, letting no-one in (including Google bots). It won't be indexed by Google and search sites, we won't be able to find its context freely. It's like a discord site or facebook groups: there's a treasure trove of information out there, but you won't be able to find it when you need it.

Option c): NYT and other news sites let Google and search sites index their content, but asks the user to pay to access it.


> Real journalism costs money, there's no way around that

I agree, but publications should allow paying X to read a single article, where X is much, much lower than the usual amount Y they charge for a subscription. Example: X could be 0.10 USD, while Y is usually around 5-20 USD.

And in this day and age there are ways to make this kind of micropayment work, for example Lightning. An example of a website built around this idea: http://stacker.news


PS: I actually meant to post yalls, not stackerNews, see: https://yalls.org/


I don't think this will work reliably as other commenters pointed out. A better solution could be to pass the URL through an archiver, such as archive.today:

https://archive.is/20240719082825/https://www.nytimes.com/20...


How do they do it?


Missed opportunity to call it 2ft, as in standing on one's own.


I kind of like the implied concept that self-hosting is the last-foot problem.


...or 11ft8, which can open anything


  https://en.wikipedia.org/wiki/11foot8


I like how google shows "How tall is the 11-foot-8 bridge?" "12 feet 4 inches"

(because maintenance work was done on it, but retaining its name)


666ft....


has 12ft.io even been working anymore? I feel like the only reliable way now is archive.is


I just had anecdotal success with it last week and the atlantic, but before that it has been very hit and miss.


I'm using the [Bypass Paywalls](https://github.com/iamadamdev/bypass-paywalls-chrome/blob/ma...) extension but it looks like that's been DMCA-ed in the interim.



Which directs you to https://gitflic.ru/project/magnolia1234/bpc_uploads which is not ideal...


I agree. Though there is a counterpoint: a Russian host isn't going to respect a DMCA request. On the flip side, it's a Russian replacement for GitHub, possibly based on Gogs, Gitea, or even Forgejo. So yeah, YMMV.


Very rarely works.


Yeah I also have been using archive.today as well, since 12ft hasn't worked on NYT in forever


Nice effort, but after one successful NYT session, it fails and treats the access as though it were an end user. But don't take my word for it: try it. One access succeeds. Two or more... fail.

The reason is the staff at the NYT appear to be very well versed in the technical tricks people use to gain access.


They probably asynchronously verify that the IP address actually belongs to googlebot, then ban the IP when it fails.

Synchronously verifying it would probably be too slow.

You can verify googlebot authenticity by doing a reverse dns lookup, then checking that reverse dns name resolves correctly to the expected IP address[0].

[0]: https://developers.google.com/search/docs/crawling-indexing/...
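The verification scheme Google documents (forward-confirmed reverse DNS) can be sketched like this. The function names are mine, and this is an illustration of the check, not anyone's production code:

```python
# Forward-confirmed reverse DNS check for Googlebot:
# 1. reverse-resolve the client IP to a hostname,
# 2. require the hostname to sit under a Google-owned domain,
# 3. forward-resolve that hostname and confirm it maps back to the IP.
import socket

GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")

def hostname_is_google(hostname: str) -> bool:
    """Check that the PTR name is under a Google-owned domain."""
    return hostname.rstrip(".").endswith(GOOGLE_SUFFIXES)

def verify_googlebot(ip: str) -> bool:
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)            # reverse lookup
        if not hostname_is_google(hostname):
            return False
        forward_ips = socket.gethostbyname_ex(hostname)[2]   # forward lookup
        return ip in forward_ips
    except OSError:  # covers herror/gaierror (no PTR record, etc.)
        return False
```

The suffix check alone is not enough: `crawl.googlebot.com.attacker.example` must fail, which is why the suffixes start with a dot and the forward lookup confirms the IP.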


> Synchronously verifying it, would probably be too slow.

Why would it be slow? There is a JSON document that lists all IP ranges on the same page you linked to:

https://developers.google.com/static/search/apis/ipranges/go...
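Checking against that list is indeed cheap once the ranges are in memory; no per-request network round trip is needed. A stdlib sketch, where the two prefixes below are an illustrative subset of the published list (which changes over time):

```python
# Whitelist approach: load the published Googlebot CIDR ranges once,
# then test each client IP locally with the ipaddress module.
import ipaddress

GOOGLEBOT_RANGES = [
    ipaddress.ip_network("66.249.64.0/27"),   # illustrative entries only;
    ipaddress.ip_network("66.249.66.0/27"),   # the real list is googlebot.json
]

def ip_is_googlebot(ip: str) -> bool:
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in GOOGLEBOT_RANGES)
```

In practice you would refresh the ranges from the JSON file periodically, which is exactly the "whitelist updated asynchronously" point made below.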


Sure, that's one option, and I don't have any insight into what NYT actually does with regard to its handling of Googlebot traffic.

But if I were implementing filtering, I might prefer a solution that doesn't require keeping a whitelist up to date.


The whitelist can be updated asynchronously


Maybe we could use GCP infra and the trick will work better ?


Which leads to the possibility of triggering a self-inflicted DoS. I am behind a CGNAT right now. You reckon that if I set myself to Googlebot and loaded NYT, they'd ban the entire o2 mobile network in Germany? (or possibly shared infrastructure with Deutsche Telekom - not sure)

Not to mention the possibility of just filling up the banned IP table.


Hypothetically if they were doing that, they’d only be ‘banning’ that mobile network in the ‘paywall-relaxing-for-Googlebot’ code - not banning the IP traffic or serving a 403 or anything. They ordinarily throw paywalls at those users anyway.


There are easily installable databases of IP block info, super easy to do it synchronously, especially if it’s stored in memory. I run a small group of servers that each have to do it thousands of times per second.


Some sites do work, but others such as WSJ just give a blank page. Worse, Economist actively blocks this through Cloudflare.


We need a P2P swarm for content. Just like Bittorrent back in the day. Pop in your favorite news article (or paraphrase it yourself), and everyone gets it.

With recommender systems, attention graph modeling, etc. it'd probably be a perfect information ingestion and curation engine. And nobody else could modify the algorithm on my behalf.


I get the feeling there is some sort of attack vector behind this idea, but I'm not well versed enough to figure it out.


I think you have something here.

I don’t know about paraphrased versions but it would need to handle content revisions by the publisher somehow.


"The reason is the staff at the NYT appear to be very well versed in the technical tricks people use to gain access."

It appears anyone can read any new NYT article in the Internet Archive. I use a text-only browser. I am not using Javascript. I do not send a User-Agent header. Don't take my word for it. Here is an example:

https://web.archive.org/web/20240603124838if_/https://www.ny...

If I am not mistaken the NYT recently had their entire private Github repository, a very large one, made public without their consent. This despite the staff at the NYT being "well-versed" in whatever it is the HN commenter thinks they are well versed in.


https://securityboulevard.com/2024/08/the-secrets-of-the-new...

Because I had to learn more: this sounds like a pretty bad breach. But I'm still pretty impressed by NYT's technical staff for the most part, for the things they do accomplish, like the interactive web design of some very complicated data visualizations.


Wasn't the creator/maintainer of svelte on the NYT data visualisation/web technology team?


Rich Harris, creator of Svelte, worked at the Guardian. Svelte has been adopted at NYT however [0].

You might be thinking of Mike Bostock, the creator of D3.js and ObservableHQ.com, who led data visualization work at NYT for several years [1]. I'm not sure if they have people of that magnitude working for them now.

[0] https://en.wikipedia.org/wiki/Svelte

[1] https://en.wikipedia.org/wiki/Mike_Bostock


Rich did work at the NYT. I thought there was some Mandela effect going on for a second, because you misled me into believing you had actually googled it by providing sources.


Yeah, my bad. I shouldn't have relied solely on the Wikipedia article and my (sketchy) memory. Rich Harris is still listed as a graphics editor on the investigative team at NYT: https://www.nytimes.com/by/rich-harris


Might want to update the Wiki article on Svelte, which strongly implies Rich worked at The Guardian, not NYT. The only source I could quickly find that seems to corroborate what you're saying is a LinkedIn page, but because of its bullshit paywall, there's context missing.


You may be thinking of https://bost.ocks.org/mike/ who worked there for a while, D3 and Observable, among many other things.


I would hope that they’re in IA on purpose. It would be exceptionally lame if NYT didn’t let their articles be archived by the Internet’s best and most definitive archive. It would be scary to me if they had only been allowing IA archiving because they were too stupid to know about it.


> I use a text-only browser.

Which browser is this?




It seems parent is more interested in riddling than informing. Their browser is likely https://en.m.wikipedia.org/wiki/Links_(web_browser)


When it comes to fruitful discussions that leave one with satisfaction and contentment, this ain't it. This is the polar opposite. Cheers


You could have just said the name in fewer words facepalm


Perhaps they're a regular contributor to somewhere that discusses 'the browser they use' and don't want any association with it via HN.


FWIW, if you happen to be based in the U.S., you might find that your local public library provides 3-day NYT subscriptions free of charge, which whilst annoying is probably easier than fighting the paywall. Of course this only applies to the NYT.


In the Netherlands the library provides free access to thousands of newspapers for 5 days after visiting, including The Economist and WSJ, which actually have paywalls that aren't trivial to bypass.

https://www.pressreader.com/


I just checked and the Berlin public library offers press reader as well. Will need to check that out.

Thanks for the tip!


How do you bypass the paywall with it? I can only read the “newspaper version” on their site it seems.

Edit: Just read your comment again. I assume that’s exactly what you meant.


PressReader allows reading various newspapers (in their newspaper form), the short sub to NYT I mentioned is a bit different and gives you access to the online version.

Example for LA libraries: https://www.lapl.org/new-york-times-digital

Your local library might have a similar offering specifically for the NYT.


I tried to access my own website and it says internal server error. I also tried to access YouTube and it said the same.


This is awesome! People who use Kagi can also set up a regex redirect to automatically use this for problematic sites.


I continue my search for a pay wall remover that will work with The Information. I'm honestly impressed that I've never been able to read an Information article in full.


They probably never make their full articles available without a login. Google Search can probably index them well enough with just the excerpt.

Simple as that.


There's definitely a hit to search traffic if you go this route, as Google is unlikely to rank you above a competing article for a competitive keyword based on only an excerpt. The Information simply doesn't care.

They have an expensive subscription ($400/year) that I'd guess targets VCs and tech execs, which is a very specific segment that's best reached via means other than Google search, anyway.

But yes, to your point, successfully paywalling a media site in a way that's impossible to bypass is trivially easy to do. Most media sites just don't think it's worth it.


Why not just use uBlock Origin for the aspect of cleaning up the popups / ads and such?


Or noscript browser plugin if you want to completely stop js

https://noscript.net/


uBO is fine for clearing up crud after you've accessed an article.

It's not useful for piercing paywalls / registration walls in the first place.

It's also not universally available, particularly on mobile devices and/or Google Chrome.


> It's also not universally available, particularly on mobile devices and/or Google Chrome.

FYI it is available on Firefox for Android.


Yep, and I use it there...

... though my preferred Android browser these days is Einkbro, optimised for an e-ink device as the name suggests, that being what I drive.

(Firefox on e-ink is ... beyond annoying.)

The browser itself has pretty good ad-blocking capabilities, but I'd rather much like to be able to block specific elements as well.


I would run headless chrome to fetch website and block all fetch requests from ublock origin list. This would give you a "remote" adblocker that works anywhere.

... but at that point just install pihole.


Because I was stupid and bought an iPhone which means that is not possible.


AdGuard works on ios, so does 1blocker. They’re no uBlock, but they do the trick.


FYI: the iOS Orion browser supports ublock origin


How is its addon support now though, in general? I stopped using it last year since it was pretty slow on iOS but more so because its addon support was very wonky.


Do iPhones not have Firefox now?


Quoting _khhm who posted here: https://news.ycombinator.com/item?id=25850091

> On iOS there are no web browsers other than Safari, per the app store rules. "Chrome" / "Firefox" / etc on iOS are just basically skins on top of Webkit.

> See 2.5.6 here - https://developer.apple.com/app-store/review/guidelines/

> This is why you don't get any of the features / extensions / etc of Chrome or Firefox on iOS.


With iOS 17.4 they allow other rendering engines for users in the EU.

https://www.apple.com/newsroom/2024/01/apple-announces-chang...


> per the app store rules

Jeez.


I found this when looking for fun self hosted apps. It's pretty bare bones but does seem to work well with articles I've found so far.


12ft.io doesn't really work anymore.

If you're on iOS + Safari I recommend the "open in internet archive" shortcut, which is actually able to bypass most paywalls.

https://www.reddit.com/r/shortcuts/comments/12fbk8m/ive_crea...


I’m more inclined to use archive(.org|.ph). But this is a decent workaround when archive is unavailable.

Side note: paywalls are annoying but most publications are often available for free via public library.

For example, NYT is free via my public library. PL offers 3-day subs. A few other decent publications are available as well. Availability of publications is YMMV as well.


> most publications are often available for free via public library.

Via public library in the USA. Other countries exist and as far as I’ve gathered aren’t typically given this level of free access.


Works in France too, as probably in some other European countries. This is not widely advertised, though.


> Works in France too

Could you provide some links? How can one access, say, The New York Times, or The New Yorker, or the Wallpaper magazine with a French library card?


You can request a NYTimes 72-hour pass anytime from your BnF account.

https://www.bnf.fr/fr/ressources-electroniques-de-presse

It is obviously clumsy on purpose in the sense that if you want to access the NYT on a regular basis, you need to go through the procedure again once the 72h pass expires. If you are a regular reader it might be worth paying for the membership.


Hence “ymmv” (your mileage may vary) ;)


Ironically, this phrase originated in a US context, and the abbreviation is mostly used in American English slang [1].

Please don't take it as an attack or even criticism; I just found it a funny observation, which might be wrong.

[1] https://en.wiktionary.org/wiki/your_mileage_may_vary


NYT's onion site is also free.


Onion site? As in, there’s a mirror on tor?

(Edit) TIL: https://open.nytimes.com/https-open-nytimes-com-the-new-york...


This could be used as a proxy to web interfaces on the same local network couldn't it?

There are probably much better and more secure options, but this might be an interesting temporary kludge.


> Port 5000 is in use by another program. Either identify and stop that program, or start the server with a different port.

An instruction on how to specify the port would be nice.


The docker-compose.yaml file is where you specify the ports you want to expose. It looks like by default it's 5000:5000 (5000 outside and inside the container). You will need to change it and then run docker-compose up -d.

You can change it to something like 5133:5000 and access the instance through localhost:5133


Thank you for the tip! I ended up editing the port parameter in the app.run() call within portable.py and it worked. Felt like it might be a good idea to add this as a runtime argument for easier customization.
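That runtime argument could look something like the following; the PORT variable and --port flag are my own naming, not anything from the 13ft repo:

```python
# Resolve the listen port from a --port flag, then a PORT env var,
# then a hardcoded default, in that order of precedence.
import argparse
import os

def resolve_port(argv=None, default=5000):
    parser = argparse.ArgumentParser()
    parser.add_argument("--port", type=int,
                        default=int(os.environ.get("PORT", default)))
    return parser.parse_args(argv).port

# In the app this would feed: app.run(host="0.0.0.0", port=resolve_port())
```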


I am not familiar with 12ft.io, I wanted to try it out, but I get "Internal Server Error" when trying to visit a website.


Bypass Paywalls Clean has moved here btw, https://github.com/bpc-clone?tab=repositories


Yes, but it may be taken down via DMCA soon. See this DMCA request: https://github.com/github/dmca/blob/master/2024/08/2024-08-0...

It mentions bpc_updates in the takedown request....


It moved to a Russian equivalent of GitHub or GitLab (because of where it's hosted, YMMV): https://gitflic.ru/project/magnolia1234/bpc_uploads


That's not the URL posted on magnolia1234's Twitter, but it may be a mirror. Caveat emptor.

I would watch here in case the primary goes down. https://x.com/Magnolia1234B


You're wrong, it's actually the exact URL posted on twitter:

https://x.com/Magnolia1234B/status/1823073077867287028


Yep it's on their Twitter account, and linked in multiple places on their GitHub repos....


BPC also has the option to spoof the User-agent as well when using the "custom sites" option:

• set useragent to Googlebot, Bingbot, Facebookbot or custom

• set referer (to Facebook, Google, Twitter or custom; ignored when Googlebot is set)


I'm surprised this works at all.

What sysadmin is so naive as to rely on the Googlebot user agent?

Doesn't everyone fetch the official list of Googlebot IPs and then add those to a whitelist?


Does it help, when pretending to be Googlebot, to run on an IP from inside Google Cloud?



It's always seemed easier to me to use FF + uBlock + Bypass Paywalls. Never fails.


>> This is a simple self hosted server that has a simple but powerful interface to block ads, paywalls, and other nonsense. Specially for sites like medium, new york times which have paid articles that you normally cannot read. Now I do want you to support the creators you benefit from but if you just wanna see one single article and move on with your day then this might be helpful

Personally I'm not a fan of this attitude. I've read and digested the arguments for it, but, for me, it runs close to "theft".

For example, read the sentence again, but in the context of a restaurant. Sure I wanna support the creators, but what if I just want a single meal and then get on with my day?

Businesses, including news web sites, need to monetize their content. There are a variety of ways they do that.

You are free to consume their content or not. You either accept their monetization method as desirable or you do not.

The "I just want to read one article" argument doesn't fly. If the article is so compelling, then follow their rules for accessing it.

Yes, some sites behave badly. So stop visiting them. There is lots of free content on the web that is well presented and lacks corporate malfeasance. Read some of that instead.

I get that I'm gonna get downvoted to oblivion with this post. HN readers are in love with ad blockers and paywall bypasses. But just because you can do something, just because you think it should be "free, no ads", does not make it right.

Creators create. They get to choose how the world sees their creation. Support it, don't support it, that's up to you. Deciding to just take it anyway, on your terms (however righteous you feel you are) is not ok.


This uses a couple of classic fallacies:

1. Stop comparing digital products (data) to physical products (food).

2. Don't use the word "take", nothing is taken, only copied.

3. "They get to choose how the world sees their creation" Not necessarily. This is a pretty big assumption that lies at the heart of the conflict between the rights of the author and the rights of the public.


> 1. Stop comparing digital products (data) to physical products (food).

That's not a fallacy. Perhaps the way they are compared is wrong, but they can be compared. If you want an example: e-books vs books, mp3s vs cds, Netflix vs DVDs, online banking vs your local branch office.

> 2. Don't use the word "take", nothing is taken, only copied.

You do take it, except when you intend "take" to mean "remove from a place." You can take a nap, you can take a breath, etc.

> 3. ... This is a pretty big assumption

Copyright owners do have the right to restrict access, legally and morally, although the latter is IMO, of course.


I agree with you. Where the gap lies for me is that I can’t just buy one meal, I have to sign up for the yearly meal plan even if I just want one meal.

A few years ago I tracked how many times I visited some of the various paywalls sites for their articles, and typically it was between 5-10 times per year. One was 30, so I paid for a subscription to them, but I can’t justify several dozens of dollars for 5 articles on many other sites. If I can’t access their content because a bypass doesn’t work then so be it, however I wasn’t willing to pay for that content either. I feel like it’s the classic misconception regarding piracy by the movie industry - I wasn’t willing to pay money for it in the first place, so it’s not lost revenue (unfortunately).

I was actually discussing this overall problem with my wife the other day, and I came to the conclusion that I basically want a “Netflix” but for news - I subscribe to one place and I get access to a whole range of publications. That’s worth it to me. I very much don’t see it happening though, sadly.


I'll counter your one meal vs subscription analogy with another;

I don't want to buy this Ferrari, I just want to drive it for the day. The dealership wasn't interested (they directed me to a different business with a different business model.)

Yes you want a Netflix for news. But even Netflix isn't enough. You also need Amazon, Disney+, Apple TV and so on.

Indeed, all of them are only partially aggregating - much of their content (if not all of it) is in-house production.

Yes, micropayments per article would be nice, but endless startups have proved this approach works for neither suppliers nor consumers.

There's no place to rent a Ferrari in my town. That doesn't make it ok to just "borrow" the dealer's one.

The world does not have to magically conform to our desires, our ethics. Sometimes we can't get what we want, and just taking it anyway is not ok.


Supercar hire businesses do exist though, and I can certainly rent one for a day in many places all over the world.

Regarding Netflix - I’m referring to OG Netflix which really did seem to aggregate everything under one subscription.

In any case, I do agree that micro transactions for articles mostly do fail, hence my leaning towards a more “Netflix”-style approach that lowers the risk for consumers. I don’t expect to get what I want here, but publishers also can’t simply get what they want either.


Yes super-car rentals exist, but only in a small number of locations. My point was that not having one conveniently available doesn't make alternative approaches ok.

An aggregator like the original Netflix would be nice but I suspect that model would not work for long. (As evidenced by current Netflix et al).

Publishers can certainly do anything they like with their content, and they set the rules for accessing it.

Assuming what they want is piles of money, I expect they take that into account when setting the rules.

But it's their content. You don't get to break the rules just because you don't like them.


Why doesn't someone get to break the rules just because they don't like them? This principle isn't backed up by empirical observations.


Really someone just needs to spotify this shit. Scrape everything and hand out pennies based on who attracted what eyeballs.


The problem is that this might end with a bad incentive structure that I think would lead to bad quality: it would push you to write the kind of articles that everyone wants to click on/pay for. So mostly clickbait. Emotional content instead of factual content. It's unlikely that this could finance the long-term, high-quality investigative journalism that actually defines high-quality journals.


Newspapers have been dealing with this issue since the nineteenth century. I don't know how things work where you are in the world, but in the UK and Australia, newspapers are separated into broadsheets which have better-quality journalism and tabloids which are clickbait nonsense. (The terms come from the paper sizes they used to be printed on.) In the UK, the tabloids are further divided by the colours of their mastheads: the black-tops are less sensational; the red-tops are more sensational.


Seems like that's how the market is currently structured. Most of those paywalled sites are already dishing out endless streams of emotive bs.

What I want is the ability to not pay for a subscription that includes the emotive bs, but to read only the content I want.


It is the opinion of every musical artist that even extremely popular artists on Spotify make barely enough to buy a few boxes of graham crackers.


Yes. My understanding is that Spotify is reliant on a system used for radio broadcasts. Paying pennies to broadcast songs was already the structure. Artists make their money from merch and live gigs as it always has been. But now they can also target me for advertising based on the bands I listen to, to sell me merch and live gigs.

Back to newspapers. Traditionally, if a newspaper article takes my fancy, I pay a small fee to access that paper rather than establish a long-term subscription to just that paper. A Spotify-type service could easily hand out credits in the same way.


> They get to choose how the world sees their creation

Then they should not publicize it. They could license only to Google, but Google isn't interested. Instead, publishers need to publicize, which is... expected? Once they publicize, they can't claim the public is not allowed to read. It's like sharing a printed newspaper with my friends. Publishers shouldn't be able to prevent it.


I'm not sure I buy this argument. Lots of things are publicized and then need to be paid for.

Like movies. Or music. Or books etc.


They choose to make the content available to parties that I can impersonate. I respect their decision by impersonating them.


You could likely swindle many physical stores out of wares by social engineering too, but making material gains by deception is known as "fraud".


Cool. I could also build a rocket and go to the moon, but that's not really the topic here.


Yes. It's funny how people will claim they only block ads because they allegedly want to pay for good content but cannot, or claim that piracy is just a service problem. Yet when asked to put their money where their mouth is, they instead just continue to openly try to get stuff for free. It's pure entitlement and disrespect for others' labor.

As for the "I can; therefore I will" justifications: I can steal from my local corner shop. It's very unlikely they'd catch me. Yet I do not.


Now if someone could just package this into a browser extension it would be great!


The next step being 11ft 8 inches.[1]

[1] http://11foot8.com/


Reminds me of Montague Street Bridge. [1]

[1] https://howmanydayssincemontaguestreetbridgehasbeenhit.com/


14ft. Similar to 13ft but implemented in Rust.


Or 4.2672m, for the rest of the world. :)


> The next step

Should that read "Previous"? :)


I'll gladly pay for journalistic content, but not when reading a single article requires a $15/mo subscription that's hard to cancel.

Is there some way to support journalism across publications?


I just came from chicagotribune.com where they tried to entice me with a Flash Sale of one year’s access for a total of $1. Sounds great, but I took advantage of it a year or so back and regretted it due to how annoying they were with advertisements, newsletters, etc…. It’s pretty amazing that the tactics can be so annoying that they can make me regret a $1 purchase.


That's the topic of an essay I'd written a couple of years ago and just discussed on HN last week:

"Why won't (some) people pay for the news?"

<https://diaspora.glasswings.com/posts/867c94d0ba87013aca4144...>

<https://news.ycombinator.com/item?id=41249571>

Upshot: Numerous legitimate reasons. I'd very much like to see universal syndication / superbundling on a distributed progressive tax and/or ISP tollkeeper basis, with some additional details for the fiddly bits.

As for subscribing to content submitted to general discussion sites, such as HN or others:

As of 21 June 2023, there were 52,642 distinct sites submitted to the HN front page.

Counting those with 100 or more appearances, that falls to 149.

Doing a manual classification of news sites, there are 146.

Even at a modest annual subscription rate of $50/year ($1/week per source), that's a $7,300 subscription budget just to be able to discuss what's appearing on Hacker News from mainstream news sources.

Oh, and if you want per-article access at, say, $0.50 per article, that's $5,475 to read a year's worth of HN front-page submissions (10,950 articles/year), and that is just based on what is captured on the archive. In practice far more articles will appear, if only briefly, on the front page each day.

Which is among the reasons I find the "just subscribe" argument untenable. Some sort of bundling payment arrangement is required.

<https://news.ycombinator.com/item?id=36832354>

(Source: own work based on scraping the HN front-page archive from initiation through June 2023.)
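
For reference, the figures quoted above reduce to simple arithmetic:

```python
# Back-of-envelope check of the figures in the comment above.
news_sources = 146          # HN front-page news sites with 100+ appearances
sub_cost = 50               # $/year per source
articles_per_year = 10_950  # ~30 front-page articles/day * 365
per_article = 0.50          # $ per article

subs_budget = news_sources * sub_cost        # 7300
article_budget = articles_per_year * per_article  # 5475.0

print(f"Subscriptions: ${subs_budget:,}/yr; per-article: ${article_budget:,.0f}/yr")
```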


https://www.pressreader.com has 7000 publications for $30 a month.

I get it free by visiting my library once a week.


Oddly enough, we created this thing called the Internet and World Wide Web on the premise that data comes to people and not the other way 'round.


>ISP tollkeeper

ISPs are terrifying organisations. I wouldn't wish this system on the world.

Everything else you said makes sense.


I'm not suggesting this out of any sense that they are good-faith participants in society or even commercial space.

They are however the one point of authenticated and payment-based contact between all Internet users and the greater cloud.

So if there's going to be some form of payment, it's highly likely to come either from the ISP or some collection of government-imposed fees or taxes. The latter could come at multiple levels, e.g., a city, county, state, or federal levy.


>They are however the one point of authenticated and payment-based contact between all Internet users and the greater cloud.

Yeah but we need to diminish their contact with authentication and payments. They have little to no understanding of security.

Like you might be thinking "Oh just have Comcast do this" and they could probably cobble something together with only a massive data breach once every 5 years.

But think about all the guys in the sticks dealing with Joe Blogs Wisporium or Texas Petes flying internoobs. These solo operators often can't work out telephony or iptv. Having them middleman payments for an extra tax or service is a huge extra risk. I am not even super comfortable with them holding PII or credit cards as it stands.


>ISPs are terrifying organisations

Could you explain why?


Mostly because they see themselves as invisible, have basically no security posture, and are often either operated by the owner, or by a bunch of foreign tech support goons who have unlimited access to your data.


At least in the US, most:

1. Are monopolies.

2. Don't care if you live or die.

3. Have a long sordid history of despicable practices, including price gouging, rampant upselling, shoddy customer service, manipulating search and Web results, selling customer browsing history or patterns, and worse.

4. Are typically at or near the top of lists of companies / industry sectors that the public trusts least and hates most.

They have enormous power over individuals, collect absolute gobs of data, and act capriciously, carelessly, and overwhelmingly against their subscribers' personal interests with it.


This attitude is why journalism is dying. There is value in an undissected payment to the publisher that gives them revenue surety and lets them fund a variety of reporting, even if you don't personally find it interesting or relevant. This is exactly how journalism and publishing worked with the "rivers of gold" from classifieds/advertising during the golden age (also: this is exactly how Bell Labs, Google 20% time, etc. were/are funded. The incessant need to pay for only the small portion you directly consume/find interesting kills this sort of thing).


Interesting thoughts. I can’t refute or support your assertions about the cause of journalism’s demise offhand, but I actually am very curious whether a publication could find success in a model where one could pay a nominal fee to access a given article (in addition to the normal subscription setup).

I don’t pay for NYT. I don’t want to, because of the cancellation stories I see repeated.

If I could pay $1 here and there to access an article, though? I’d do that.

And NYT would get some money from me they aren’t getting now.

Seems, maybe, worth it. I don’t know.


I see the point you're making, but I'm not sure it's a fair assessment that my attitude is why journalism is dying. I'd almost go so far as to say we're making the same point.

See, back in the "good olde days", I could subscribe to 1 or 2 sources of news (in my case the local paper and the big city paper) and get something like 80-90% of my general news. I guess largely through the magic of syndication. When someone shared a story with me, it was physically clipped, no paywall. And I get the impression that advertising was fairly effective.

The attitude that is killing journalism is, IMHO, the publishers' attitude that the world still operates the same way it did 40 years ago: buy a subscription to us, and advertisements work.

One of the big reasons I don't subscribe to, say, the NYT is that in a given month there are only a few articles there that I end up reading. There are maybe 5-7 sources like that, and, when I'm honest with myself, my life isn't enriched enough to justify $100/mo subscribing to them. And advertisements just don't seem to work in today's world.


For example: I do pay for The Guardian because they:

- Don't paywall their articles.
- They pop up a reminder that I've looked at X articles.
- I can pay what I want (I probably average paying them around $1 or $2 per article I read).


While this is generally true for legacy publications (impossible to cancel!), I mostly enjoy paying for niche-topic newsletters from a single source. A great example is a former newspaper journalist who was laid off and now produces his own newsletter focused on a single college football team. He probably makes more now than when he was a newspaper employee. I am a happy subscriber. I pay for a handful of these. I also subscribe to newsletters like "Morning Brew," which, while free and ad-supported, is well done.


Wish you could just have the option to pay 50 cents to unlock a single article.


> Is there some way to support journalism across publications?

Apple News+, Inkl, PressReader, (maybe more). Others if you want more magazine-focused subscriptions.


I liked Apple News for a bit, but the more I used it, the more it felt like an algorithmic echo chamber like all other social media.


And of course for Apple, Europe still doesn’t exist, because they don’t even sell their service here in Italy :/


Is blendle an option for you?


A decade or two ago, there were talks in several countries about creating a global licensing agreement where people would just pay one single tax payment per year and have access to everything without being called pirates.

But media and arts publishers weren't happy with that idea.


Articles should come with license agreements, just like open source software nowadays. Free for personal entertainment, but if you try to make money from the information in the article or otherwise commercialize it, you can fuck right off.


> Free for personal entertainment

Didn't the GP say

> Is there some way to support journalism across publications?

I don't think there's a way to support without paying.


> Free for personal entertainment, but if you try to make money from the information in the article or otherwise commercialize it, you can fuck right off.

Note that such a license would not be considered open source. Open source and free software allow commercialization because they do not permit discrimination against various interest groups. The only restriction open source lets you impose is preventing people from restricting the information, which has some relation to commercialization, but not fully.


It once was Google's requirement that you'd serve the same content to the Google crawler as to any other user. No surprise that Google is full of shit these days.


"Organizing the world's information" or "maximizing revenue"? I don't know - somehow either argument justifies this.


Counterpoint - if you like the content enough to go through this - just pay for it. Monetary support of journalism or content you like is a great way to encourage more of it.


Countercounterpoint - Maybe I have news subscriptions for periodicals I regularly read, but don't feel like paying for a monthly subscription to read one random article from some news outlet I don't regularly read that someone linked on social media or HN.


So back out of the webpage and don't read it. That is a constructive way of letting a content producer know their user experience is not worth the "expense" of consuming their product. But if the content is worth your time and energy to consume, pay the "price" of admission.


I back out of the webpage and go to 12ft.io, which allows me both to read the article and, simultaneously, to use that constructive way of letting the publisher know that their product is not worth its price.


And then 12ft-dot-io throws an error, but still shows its own ad in the bottom right corner! But you probably knew that since you constructively use them.


The three articles I read from the NYT a year are not worth the price of a monthly subscription.

My choices are:

1) Use archive.ph to read the three articles.

2) Never read a NYT article again.

3) Pay for a subscription for the NYT.

I think you need to be approaching this from an exceptionally legalistic perspective to think that anything but Option 1 is reasonable. If I could pay the five cents of value those three articles are worth, I would, but I can't so I won't.

Standing at an empty intersection, I'm not going to start lecturing someone for looking both ways and crossing the street when the traffic light signals "Don't Walk".

I understand that you might feel that journalism is underfunded and that this scofflaw, ne'er-do-well attitude further jeopardizes it. But the reasons newspapers are failing are complex and have less to do with consumer behaviour than with other factors, not least of which are market consolidation and lax antitrust laws. I pay hundreds of dollars a year on newspaper subscriptions and I refuse to believe that I'm the reason any of that is happening.


I guess we are going down a rabbit hole that 12ft-dot-io doesn't specifically address — it doesn't bypass paywalls. Regardless, #2 is an option. And the choice is entirely yours.

I get more peeved at the entitlement many feel to use ad blockers and rail against content producers monetizing their sites, when the choice to not consume the content is an option. Ask me why I gave up twitter a few weeks ago :)


> 12ft-dot-io doesn't specifically address — it doesn't bypass paywalls.

13ft does, I just tested it on https://www.nytimes.com/2024/08/19/us/politics/hillary-clint...

> Regardless, #2 is an option. And the choice is entirely yours.

I can also choose not to read over the shoulder of someone reading an article on the train, or to avert my eyes from the headlines displayed at a newsstand. Somehow, I can't find in me the slavish devotion to media-industry margins required to do so.

> I get more peeved at the entitlement many feel to use ad blockers and rail against content producers monetizing their sites, when the choice to not consume the content is an option.

This is such a confusing opinion, and it's even more baffling to thrust it onto others.

The best thing to do for one's computer safety is to run an ad blocker, as acknowledged by even the FBI[0]. Profiling by ad companies makes our world more insecure and inequitable. I deeply despise selling client data as a business model, as it seems you might as well.

So, your position is that I should lodge my complaint against their unfair dealings by not consuming their website, but that it is also unjust for me to evade tracking and block ads because it hurts a bottom line that is unethical to begin with. This sorta feels like chastising me for walking out of the room while the TV ads run and then deigning to watch the rest of the programme.

[0] https://techcrunch.com/2022/12/22/fbi-ad-blocker/


It’s baffling to me why you would insist on consuming content produced by such dangerous abusers of your security and privacy. And then thrusting your opinion that all content should be free onto all sites monetized by ads is further confusing.


> It’s baffling to me why you would insist on consuming content produced by such dangerous abusers of your security and privacy.

Because I'm not an ascetic monk.

> And then thrusting your opinion that all content should be free onto all sites monetized by ads is further confusing.

I'm not telling you to install an ad blocker. I'm just telling you I am.


> Because I'm not an ascetic monk.

That’s glib. It is possible to discern websites that are safe, respect privacy and are generally pleasing to visit without an ad blocker. If you deem them unsafe, leave, don’t log entirely off the internet.


I’m not saying you are telling me to. I’m pointing out that you are depriving sites of their chosen method of monetization while continuing to consume their content. Effectively “averting your eyes” from their ads, instead of just not visiting the site.

I’m not accusing you of anything. It’s just simply what you are doing. It’s the mental gymnastics these threads are always full of, justifying the wholesale disavowal of all ad-supported content, that are hard to follow.


This assumes their presence has no effect on me. It takes time to click a page and let it load, and more time to dig through all of the results when all of them are unreadable. Maybe if there were a tag like [ungodlyamountofads] on each, it would help. But even then I'd still have to scroll through them.


I guess I fail to see how one can get around how fully voluntary visiting a webpage is. It is how the web works! And how all kinds of "free" media have worked for eons.

I don't mean to excuse incredibly poor user-experience design, and certainly not abusive tactics. But sorry if I have zero empathy for your clicking, loading and scrolling pain. Leave the website! It is amazing how many people are defending a site that claims to "Remove popups, banners, and ads" while: 1 - failing to even work, and: 2 - showing its own ad on the resulting page!


>But sorry if I have zero empathy for your clicking, loading and scrolling pain.

Ok, so we just fundamentally disagree.


No doubt.

While we likely agree there are egregious abusers of both user experience and privacy, I don't believe I have a fundamental right to define how a website is allowed to present their content and/or monetize it. But I do retain the right, which I frequently practice, to leave a webpage and utilize alternate sources in that moment and in the future.


The majority of the internet is your "leave the webpage" example, so by allowing shady ad-tech sites to use these tactics you're just promoting the proliferation of a shittier internet. Being subjective in this case makes no sense to me unless you have skin in the game, so I'll assume you do.

As an exaggerated albeit relevant comparison: this is like saying you don't want police even though there are lots of criminals, because you can always just walk away if things look suspicious. This assumes you have the eye to determine what is suspicious. I was hoping I wouldn't have to worry about crime in the first place.


Absolutely I have skin in the game. Do you never benefit from free content, tools or services that exist only because the opportunity to monetize through advertising is possible?

I display a single banner ad on a website that offers a free business tool, as an example.

I also do the same on a free business tool where I also offer a paid, advanced, ad-free version. If a user sticks around for 30 seconds, which most do (average time on both ad-supported sites is more than six minutes), then the freemium site pops up a message alerting them to the paid option.

No obligations and no restrictions on the free versions.

I don't make significant amounts from ads or subscriptions, but I would have no incentive beyond this to continue to offer these services, which many appear to find valuable and use for commercial purposes.

I frequent many free sites/tools that benefit from my visit, and I benefit from their offering for both business and personal reasons. I understand and agree to the transaction occurring.

Outlandish comparisons like you offer completely miss the mark and dilute the legitimate arguments for the use of ad blockers, which I do believe exist. But I will offer an equally outlandish counterpoint: You prefer a world where over-policing would occur and round up innocent victims with criminals? "Most crimes are committed by males aged 18-25; if we round them all up, we will drastically reduce crime!" Hyperbole, I know. But probably more applicable than your argument for the use of ad blockers.

As I said before, I am not accusing anyone of wrongdoing. Using an adblocker allows for a cleaner, safer internet for the user. No doubt about that. It also, it has to be acknowledged, sweeps the good under the rug with the bad. Period. All-or-nothing enforcement is your proposition. Again, that simply has to be acknowledged. There is no debate there. If you believe that will ultimately lead to a better internet, then that is where we can disagree, as that is entirely subjective.

Marco said it better than me: https://marco.org/2015/09/18/just-doesnt-feel-good


I knew you served ads :)

I'm not saying you're hiding anything; it's just easy to see why you hold this opinion. My example was not outlandish and is relevant, versus the argument you made, which was a purposefully dishonest analogy.

My hope is not to state that ads are evil, as I don't believe that, just to point out that you are a person who serves ads. I also never stated any of the opinions or beliefs you say I did. Have a nice day!


Did you think I would deny or hide that?

Are all ads, and are all sites that serve ads evil?

I know you visit sites that serve ads. And you may even block them, Gasp!


> But if the content is worth your time and energy to consume, pay the "price" of admission.

This assumes that the "time and energy to consume" is equivalent to the "price". What if it is worth the time to install 12ft or whatever, but not worth the price they want to charge?


I mean, sure, if you insist and make site-level negotiations with yourself about the value of the content.

Here’s a simple example for me:

I search Google for how to perform an operation in an Excel spreadsheet. I skip past the obvious ads at the top first. I click on a promising result on a user forum, but first have to click through a popup and then have a banner covering a third of the screen and a small inset screen with a video. That’s too much for me. I stop and go back to Google. I pick another option. And I may remember that forum is not worth the click in the future.

We make decisions like this online and offline every day. The fact is there are many valuable sites and services that are ad supported and done so responsibly. Not all, but many. Ad blockers are a blunt tool. Installing one on grandma’s browser is a constructive use, but not just because “ads are bad.”


^ This describes my experience as well. And there are certain outlets where I'll read an interesting article if someone links it, but don't want to give them money due to my objection with <xyz> editorial practices.


Paying for it doesn’t make the site less miserable to use. One of the stupid things about piracy is that it tends to also be the best available version of the thing. You’re actively worse off having paid for it. (Ads, denial, DRM in general, MBs of irrelevant JS, etc don’t go away with money, but do with piracy)


Case in point: paying for the New York Times doesn’t block ads in their app.


This right here. It would be nice to have some perk like you can read the articles through Gopher.


I agree, but would like for a way to pay for an article, or a single day, week, or month of access. Just like I could buy a single one-off issue of a publication a couple of times before starting a long term relationship with it. Not all publications support this, and some like the NY Times require chatting with a representative to cancel the subscription. I see a lot of talk about physical media around film and music, but not being able to buy single issues of any magazine or newspaper anonymously when the circumstances call for it, is a great loss for public discourse.


I feel like there were companies in the past that did try this, where you would chuck $5 or whatever in an account, and then each page you went to that supported the service would extract a micropayment from the account.

Never took off. It should have. E.g. in Santa Cruz there is https://lookout.co , which is pretty good, but extremely pricey for what it is. There has to be a middle way between "pay and get everything" and "ignore / go straight to 12ft.io".


As of 21 June 2023, there were 52,642 distinct sites submitted to the front page.

Counting those with 100 or more appearances, that falls to 149.

Doing a manual classification of news sites, there are 146.

Even at a modest annual subscription rate of $50/year ($1/week per source), that's a $7,300 subscription budget just to be able to discuss what's appearing on Hacker News from mainstream news sources.

Oh, and if you want per-article access at, say, $0.50 per article, that's $5,475 to read a year's worth of HN front-page submissions (10,950 articles/year), and that is just based on what is captured on the archive. In practice far more articles will appear, if only briefly, on the front page each day.

Which is among the reasons I find the "just subscribe" argument untenable. Some sort of bundling payment arrangement is required.

<https://news.ycombinator.com/item?id=36832354>

My alternative suggestion is global access to content through a universal content syndication tax, or fee assessed through ISPs, on a progressive basis. See:

"Why won't (some) people pay for the news?" (2022)

<https://diaspora.glasswings.com/posts/867c94d0ba87013aca4144...>

Discussed 4 days ago on HN: <https://news.ycombinator.com/item?id=41249571>


The Venn diagram of people who truly can't afford a New York times subscription, and even know what Docker is looks like two circles.


Countries outside of the US exist, some of them with extremely low incomes that nevertheless hold segments of the population that are technically competent enough to not only understand what Docker is, but to use it on a regular basis.


The NYT is from the US, so framing the question this way is not surprising, and the comparison of someone who can't afford the NYT but knows what Docker is, is interesting even without your addition.

There are other things we could mention like, maybe there are many people who can afford NYT but still don't want to pay for it, but that's not what we were talking about. That being said, thanks for the reminder about other countries... I'm sure everyone on HN forgot about globes.


You're also not entitled to free content.

If this were a project to let you watch Netflix or ESPN without a subscription, the discourse around it would be much different.

Is journalism really less valuable than Stranger Things?


I myself don't want free content; I can create my own entertainment without the need for a large media corporation or a keyboard warrior spoon-feeding me.

I don't think comparing journalism to Netflix or ESPN is relevant, since they provide quality entertainment (at least in the minds of their base), versus journalists who stretch two bits of searchable information into a 10-page ad puzzle full of psychological poking.

Yes, most journalism is less valuable than the critically acclaimed fantasy horror Stranger Things. This doesn't mean journalism is less important or that good journalism doesn't exist. Honestly, it's crazy to me that journalism doesn't see more critique. Most of it just sensationalizes, fearmongers, and points fingers.


If you disagree on the value of a product you simply don't need to consume it.

NYT still has some quality journalism. It's not like the Huffington Post or something.


It's not unusual for Americans to live insular lives where the rest of the world doesn't exist in their worldview. The globe snark is unnecessary and frankly not worthy of a HN comment.

And assuming that no one outside of the US could possibly be interested in US-oriented articles in the NYT - not to mention their world news - is just another example of the insular attitude I'm referring to.


Is there a way to pay for journalistic content that doesn't involve participating in the extensive tracking that those websites perform on their visitors?

I love to read the news but I don't love that the news reads me.


> Is there a way to pay for journalistic content that doesn't involve participating in the extensive tracking that those websites perform on their visitors?

Well you could buy physical newspapers/magazines. (Or access content via OTA TV / the library.)


I actually loled when I went to CNN with Little Snitch on; it wanted to connect to over one hundred different third-party domains.


100%. And sometimes that form of payment is putting up with ads, etc. I routinely back out of sites that suddenly take over the screen with a popup or take up large chunks with video or animations. Same as opting not to go in a particular store. But I also stick around and occasionally use products advertised to me. Shocking, I know.


I fully agree with the sentiment! I support and do pay for sources I read frequently.

Sadly, payment models are incompatible with how most people consume content, which is to read a small number of articles from each of a large number of sources.


No. Paywalled content should not be indexed by search engines. The implicit contract I have with the search engine is that it is showing me things that I can see. The publishers and search engines pulled a bait and switch here by whitelisting googlebot. So it's fair game to view the publisher's website with googlebot. That's what the engineers spent their time working on. It would be unfair to let that go to waste.
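For the curious, the trick being discussed (the same one 13ft automates) can be sketched in a few lines. The URL below is a placeholder, and the user-agent string is the one quoted from the repo:

```python
# Sketch: request a page while presenting Googlebot's user-agent string.
# "https://example.com/article" is a placeholder, not a real endpoint.
import urllib.request

GOOGLEBOT_UA = (
    "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 "
    "(compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)

req = urllib.request.Request(
    "https://example.com/article",
    headers={"User-Agent": GOOGLEBOT_UA},
)
print(req.get_header("User-agent"))  # the header the server will see
# To actually fetch: urllib.request.urlopen(req).read()
```

Note this only fools sites that whitelist on the user-agent string alone; as mentioned elsewhere in the thread, bigger sites also verify Googlebot's IP range via reverse DNS.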


Yes, this.

It is an affront to the open web to serve one set of content to one person and different content to someone else (unless there is a user experience benefit to the customization I suppose).

I place most of the blame on the publishers for doing the bait and switch, but Google gets some blame too. They used to penalize websites that sent googlebot different results (or at the very least they used to say they did). Now, they seem fine with it.


I dunno, it seems more like there should be a user-configurable setting to hide/show paywalled content.

If you're looking for something, and it's only available behind a paywall (even a paywall you pay for!), how are you going to find it if it's not indexed?


I know of very few sites that let you pay to get zero ads or affiliate links. The ones that let you pay still show you affiliate links.


I'll do that as soon as one-click-to-cancel becomes law. I refuse to subject myself to predatory business practices so they won't see my money until a legislative body starts working on behalf of the people.


It would cost me about $500/month if I subscribe to every paywall that appears in front of me.


I pay for the sites I visit regularly.

But when somebody shares an article with me and I want to see what I’ve been sent, I’m not going to buy a $15 monthly subscription to some small-town newspaper in Ohio just because they’ve decided to paywall their content in that way.


Counterpoint: meet me where I want to spend my money and I will. I'm not giving every publisher on the planet a sub.


Paywalls don't get folks there, however nobly the sellers of information try to brand it.


I wasn't even thinking about paywalls; the first thing I did was check whether cookie banners and "Sign in with Google" popups went away. There are so many user-unfriendly things you constantly deal with that any amount of browsing is just a bad experience without putting up defenses like this.


As much as I circumvent paywalls myself, it does feel like overkill to set up software to do it always. Sites spend money to produce quality content.

Somewhat related comparison: is a human choosing to do this theft really better than a neural network scraping content for its own purposes?


The neural network is not scraping content for its own purposes, it is for the purpose of the people who are running/training it.

And yes, one person reading a piece of content without paying money for it is far, far better than one person/corporation scraping all of the world's content in order to profit off of it.


> Is a human choosing to do this theft really better than a neural network scraping content

Probably so. I think the differentiation is in the scale at which scraping is done for those kinds of systems.


It’s probably about the same. The difference with sites like e.g. Perplexity is that they have a business model which requires “acquiring” said content for free whereas a single person is just a single person.


> Somewhat related comparison, Is a human choosing to do this theft really better than a neural network scraping content for its own purposes?

Here’s a similar comparison: “Is a human recording a movie at the theatre to rewatch at home really better than the one who shares the recording online?”

Seeing as you’re calling it “theft”, presumably what you mean by “better” is “causes less harm / loss of revenue to the work’s author / publisher”.

I’d say the answer is pretty clear. How many people go through the trouble of bypassing paywalls VS how many use LLMs?

Saying “a neural network scraping content for its own purposes” doesn't even begin to tell the whole story. Setting aside that the neural network is unlikely to be doing the scraping itself (rather, it's being trained on it), it's not “for its own purpose”: it didn't choose to willy-nilly scrape the internet, it was ordered to by a human (typically) intending to profit from it.


Why pay a monthly subscription if we're going to be bombarded by legally required popups and other internal promotional stuff that hooks you to the site anyway?


I will never willingly give a journalist my money.


Any journalist, or just specific journalists?

Sure, the journalism industry is progressively being replaced by paid activists, but not all journalists are like this.


I'm curious why. IMHO, they are true heroes.


From my experience, Pi-hole is very easy to set up for this use case: https://pi-hole.net/



