Cloudflare disables access to ‘pirated’ content on its IPFS gateway

neilv · on March 25, 2023

Cloudflare is in the doghouse with me in recent weeks, because they started blocking my Firefox ESR from important Web sites (even when I disable uBlock Origin, FF Advanced Tracking Protection, and DoNotTrack).

Cloudflare is fine with Chromium from the same residential IP address, but I don't want to use Chromium for those Web sites.

I also don't want to have to try to debug the obnoxious behavior of some third-party company that perhaps doesn't care whether it's blocking legit users from its customers.

The Kafkaesque Cloudflare "prove you're a human" infinite loop is hopefully not a foreshadowing of an imminent Internet dystopia, with Cloudflare the vanguard of it.

mike_d · on March 25, 2023

I wish more of the HN crowd would understand that Cloudflare were never the good guys. Their business model has always been to man-in-the-middle the entire internet.

Every site you put behind Cloudflare contributes to the future where they have the singular ability to decide what networks can connect to others.

kube-system · on March 25, 2023

It’s not like that’s a secret, it’s the whole point of their service.

hypertele-Xii · on March 25, 2023

The obvious sometimes needs repeating lest we forget the forest for the trees.

Tozen · on March 26, 2023

You are right, it does need repeating. Centralized control means that at some point, they can decide to abuse that power to do whatever they want, at everyone else's expense. They can decide access, what can be seen, or what the "new" rules are, despite any objections.

MichaelZuo · on March 26, 2023

How could anyone who regularly uses the internet forget that Cloudflare is acting as an intermediary between X and Y?

unmole · on March 26, 2023

> Every site you put behind Cloudflare contributes to the future where they have the singular ability to decide what networks can connect to others.

Except for the fact that you can disable Cloudflare on your site anytime you choose.

paledot · on March 28, 2023

Yeah, but I can't disable it on your site.

kotaKat · on March 25, 2023

I get that loop daily on the latest Chrome with no extensions on a fully patched Mac from a residential IP address that isn't blacklisted anywhere (minus the usual ISP residential IP SMTP blacklist).

Totally frustrating that Cloudflare is basically the Internet's biggest gatekeeper with a mysterious black box behind it that punishes legitimate users.

Tozen · on March 26, 2023

Cloudflare is allowing itself to be used as an enforcement tool for third parties, which quickly becomes a rabbit hole for censorship. When the validity of claims made by third parties doesn't have to be legally proven, it becomes "Do as I tell you to!" Whoever or whatever we say to censor, you do. Instead of going directly to the party they have an issue with, they simply use Cloudflare to do their bidding. They pull the strings, Cloudflare becomes the dumb censor tool. That's also what's going on with Quad9 (DNS resolver). Certain parties don't want to legally prove a claim, they want to force censorship whenever they say so.

> The Kafkaesque Cloudflare "prove you're a human" infinite loop is hopefully not a foreshadowing of an imminent Internet dystopia, with Cloudflare the vanguard of it.

The "prove you're a human" glitch, which lots of users have complained about continuously for a while, should be a really obvious fix. You would think that once proven human, that means access to the site. Somehow, Cloudflare has shown no interest in fixing it. In fact, they appear to have more interest in destroying user privacy and protections, using access to a site as the carrot. A lot of the webmasters seem to be getting overzealous or don't have a clear understanding of how they affect users. Cloudflare appears to not be helping, but rather pushing tools and settings to promote sales, so we get user nightmares like "prove you're a human" infinity glitches.

anacrolix · on March 26, 2023

It's only a matter of time until CloudFlare have a monopoly and so begin to abuse that. See Google Ads and search as examples. I think CloudFlare have already crossed that line, and we'll see them begin to exploit and abuse it more and more as time goes forward.

matt_heimer · on March 26, 2023

It's likely this: https://support.mozilla.org/en-US/kb/firefox-protection-agai...

You might be interested in a plugin to toggle the fingerprint protection: https://addons.mozilla.org/de/firefox/addon/toggle-resist-fi...

See also: https://news.ycombinator.com/item?id=34952279

neilv · on March 26, 2023

Thanks for the links. I actually have had `privacy.resistFingerprinting` set to false.

hydroid7 · on March 25, 2023

OMG, same here!

codetrotter · on March 25, 2023

> That said, this raises the question of how the IPFS gateway is different from Cloudflare’s DNS resolver, which essentially operates as a gateway to the regular Internet. Cloudflare previously said that it will fight copyright-related DNS blockades, even if they’re backed up by a court order.

> Apparently, that’s not the case for IPFS.

But it’s also not the same situation at all.

With DNS, blocking would mean making the whole site unavailable.

With their IPFS gateway they are just filtering individual items.

Caligatio · on March 25, 2023

I'm not a lawyer but my guess is it has to do with the liability of transferring "illegal" content. There's no content crossing Cloudflare servers for DNS resolutions but there definitely is for IPFS.

For IPFS vs CDN, I'm guessing relevant copyright laws give a service provider an out when it's a well-defined user/customer doing the copyright-related redistribution; no such agreement exists for content on IPFS.

EDIT: Trying to read the relevant DMCA sections and precedent makes me happy I'm not a lawyer. Cloudflare was previously sued for copyright infringement (Mon Cheri Bridals, LLC v. Cloudflare, Inc.) and found not guilty. In Cloudflare's own blog about the decision, they said the suit was meritless for several reasons including "our services are not even necessary for the content’s availability online." My guess is there is sufficient ambiguity over whether operating an inter-protocol gateway puts them at increased legal liability.

0xParlay · on March 25, 2023

Particularly curious since they also provide a Web3 gateway. While not copyright related, does hosting/deploying a smart contract make them liable for possible illegal activity under, say, securities and State Dept regs?

Guess I could just try deploying tornado-cash..

junon · on March 25, 2023

> There's no content crossing Cloudflare servers for DNS resolutions but there definitely is for IPFS.

Unless I'm missing some magical internet tomfoolery, that's not the case when you use their DNS proxy service, right?

zauguin · on March 25, 2023

As far as I understand it, Cloudflare's DNS proxy service is for its authoritative DNS service (and therefore only transfers content of Cloudflare customers), while the parent referred to the DNS resolver (1.1.1.1) where arbitrary sites are looked up.

thayne · on March 25, 2023

They probably mean the copyrighted material isn't crossing their servers. Unless you are somehow encoding it in DNS records...

For DNS specifically, it is analogous to saying a phone book shouldn't be liable for listing the phone number of someone who sells pirated DVDs.

Caligatio · on March 25, 2023

I think what you call "DNS proxy service" is what I'm referring to as their CDN. As referenced by a sibling comment, I meant their public recursive DNS resolver service when I said "DNS resolutions."

junon · on March 25, 2023

No, I'm not. I'm talking about the ability to completely hide your IP address from both DNS and network traffic.

Caligatio · on March 25, 2023

Ahh, you mean Cloudflare Tunnel then?

That's their CDN with a twist.

c7b · on March 25, 2023

> With DNS, blocking would mean making the whole site unavailable.

The site is still available, just not through its DNS entry.

> With their IPFS gateway they are just filtering individual items.

They can filter as broadly or as targeted as they want.

subbz · on March 25, 2023

True. For the IPFS case they're not recording the IP addresses for possible blackmail. But that's just an idea.

londons_explore · on March 25, 2023

It's a shame they don't provide a list of blocked content.

A public list of what their gateway will not retrieve for you would fuel the Streisand effect...

sva_ · on March 25, 2023

You can download the SQL tables including the hashes over at LibGen

hxxps://libgen.is/dbdumps/

moreresearchplz · on March 25, 2023

But those hashes are not identical to IPFS hashes as far as I know, sadly!

eurasiantiger · on March 25, 2023

Differential crawling to the rescue!

anecdotal1 · on March 25, 2023

Most people aren't aware that there are also like 20 different valid IPFS hash formats so they have to write code to generate blocks for all those variants or it is trivially bypassed. I had to do this for my IPFS gateway at my job.

capableweb · on March 25, 2023

AKA multibase + multihash AKA "self-describing base encodings" + "self-describing hashes" AKA https://github.com/multiformats/multibase + https://github.com/multiformats/multihash
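To illustrate why one blocked item can show up under several spellings, here is a minimal Python sketch that takes a CIDv0 string and derives two equivalent CIDv1 encodings (base32 and base16). It assumes the common case of a sha2-256 multihash with the dag-pb codec; the function names (`b58encode`, `b58decode`, `equivalent_cids`) are made up for this example and are not part of any IPFS library. A real gateway blocklist would likely need to cover more bases and codecs than the three forms shown here.

```python
import base64
import hashlib

# Bitcoin-style base58 alphabet, which CIDv0 strings use.
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Encode bytes as base58btc (illustrative helper)."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    # Each leading zero byte is written as a leading "1".
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def b58decode(text: str) -> bytes:
    """Decode a base58btc string back to bytes (illustrative helper)."""
    n = 0
    for ch in text:
        n = n * 58 + B58_ALPHABET.index(ch)
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    pad = len(text) - len(text.lstrip("1"))
    return b"\x00" * pad + body


def equivalent_cids(cid_v0: str) -> dict:
    """Return a few textual forms that identify the same content."""
    multihash = b58decode(cid_v0)
    # 0x12 = sha2-256, 0x20 = 32-byte digest length.
    if len(multihash) != 34 or multihash[:2] != b"\x12\x20":
        raise ValueError("expected a CIDv0 with a sha2-256 multihash")
    # CIDv1 bytes: version 0x01, codec 0x70 (dag-pb), then the same multihash.
    cid_v1 = b"\x01\x70" + multihash
    b32 = base64.b32encode(cid_v1).decode("ascii").lower().rstrip("=")
    return {
        "v0": cid_v0,
        "v1_base32": "b" + b32,           # multibase prefix "b" = base32 lower
        "v1_base16": "f" + cid_v1.hex(),  # multibase prefix "f" = base16 lower
    }


# Stand-in digest for demonstration only: real IPFS hashes the encoded
# DAG node, not the raw bytes of a file.
demo_multihash = b"\x12\x20" + hashlib.sha256(b"hello").digest()
demo_forms = equivalent_cids(b58encode(demo_multihash))
for name, text in demo_forms.items():
    print(name, text)
```

All three strings decode to the same multihash, so a filter that compares strings has to normalize them (or enumerate the variants) before matching.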

RobotToaster · on March 25, 2023

I'm guessing this is targeted at libgen, I imagine some of the big publishers put pressure on them.

mhoad · on March 25, 2023

[flagged]

leoc · on March 25, 2023

They likely (IANAL) had a much firmer legal footing in those cases.

leoc · on March 25, 2023

Also, I should add: I don't think Cloudflare is letting IPFS or the IPFS movement or whatever down, here. The protocol was deliberately designed not to force users to touch illegal or otherwise "hot" content against their knowledge or against their wills, and on the other side of that coin it doesn't provide them much to hide behind legally or otherwise if they do host it. That question is effectively out of scope for base IPFS, and Cloudflare isn't doing anything unexpected or especially treacherous in complying with copyright takedown requests.

alwayslikethis · on March 25, 2023

Nazism is not illegal in the US. Copyright infringement is. Easier to stand up for your principles if government doesn't threaten you directly.

mschuster91 · on March 25, 2023

What I don't get is why all the tech companies don't put their money where their mouth is - and do the same that MAFIAA, radical Christians, Big Oil, Big Ag, Big Tobacco etc are doing: massively donate to young, tech focused aspiring politicians' campaigns, donate to the Democratic Party (or progressive parties in other countries), and otherwise engage in barely legal bribery.

We don't really need to wonder why we have octogenarians with zero clue how the Internet works in power (every Congressional investigation about anything tech is enough meme fodder for the rest of the year)... the tech industry might be absurdly large in financial terms, but it is a complete joke in political lobbying.

bauruine · on March 25, 2023

Because they don't care at all about progressive politics and their mouth just says what the most vocal people want to hear. It's all just woke washing.

Mountain_Skies · on March 25, 2023

Am I misreading your comment or are you seriously stating that the tech industry doesn't lavish support on Democrats?

mschuster91 · on March 25, 2023

On the 2022 midterm campaigns, the only tech-focused donor to Democrats was Sam Bankman-Fried of all people [1].

[1] https://thehill.com/lobbying/3720141-here-are-the-biggest-do...

tedivm · on March 25, 2023

Not every tech worker is a billionaire, or even millionaire. I would say more of the tech workers donating are doing it in much smaller amounts, but the aggregate is likely fairly high.

That said most VCs aren't progressive, they're libertarian (until their bank crashes).

yucky · on March 25, 2023

And yet they tried to take down KiwiFarms for less.

ornornor · on March 25, 2023

Who is “they” in that context? Cloudflare? There was no CF or internet in the 30s, I don’t understand your comment.

Traubenfuchs · on March 25, 2023

Man.

https://www.vice.com/en/article/j5yxxg/cloudflare-is-protect...

ornornor · on March 25, 2023

Thanks. Wasn’t aware of this.

wahnfrieden · on March 25, 2023

[flagged]

tedivm · on March 25, 2023

Several actually. I've posted this before, but I've had issues with their security people essentially stalking me after I made some negative comments here awhile ago. It seems most of those people have been fired or have resigned after all the bad press, but for awhile their security team was pretty far right.

wahnfrieden · on March 25, 2023

Can you elaborate?

nibbleshifter · on March 25, 2023

Who?

I used to know a fair number of CF employees a while back.

Most were fairly liberal/left leaning, one was somewhat libertarian-right, but not what I'd describe as a "Nazi".

wahnfrieden · on March 25, 2023

The one that posted “I’m a nazi” among other things while employed there. I'm not being vague, that's the direct quote you can google. They were GNAA president (taking over for weev, the dailystormer admin), there's your ID.

weev also claimed he had another (different) sympathetic insider at cf supporting him with dailystormer or other activities, btw, but that's unverified

nibbleshifter · on March 25, 2023

Which one is that, specifically?

I dislike cloudflare myself, but if you are going to make an assertion about their employees, you should bring a source instead of vagueposting.

greyface- · on March 25, 2023

Jaime Cochran

https://twitter.com/Slendy5127/status/1565764927498903552/ph...

dmix · on March 25, 2023

According to LinkedIn they spent 2yrs at CF as a "Security Analyst" and left 6yrs ago

arp242 · on March 25, 2023

- Cloudflare offers hosting to everyone, including content such as this: Cloudflare are the Nazi fascists of the internet!

- Cloudflare makes decisions who to offer hosting to, refusing to offer hosting to Nazis and the like: Cloudflare are the Nazi fascists of the internet!

No matter what they do, they can never win; someone is always going to complain loudly.

deadbeeves · on March 25, 2023

One thing they can certainly be criticized for is having no consistent criteria for deciding who or what to provide service to.

arp242 · on March 26, 2023

What would that even look like? There are far too many variables and nuances to have a comprehensive set of written rules. That's why we have judges.

deadbeeves · on March 27, 2023

For example, "we will serve anyone who will pay, as long as we don't break the law in doing so". That seems easy enough to do, I'd say. If you can't get a service provider to tell you in clear terms under what circumstances they may cut you off, they're not a provider you should rely on.

arp242 · on March 27, 2023

That would be situation one from the earlier comment, and would end with "Cloudflare are the Nazi fascists of the internet!" from some people.

It's also what the Cloudflare policy is, AFAIK, barring a few notable extreme exceptions which can be counted on one hand – a minuscule portion of their user base. I think it's too strong to say "they're not a provider you should rely on".

deadbeeves · on March 28, 2023

If there are exceptions then they're not consistent.

arp242 · on March 29, 2023

That's just a boring platitude and doesn't contribute anything to any conversation. Things are complex and nuanced.

deadbeeves · on March 29, 2023

No. If Cloudflare's policy really is that they'll service anyone, and then they don't, that's called hypocrisy. Their actions are incongruent with their statements. There's no nuance that needs to be considered to criticize that behavior. They could just as easily be up-front: "we'll service anyone as long as it's convenient to us", or whatever condition they secretly hold.

shrimp_emoji · on March 25, 2023

A common symptom of having too much power

wahnfrieden · on March 25, 2023

They also hire nazis

Thorentis · on March 25, 2023

I'm amazed that cloudflare ever provided a public ipfs gateway to begin with. That said, I don't think ipfs is actually useful for anything besides being an interesting thought experiment. Every problem it aims to solve, another protocol or tool specific to that problem solves it better. There is no real world use case I can think of where I would ever consider using ipfs.

daqhris · on March 25, 2023

Hello from a user of Cloudflare IPFS gateway! Satisfied and happy. For the moment, I use it as a redundant backup of my GitHub-hosted web gallery. https://ipfs.awalkaday.art. In case servers belonging to Github or Microsoft go down, the ipfs version of the same site is accessible. A peer-to-peer solution to hosting media content (in my case, an online photo gallery).

There is actually a conference of ipfs devs and users in Brussels going on (or held not long ago). https://2023.ipfs-thing.io/

Thorentis · on March 25, 2023

Do you know how many Ipfs nodes have actually pinned your content? Have they done so purely because they are ipfs enthusiasts? Do you feel this solution is better than hosting it on a raspberry pi from home or on a cheap VPS?

leoc · on March 25, 2023

There's a misunderstanding here. IPFS' content addressing exists to serve the interests of people who might want to save someone else's content, and of people who might want to find and access content which others have saved, much more than the interests of people who want to upload their own content and wouldn't mind a freebie. (That said, it does also have advantages for uploaders who don't want to administer an Internet-facing Linux box and don't want to fiddle with DNS to move web hosts.)

acdha · on March 25, 2023

That doesn’t seem like a misunderstanding as much as the person you’re replying to has a deeper understanding of how the whole system works. That seems important to ask because as far as I can tell the person they were replying to was under the impression that it provides free hosting.

daqhris · on March 27, 2023

I don't have a place that I own or rent that's called home. I am living as an undocumented refugee, somewhere in Europe. Sorry, I can't rely on physical storage of information (except mobile/portable devices). The place in which I live is not on the list of what I can control. That's mainly decided by government(s) or my social network members a.k.a friends.

I have no idea of who or what pinned the ipfs content. I really don't gather those statistics and have never tried to do so. The only "metric" which grabs my attention is that the content is widely distributed and that it's reliably accessible via more than one way (not dependent on a single point of failure) over large-scale electronic networks/protocols (HTTPS, DNS, IPFS, ETH, ENS..).

rglullis · on March 25, 2023

Why not both? Hosting content on a paid IPFS gateway is cheaper than any VPS ever will because of the enthusiasts, and if you still don't trust it enough you can have your own raspberry pi hosting/pinning your own files.

rgoulter · on March 25, 2023

> Every problem it aims to solve, another protocol or tool specific to that problem solves it better. There is no real world use case I think of where I would ever consider using ipfs.

Insofar as I understand IPFS, the most natural use case I've come across would be serving Nix packages.

That is, my understanding of IPFS is "protocols related to distributing content based on its content". Nix stores all its packages in a nix store, where its address there is a hash of the inputs; but, there are experiments for addressing Nix packages by the package's contents.

https://www.tweag.io/blog/2020-09-10-nix-cas/

On the Nix side of things: a cache-miss from a binary cache would just mean that a package would need to be built from source.

viraptor · on March 25, 2023

It solves some things that have an overlap with torrent 2. If you're interested only in single files, or collections of files which change but remain 99% the same, then ipfs is great and doesn't really have a widespread alternative right now.

3np · on March 25, 2023

How is IPFS better than torrents for changing files, considering that it uses content-based addressing?

viraptor · on March 25, 2023

Changing collections of files, not changing files. Let's say you have a collection of 1000 books. You get two more and publish a collection of 1002 books. With torrent, you'd have a completely separate download. With ipfs all previous users can now share the existing files as part of the new collection without doing anything special.

3np · on March 25, 2023

> With torrent, you'd have a completely separate download

That sounds more like a limitation of your bittorrent implementation. You think seeders have dozens of identical copies of each release hanging around on their drives just to be able to share it on separate trackers through separate torrents? :p

Or is there some new feature in IPFS that's an actual differentiator here? It's been a while since I was fully up to date on protocol development.

viraptor · on March 25, 2023

I don't think you understood the issue. Maybe read it again, or check the change in BitTorrent v2 which provides something similar as per-file trees https://blog.libtorrent.org/2020/09/bittorrent-v2/ (v2 support in clients in the wild is still very low though)

3np · on March 25, 2023

That was a bit tongue-in-cheek, cheers for complementing. Just to clarify, I think it's fair to consider v2 if comparing with IPFS (which doesn't really have a large number of mature implementations either).

> BitTorrent v2 not only uses a hash tree, but it forms a hash tree for every file in the torrent. This has a few advantages:

> Identical files will always have the same hash and can more easily be moved from one torrent to another (when creating torrents) without having to re-hash anything. Files that are identical can also more easily be identified across different swarms, since their root hash only depends on the content of the file.

So I guess the question stands, how is IPFS supposedly the preferred protocol over Bittorrent (v2)?

viraptor · on March 25, 2023

Technically you can do the same thing for the transfer itself in v2. The preference may be for ergonomics where BT requires something like "download this file, from package with this hash, ask these hosts about it", while ipfs is more "yo all, who's got this hash".

You could make the experience similar though if you really wanted. So in theory, no difference. In practice, ipfs solved this years ago, has a nicer interface to it and put updateable naming on top for convenience.

colejohnson66 · on March 26, 2023

> while ipfs is more "yo all, who's got this hash".

Isn’t that the purpose of BitTorrent’s DHT?

wuiheerfoj · on March 25, 2023

It depends a little on how you arrange those files, but with UnixFS (the standard way to arrange them?) the files' contents are arranged into a merkle DAG, so any shared files (and indeed chunks thereof) between the two versions would remain the same and thus still be ‘seedable’ by the parties that have them. It’s similar to BitTorrent in that regard. The root of the content would get a new content address, but all the identical data below it in the merkle DAG would have the same content address as before.
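The 1000-versus-1002 books example from upthread can be sketched in a few lines of Python. This is a deliberately simplified two-level hash tree, not the actual UnixFS or dag-pb encoding: each file gets a SHA-256 address, and the root is a hash over the sorted name/address listing. The names `content_address` and `collection_root` are hypothetical, and real IPFS also splits large files into chunks, which is omitted here.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Address a blob by the SHA-256 of its bytes (simplified stand-in for a CID)."""
    return hashlib.sha256(data).hexdigest()


def collection_root(files: dict) -> tuple:
    """Return (root_address, {name: leaf_address}) for a flat collection."""
    leaves = {name: content_address(body) for name, body in files.items()}
    # The root commits to every (name, address) pair, so adding a file
    # changes the root but leaves the existing leaf addresses untouched.
    listing = "\n".join(f"{name} {addr}" for name, addr in sorted(leaves.items()))
    return content_address(listing.encode("utf-8")), leaves


# A collection of 1000 books, then the same collection plus two more.
old_files = {f"book{i:04d}.txt": f"text of book {i}".encode() for i in range(1000)}
new_files = dict(old_files)
new_files["book1000.txt"] = b"text of book 1000"
new_files["book1001.txt"] = b"text of book 1001"

old_root, old_leaves = collection_root(old_files)
new_root, new_leaves = collection_root(new_files)

shared = set(old_leaves.values()) & set(new_leaves.values())
print("root changed:", old_root != new_root)
print("leaf addresses shared between versions:", len(shared))
```

Anyone holding the old collection can keep serving the shared leaf addresses to people fetching the new root; only the two new files and the new listing need to be fetched from elsewhere.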

rglullis · on March 25, 2023

The one thing missing for me on IPFS is the ability to have ACL rules for access based on gateway id. When I was working next to the people from protocol labs, they mentioned the idea of having lists of friend peers, but that wasn't enough.

buildbuildbuild · on March 25, 2023

Quite the slippery slope. Cloudflare still hosts the IPFS-backed UI to Tornado Cash for example. (OFAC-sanctioned)

blfr · on March 25, 2023

I thought IPFS is rather poorly suited for piracy since the nodes' addresses are public. There is no onion routing or other protection.

Speaking of which, is Freenet still alive?

lordofgibbons · on March 25, 2023

Not sure, but I2P is alive, kicking, and growing!

anthk · on March 25, 2023

I2PD is not for piracy but to contact people in dangerous places.

codetrotter · on March 25, 2023

> We are building network which helps people to communicate and share information without restrictions.

> Free from censorship. Free from privacy violations.

https://i2pd.website/

I2P is neither “for” nor “not for” piracy. I2P is for protecting communications and data from any kind of censorship.

Filtering copyright violating content is a type of censorship. To some it is justified. To others it is not.

To a network protocol like I2P the point is that no kind of censorship should be possible. Regardless of reason.

charcircuit · on March 25, 2023

By that logic torrents are bad for piracy too.

dymk · on March 25, 2023

They are, you have to jump through hoops or live in the right country if you don’t want a DMCA notice from your ISP

WeylandYutani · on March 25, 2023

Get a seedbox. For about 20 Eurodollars a month you can pirate whatever you want 24/7 without having to deal with law firms or government interference.

I saw the writing on the wall when my ISP was forced to block TPB and KAT.

codetrotter · on March 25, 2023

What happens when the seedbox company is raided and their lists of customers and other data is confiscated?

A seedbox is a temporary protection. In the long term it is still a very real possibility that the customers of seedbox companies can get in trouble for their acts of piracy.

Several VPN companies who claimed to keep no logs, were actually keeping logs. https://www.pcmag.com/news/7-vpn-services-found-recording-us...

So too it will turn out that some seedbox companies will have been keeping records about their customers and their activities. For those customers, it will not be a happy day.

kodyo · on March 25, 2023

When you knowingly break the law, you're taking a risk.

It may be a stupid law, it may be an unjust law, but it's still a law and people with guns who enforce the stupid laws might show up some day.

Only individuals can decide whether or not the risk is worth it.

thomastjeffery · on March 25, 2023

Factoring that out is literally the point of having a seedbox.

thomastjeffery · on March 25, 2023

The most appealing feature of bittorrent (for piracy) is that you can do it at home with your own hardware, and not have to pay anyone anything.

Unfortunately, you need to anonymize your traffic (via a VPN service, which costs a fee) if you don't want to be harassed.

dymk · on March 26, 2023

These are exactly the hoops I’m talking about

mort96 · on March 25, 2023

Well, yeah.

loeg · on March 25, 2023

Huh, Cloudflare still fronts big pirate torrent sites. I guess it's not proxying content directly for that use.

denkmoon · on March 25, 2023

Do not use cloudflare if you value a free and decentralised internet.

marginalia_nu · on March 25, 2023

I'd love to not use Cloudflare, but to be real, about 2% of the traffic to my websites are from humans or bots good enough to get past cloudflare.

I can't afford to serve the 98% that is bots. That's about 10 search requests per second, most of them are search queries aimed at poisoning the query log of my search engine, which doesn't exist. But I think they're just scraping for opensearchdescription definitions and spamming every endpoint they can find in the hopes it's wired up to Google or Bing.

Cloudflare are the only ones that seem to offer some sort of (affordable) mitigation against this. They're hiding behind a botnet so rate limiting and ASN/IP blocking does all of bupkis.

The other option is to shut down my website. I believe that world is a worse world than one where stuff gets routed through Cloudflare, although I'm fully aware that this is far from ideal.

What I do to help those who want to avoid cloudflare is offer API access, which means I give you a token so I can rate limit you. Also means less anonymous of course, but it's at least free from men in the middle.
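A per-token rate limit of the kind described above is often built as a token bucket. The Python sketch below is only an illustration under assumed parameters, not a description of how this particular search engine does it; `TokenBucket`, `ApiRateLimiter`, and the `rate`/`capacity` values are hypothetical. Note the two meanings of "token": the API token identifies the caller, while the bucket's tokens are the request budget.

```python
import time


class TokenBucket:
    """Refills at `rate` requests per second, up to `capacity` saved-up requests."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start with a full budget
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Add the budget earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class ApiRateLimiter:
    """Keeps one bucket per API token so each caller is limited independently."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.buckets = {}

    def allow(self, api_token: str) -> bool:
        bucket = self.buckets.get(api_token)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, self.clock)
            self.buckets[api_token] = bucket
        return bucket.allow()


# Example: each API token may burst 5 requests, then 1 request per second.
limiter = ApiRateLimiter(rate=1.0, capacity=5)
results = [limiter.allow("example-token") for _ in range(6)]
print(results)
```

Because the limit is keyed on something the operator issued, it keeps working even when the requests arrive from many different IP addresses, which is the case where IP-based limits stop helping.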

TekMol · on March 25, 2023

I'm trying to understand this.

What benefit do those bots get from "poisoning the query log"? And what benefit do they get from "spamming every endpoint ... wired up to Google or Bing"?

nibbleshifter · on March 25, 2023

Blackhat SEO methods are often a bit fucking daft.

So what's happening here could be one of two things:

Link injection, where by spamming an endpoint with a URL they hope the URL will end up indexed and displayed somewhere on the site, for a search engine to pick up later. Often with some keywords in the link text, etc. The idea is to boost pagerank.

The other is even sillier: they try searching for their own sites a tonne on third party search things that probably query a "real" search engine, to try to artificially boost the supposed popularity/keyword associations of their sites, boosting pagerank.

A lot of blackhat SEO stuff is bunkum and snake oil.

Every so often they figure out a working method to exploit search engines to boost pagerank, and then it gets fixed. But they never stop trying even the most obsolete methods as the automated scripts are what they have available.

BeFlatXIII · on March 25, 2023

> But they never stop trying even the most obsolete methods as the automated scripts are what they have available.

As described on the Security Now podcast, this is now part of "internet background radiation"

MandieD · on March 25, 2023

Oh, so that’s how a very niche amateur radio-related query can turn up a plausible-looking result in Google on a subdomain of some unrelated, not carefully-tended domain, but actually link to a page of spam.

thomastjeffery · on March 25, 2023

The entire category of "blackhat" isn't going through a filter of "this makes no sense, so don't bother".

The only place that filter exists is where people actually communicate openly about their strategies.

marginalia_nu · on March 25, 2023

Honestly, I'm not 100% sure about how the attack vector looks.

The fact of the matter is they're basically spamming very specific search queries and not really looking at the results.

But most larger search engines store and analyze search queries to for example produce typeahead suggestions or error corrections. You can conceivably poison this data to manipulate how a search engine operates, perhaps directing users to your website or affecting ad auctions.

I think the benefit of not directly spamming Google or Bing is that they're harder to identify.

TekMol · on March 25, 2023

How many of these requests do you get per day?

marginalia_nu · on March 25, 2023

Seems to be about a million queries/day right now. Varies a bit though. Peak figures are probably in the neighborhood of 10 mn/day.

seszett · on March 25, 2023

They try to make some searches seem more popular than they actually are, maybe to manipulate the rates of some ad keywords.

tedivm · on March 25, 2023

I've been running my own website since the early 2000s and while there have been occasional issues I've never felt the need to use cloudflare.

marginalia_nu · on March 25, 2023

Probably depends on what sort of website you're running. I wouldn't use it for my blog or anything that's reasonably static.

But this is a publicly available Internet search engine we're talking about. There's slightly more processing involved in processing a search query than loading a static file off a filesystem.

jrochkind1 · on March 25, 2023

Same for me, what you said.

anaganisk · on March 25, 2023

The main value proposition for non-enterprise customers is easy DDoS protection, edge caching of assets, DNS, domains and, recently, Cloudflare Workers, which allow easy deployment of static sites. Webmasters and developers are easy prey for at least the DDoS protection and caching; their product is too good to avoid for many. I personally have a problem with Cloudflare becoming a gatekeeper, but at the end of the day, the value proposition is too good vs setting these things up and managing them yourself. Too many websites I use every day have Cloudflare in front, and I just can't avoid them. Only because fighting spam is difficult for many.

koheripbal · on March 25, 2023

Exactly. Without Cloudflare, preventing DDoS is extremely time-consuming and expensive.

tristor · on March 25, 2023

I value this, which is why I use Cloudflare. It’s the only reasonable economical way to self-host my site in the current time while following security best practices.

With TLS it’s arbitrarily easy to overwhelm a target site that’s hosted on the smallest instance types at $provider due to the compute requirements of terminating TLS. Thankfully Cloudflare + LetsEncrypt makes it economical to host a personal site with good security without arbitrarily and suddenly high bills, constantly high bills, or my site arbitrarily disappearing off the web.

Bad actors poisoned the well (thanks China), and now here we are. I, for one, appreciate greatly what Cloudflare has done for the indieweb.

Vecr · on March 25, 2023

Could you connect a bigger server at your house (or wherever) to a small VPS over WireGuard, and then not terminate TLS at the VPS? Use haproxy in TCP mode or similar to copy the still-TLS-encrypted TCP connections down the WireGuard pipe back to your house. In that case no one can DoS/DDoS your house (because they don't know your IP), and you can also throttle the connections going down the WireGuard pipe; see CAKE[0], "Outbound, General Case", but with wg0 instead of eth0, done on the VPS.

[0] https://www.bufferbloat.net/projects/codel/wiki/CakeRecipes/
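
For anyone who hasn't seen TCP-mode proxying, here is a rough Python sketch of the idea, not haproxy itself: the VPS accepts a connection and copies the bytes to a backend without ever decrypting them, so the TLS keys stay on the home server. The function names `pipe` and `start_passthrough` and the addresses in the usage comment are made up for illustration, and a real deployment would add timeouts, connection limits and logging.

```python
import asyncio


async def pipe(reader, writer):
    """Copy bytes in one direction until EOF, then close the write side."""
    try:
        while chunk := await reader.read(65536):
            writer.write(chunk)
            await writer.drain()
    except ConnectionError:
        pass  # peer went away; nothing left to forward
    finally:
        writer.close()


async def start_passthrough(listen_host, listen_port, backend_host, backend_port):
    """Start a TCP server that relays opaque bytes to the backend.

    Nothing here parses or terminates TLS; the payload is forwarded as-is.
    """

    async def handle(client_reader, client_writer):
        try:
            backend_reader, backend_writer = await asyncio.open_connection(
                backend_host, backend_port
            )
        except OSError:
            client_writer.close()  # backend unreachable: drop the client
            return
        # Relay both directions concurrently until each side hits EOF.
        await asyncio.gather(
            pipe(client_reader, backend_writer),
            pipe(backend_reader, client_writer),
        )

    return await asyncio.start_server(handle, listen_host, listen_port)


# Hypothetical usage on the VPS, with 10.0.0.2 standing in for the home
# server's address on the WireGuard interface:
#
#   async def main():
#       server = await start_passthrough("0.0.0.0", 443, "10.0.0.2", 443)
#       async with server:
#           await server.serve_forever()
#
#   asyncio.run(main())
```

The throttling part of the suggestion (CAKE on wg0) happens at the network layer and is not shown here.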

tristor · on March 25, 2023

> Could you

Yes. I could.

This is one of several possible solutions available to me as an alternative. All alternative solutions to this problem either:

1. Are more expensive to implement, for various values of "more expensive".

2. Are more complicated to implement, for various values of "more complicated".

3. Are less resilient to the problem space, for various values of "less resilient".

In some cases, they are all three. The proposal you made happens to be all three.

If I were to do things all over again from scratch /today/, I might do things differently, simply because I'm more well off and have a more stable physical location than when I set up the last iteration of my personal site (which is still running), but none of the possibilities I can think of are strictly "better" than what I have already done, leveraging Cloudflare.

nake · on March 25, 2023

Highly recommend the CrimeFlare tool for determining the “real” IP of sites so you may bypass CloudFlare and connect to sites directly.

webmobdev · on March 25, 2023

Thank you! Been searching for a tool like this. It needs to be a browser extension to be more awesome.

bioemerl · on March 25, 2023

If you want me to not use Cloudflare, you're going to have to provide a solution to the fact that dealing with a DDoS is my responsibility.

I can't host a website from my home internet without the risk of getting black holed.

Can't ask the website for my home internet because Comcast won't let me.

Providers will charge you out the ass for bandwidth if someone ddoses you.

The internet just isn't a sustainable place for self-run websites now.

akkartik · on March 25, 2023

I'm pretty sure there are hosting services who can just take your site offline if you exceed your bandwidth quota. AWS makes this complex, so just don't use AWS.

I wish residential providers would support a second level of quota management. Assign 90% of my bandwidth to this website, then take it down for the month without completely killing my internet.

So this isn't insurmountable. You don't need a centralized provider for protection from decentralized abuse.
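
The second-level quota idea is simple enough to sketch in Python. This is only an illustration of the bookkeeping, not something any provider is known to offer; the class name `SiteQuota`, its fields and the 90% default are assumptions taken from the example above.

```python
from dataclasses import dataclass


@dataclass
class SiteQuota:
    """Track website traffic against a slice of the monthly allowance."""

    monthly_cap_bytes: int         # total allowance for the connection
    site_share_percent: int = 90   # slice reserved for the website
    site_used_bytes: int = 0       # website traffic seen so far this month

    @property
    def site_budget_bytes(self) -> int:
        # Integer arithmetic avoids floating-point rounding surprises.
        return self.monthly_cap_bytes * self.site_share_percent // 100

    def record(self, nbytes: int) -> None:
        """Account for nbytes of traffic served by the website."""
        self.site_used_bytes += nbytes

    def site_online(self) -> bool:
        """The site stays up only while it is under its own budget."""
        return self.site_used_bytes < self.site_budget_bytes

    def reset(self) -> None:
        """Call at the start of each billing month."""
        self.site_used_bytes = 0
```

Once `site_online()` turns false the site goes dark for the rest of the month, while the remaining 10% of the allowance is still there for everything else on the connection.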

vntok · on March 25, 2023

> I'm pretty sure there are hosting services who can just take your site offline if you exceed your bandwidth quota

Having my website instant-killable "for fun" by any bored script kiddy living in a country thousands of km away from me is not a particularly interesting proposition.

akkartik · on March 25, 2023

Have you ever actually had this problem? I haven't in 20 years of hosting a site.

naet · on March 25, 2023

I have had it happen many times for large sites. Always assumed it was some overseas business competitor or extortionist but we've been able to get things back up quickly or prevent it from bringing the site down completely.

There's also tons of sites that get the "HN hug of death" when they're posted here (or Reddit, or wherever else that generates a surge in traffic). Tragic to miss out on all those potential hits and easy to prevent with Cloudflare free.

You can dislike Cloudflare policies, but they do provide a valuable service and often completely free of charge. That's exactly why they have such a large slice of the internet under their domain. People don't just opt in for no reason.

akkartik · on March 25, 2023

> There's also tons of sites that get the "HN hug of death" when they're posted here (or Reddit, or wherever else that generates a surge in traffic). Tragic to miss out on all those potential hits and easy to prevent with Cloudflare free.

Sure. But I was originally responding to a comment that claimed the key reason to use Cloudflare was DDoS protection. If the traffic is real, it's not too unreasonable to pay for it.

Using any service comes with costs and benefits. Any reason you have to use a service irrespective of their behavior is just a blind spot.

vntok · on March 25, 2023

What alternative would you recommend to their security suite of tools? Note that the alternative has to "just work", be efficient, require minimal maintenance, and be free or close-to free to use (< $100/mo). Otherwise, it's just not competitive.

inimino · on March 25, 2023

The argument that a loss-leader or market takeover strategy should be resisted doesn't have to justify itself by claiming that an equally convenient alternative exists.

vntok · on March 25, 2023

A similar-enough alternative should exist in the first place, otherwise you're merely talking funny thoughts that have no bearing in the real world. Which is fine, it just needs to be explicitly conveyed.

Why would you expect customers to migrate someplace else at all if there's no alternative?

inimino · on March 25, 2023

The alternative before cloudflare existed was always DIY.

ehutch79 · on March 25, 2023

Can you share the DIY alternatives for DNS, DDoS protection, etc? At least the core cloudflare functions.

Of course, having DDoS protection on the machine being flooded is a non-starter.

DNS would have to be highly available as well.

And it all needs to be maintainable by someone whose main job is not maintaining infrastructure, and who doesn't have the expertise. Assume hiring someone to do so is not in the budget.

toast0 · on March 25, 2023

Lots of low cost/free DNS primary/secondaries are out there. Personally, I use dns.he.net as my secondary, but they also offer primary. You can usually get DNS service for free from your domain registrar, although I'm not a fan of coupling those two services (it becomes much more difficult to switch registrars if your registrar also provides you other services)

DDoS management is harder. You just kind of need to assess your risk and take appropriate steps given your risk. If you're likely to attract real, determined, attacks, you need a good solution.

If you're going to just get bored idiots that control a lot of bandwidth, but don't have a real beef, accepting that you'll get null routed for some time when you get attacked is really the least hassle option. Get as big of an incoming connection as you can justify, make sure you can discard bogus packets at line speed, take steps so that you don't amplify responses, and cross your fingers.

Be aware that moving to new hosting while under attack can be difficult (most hosts do not like customers that attract DDoS, especially new customers), so if you do move, communicate the current situation to the new host beforehand.

If you have important services (APIs or what not), offer them on hostnames other than www or your apex domains. Idiots flooding random people for lols really like to hit www, and ignore your other hostnames.

ehutch79 · on March 25, 2023

This isn't really what a loss leader is. That's the Razor/Blades thing. Sell an xbox at a slight or moderate loss, so that you can make up margin on pricy games.

Free basic services aren't a loss leader, they're paid for by the enterprise contracts.

toast0 · on March 25, 2023

Loss leader gets you in the door, so you can sell the stuff with margin. For microsoft sure, that's the system/games model. For a grocery store, it's specials vs everything else. For cloudflare, it's free tier vs enterprise.

The loss leaders are always paid for elsewhere, or at least that's the plan.

greyface- · on March 25, 2023

SSL added and removed here! :-)

yencabulator · on March 25, 2023

Just like every other CDN and cloud service you use to host your services.

Also, https://www.cloudflare.com/ssl/keyless-ssl/

vntok · on March 25, 2023

Things have changed quite a lot over the past 10 years.

concinds · on March 25, 2023

Where's the centralization concern? Libgen sites can just switch to non-Cloudflare IPFS gateways. I don't see how this move causes any harm. Cloudflare's decision impacts nothing, precisely because IPFS is decentralized.

nvr219 · on March 25, 2023

So are you `dig ns`ing every website before you visit it or what
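
For what it's worth, the check itself is easy to script. A rough Python sketch, assuming `dig` is installed and relying on the fact that Cloudflare-assigned nameservers live under `ns.cloudflare.com`; the function names are made up for illustration. It is only a heuristic: a site can sit behind Cloudflare's proxy via a CNAME setup without using its nameservers, and a site can use Cloudflare DNS without being proxied.

```python
import subprocess

CLOUDFLARE_NS_SUFFIX = ".ns.cloudflare.com"


def uses_cloudflare_dns(nameservers):
    """Return True if any NS record points at a Cloudflare nameserver."""
    for ns in nameservers:
        # dig prints fully qualified names with a trailing dot.
        name = ns.strip().rstrip(".").lower()
        if name.endswith(CLOUDFLARE_NS_SUFFIX):
            return True
    return False


def lookup_nameservers(domain):
    """Shell out to `dig +short NS <domain>`; requires dig on PATH."""
    result = subprocess.run(
        ["dig", "+short", "NS", domain],
        capture_output=True,
        text=True,
        check=True,
        timeout=10,
    )
    return result.stdout.split()


# Hypothetical usage (performs a real DNS query):
#   print(uses_cloudflare_dns(lookup_nameservers("example.com")))
```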

counttheforks · on March 25, 2023

The cloudflare "Are you even human?!" dialog is clear enough

ehutch79 · on March 25, 2023

I almost never see those. I definitely haven't seen one in the last 5 or so years.

Not saying you don't. Saying your experience is not everyone's experience here.

almost_usual · on March 25, 2023

I only see it for ChatGPT.

favaq · on March 25, 2023

Perhaps this could be achieved by blacklisting the cloudflare parent certificate in your browser?

GartzenDeHaes · on March 25, 2023

It looks a lot like a billion dollar a year protection racket. Who else has a financial incentive for all these ddos bots other than cloudflare?

ehutch79 · on March 25, 2023

You don't think anyone else has any reason to launch a DDoS? That it's all done, even partially, by Cloudflare? That's a hell of an accusation.

slackfan · on March 25, 2023

People have run experiments, Cloudflare is absolutely a protection racket.

ehutch79 · on March 26, 2023

There’s evidence that cloudflare is what, threatening to ddos people if they don’t sign up?

What would need an experiment to prove that? You would just have the proverbial receipts

Or do you have proof that they’re actually DDoSing people?

slackfan · on March 26, 2023

Step 1: start up a noname site, ensure you are not indexed, do not advertise

Step 2: Enable cloudflare, paid tier, hang around with it turned on for a month or so

Step 3: Simulate some usage traffic

Step 4: Cancel your cloudflare contract

Step 5: Watch your unknown site get ddosed

Coincidence? Suuuuuure

Cloudflare itself doesn't need to DDoS you; there are plenty of DDoS-for-hire outfits out there that will happily do it.

As to who hires them, well that's between them and their customers. But there are some things that are not coincidences.

ehutch79 · on March 26, 2023

Ok, if people have already done this, please link to their findings. This is a big deal if true.

ehutch79 · on March 26, 2023

Please provide links.

tpmx · on March 25, 2023

This will probably hit winworldpc.com :(

rnd0 · on March 26, 2023

To be fair, IPFS has always been flakey there. For the past year I haven't even bothered trying it.

rasengan · on March 25, 2023

IPFS was also one of the ways LLAMA was being well distributed. This is really a sad day for “people”

1attice · on March 25, 2023

I thought of this immediately too.

I know that Meta is doing its level best to wipe LLaMA's weights from the face of the Internet, but I also suspect (and cannot prove) that various organs of the US state (CIA, NSA, FBI) are also trying.

If they're not, I'm even more afraid, because one of their most important jobs is being alarmed so I don't have to be.

I'd wager a few dollars that Cloudflare had a visit.

And, as an aside, I'd say it's a matter of when, not if, a GPT-4-scale model is leaked, possibly GPT-4 itself. If not to the open Internet, then to China.

OpenAI is (for now) staffed by humans, and humans are vulnerable to seduction, spearphishing and spycraft.

Zuiii · on March 28, 2023

lol what are you talking about? The weights are being linked to on Meta's repo right now: https://github.com/facebookresearch/llama/pull/73/files

The idea that Meta is frantically running around trying to "wipe" these weights is ridiculous. They're either doing this to disclaim liability in case a new law imposes it retroactively, or they are trying to create uncertainty around its use to dissuade serious US competitors from using it for their own products.

> but I also suspect (and cannot prove) that various organs of the US state (CIA, NSA, FBI) are also trying.

I would be very surprised if they were, given that it would only handicap the US while leaving other countries to advance unabated. Again, the only real thing Facebook did here is show that we can have smaller models that are competitive with larger ones. The information to build and train these models is public. The code to do this is public. The capability to do so is public.

The cat is out of the bag and humanity is better off for it. What humanity needs to do now is figure out how best to adapt.

1attice · on March 29, 2023

The tracker for that torrent was taken offline by Meta. There was a story about it on HN a while ago. Don't believe me? Try to torrent.

Do homework. SMH.

Dwedit · on March 25, 2023

IPFS is basically bittorrent. Not participating makes perfect sense.

anacrolix · on March 26, 2023

That's a discredit to BitTorrent.

JeremyNT · on March 25, 2023

I never understood why this was allowed for so long.

IPFS is primarily used to store infringing content, yet few users interact with IPFS directly. Instead, they usually rely on Cloudflare, which basically serves as a free proxy.

It's like the early days of mega or something, just with a layer of indirection that serves as plausible deniability.

Why Cloudflare bothers to provide this presumably costly service at all is confusing to me. I guess maybe they drank the crypto kool-aid and thought that it might some day be profitable?

nyolfen · on March 25, 2023

ipfs is not a crypto project

acdha · on March 25, 2023

Kind of but there’s a fairly large community overlap because they were trying to use FileCoin to support development, and a lot of cryptocurrency projects wanted to use it to work around technical limitations (e.g. many NFTs referenced IPFS gateways) to the point that I’ve had a number of marketing people use it as an example when asked for an example of a “web3” project which does something useful.

nyolfen · on March 25, 2023

web3 is not crypto either (crypto is merely a constituent metaproject). these are important distinctions, like saying web2 is a database or something

acdha · on March 25, 2023

web3 is a marketing term VCs started boosting when they realized that cryptocurrency had a bad reputation which was deterring sales. Unlike Web 2.0 it wasn’t giving you new capabilities but trying to build demand for things which had already failed in the market.

nyolfen · on March 26, 2023

it is a floating signifier for a variety of projects and technologies, some of which include crypto -- the unifying goals are decentralization and data ownership but these do not require crypto. if you think it is just a marketing gimmick (which it is, among many other things), i will let you continue to enjoy fb et al

acdha · on March 26, 2023

It’s a cryptocurrency marketing campaign. The people pushing the term all got their money from cryptocurrency speculators, the apps being promoted all require cryptocurrency payments, and they share the similar poor technical and economic understanding of the fields they’re trying to enter.

That doesn’t mean I love the way that Facebook influences the world – quite the opposite, but that means I have even less tolerance for people hawking get rich quick scams as the solution since they’re distracting attention and resources away from things which can possibly work. It’s similar to how cryptocurrency salespeople talk about “the unbanked” as if the solution to that problem is to make a bunch of rich people in the developed world even richer in the hopes that they’ll share some of that largesse.

miohtama · on March 25, 2023

IPFS is also not used to store content. It’s a peer-to-peer distribution network similar to BitTorrent.

JeremyNT · on March 25, 2023

But filecoin is.

IPFS is crypto adjacent and got plenty of hype because of it.

nyolfen · on March 25, 2023

bitcoin uses sha256, is that crypto?