New standards for a faster and more private Internet (cloudflare.com)
170 points by terrelln 5 months ago | 84 comments



ECH - if I understand correctly it's effective for sites hosted on big providers like Cloudflare, AWS, etc., but doesn't add much value when it comes to self-hosted domains or those on a dedicated server, as you'd still see traffic going to whatever IP and be able to infer from that which domain the user's browser is talking to. I'm hoping someone can explain what I missed.

And while we're explaining things... ODoH (indirectly mentioned in the article via the Encrypted DNS link) comes with a big bold warning that it's based on the fundamental premise that the proxy and the target servers do not collude. When both are operated by the same company, how can you know they aren't colluding? Is there some mechanism in the protocol to help protect users from colluding servers?


> When both are operated by the same company, how can you know they aren't colluding?

You don't. At best the client can check domain names and IP addresses, but that's hardly a guarantee.

To solve that problem, you can combine multiple parties. For example, you can use https://odoh1.surfdomeinen.nl/proxy as a proxy (operated by SURF [1]) to use the Cloudflare servers for lookup.

I think for ODoH to work well, we need a variety of companies hosting forwarding services. That could be ISPs, Google/Microsoft/etc. or some kind of non-profit.

[1]: https://www.surf.nl/en


> That could be ISPs, Google/Microsoft/etc. or some kind of non-profit.

Or Apple[1,2].

[1] Oblivious DNS over HTTPS, https://www.ietf.org/rfc/rfc9230.txt

[2] About iCloud Private Relay, https://support.apple.com/en-us/102602


I don't know the implementation details, but it should be doable in a way that degrades back into encrypted DNS, where at least you get rid of a MitM. Someone else already mentioned that making sure the two servers have different owners may help, but if people are after you it's probably not enough.

I'm thinking that maybe I'd like to be able to avoid mentioning the server I'm interested in, and simply send a hash of it (you could truncate the hash to a prefix such that a bunch of matches are found, but not too many). Roughly like the sketch below.
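
A toy sketch of that idea (purely illustrative: the prefix length is an arbitrary choice, and FNV-1a is just a stand-in for the cryptographic hash a real scheme would use, much like Safe Browsing's truncated SHA-256 prefixes):

    /* Toy sketch of the truncated-hash idea: instead of sending the full
     * hostname, send only the first PREFIX_BITS bits of its hash, so the
     * resolver can only narrow it down to every domain sharing that
     * prefix. FNV-1a is a stand-in; a real scheme would use SHA-256. */
    #include <stdio.h>
    #include <stdint.h>

    #define PREFIX_BITS 16  /* shorter prefix => more candidate matches */

    static uint64_t fnv1a(const char *s) {
        uint64_t h = 0xcbf29ce484222325ULL;   /* FNV-1a 64-bit offset basis */
        for (; *s; s++) {
            h ^= (uint8_t)*s;
            h *= 0x100000001b3ULL;            /* FNV-1a 64-bit prime */
        }
        return h;
    }

    int main(void) {
        const char *host = "example.com";
        uint64_t prefix = fnv1a(host) >> (64 - PREFIX_BITS);
        /* The client sends only `prefix`; the server replies with records
         * for all domains matching it, and the client picks its own. */
        printf("query prefix for %s: %04llx\n", host,
               (unsigned long long)prefix);
        return 0;
    }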


Yes, that's correct about ECH. In general, there's no real way to conceal your browsing behavior if you are connecting to an IP address that isn't shared. So either you use ECH to something like Cloudflare or you connect to some proxy/VPN/etc. so that the local network can't see the final IP address.


> ECH - if I understand correctly it's effective for sites hosted on big providers like Cloudflare, AWS, etc, but doesn't add much value when it comes to self-hosted domains or those on a dedicated server

Yeah, and unfortunately it increases the moat such companies have. They can offer a privacy screen that smaller orgs just can't match.


"This means that whenever a user visits a website on Cloudflare that has ECH enabled, no one except for the user, Cloudflare, and the website owner will be able to determine which website was visited. Cloudflare is a big proponent of privacy for everyone and is excited about the prospects of bringing this technology to life.'

This isn't privacy. This is centralized snooping.

It's like Google's approach to third party cookies. Nobody other than Google can have tracking information.


> This isn't privacy.

It will be when everyone adopts ECH. It's a fantastic start.


Doesn't Cloudflare decrypt and MITM all traffic? In that case they can snoop on everything.


Only true when the entire Internet is routed through Cloudflare. (A majority of it probably will be, and that's surely what the GP meant.)


Another HN hot take about the Cloudflare bogeyman.

The CDN can't give you content you're asking for without knowing which content you're asking for.

This improvement prevents your ISP and the government from reading your packets to get that same information.


Especially since, as another top comment put it, ECH only gives privacy benefits if the serving IP is serving multiple domains.

I'm all for being wary of large-scale consolidation, but I feel like these lazy gripes aren't assessing the pros and cons dispassionately.


The internet is moving towards a place where it might not be possible to self-host anything important without getting DDoS'd. Companies like Cloudflare provide a solution to this problem, but that also creates a crutch that means no effort is expended to solve the problem at the root, which means the day may come when you don't have any option left other than Cloudflare.

I think these are important issues and worth talking about.


Those issues are absolutely worth discussing, in a reasonable way. Cloudflare isn't the bad actor perpetrating these DDoS attacks, and they aren't forcing website operators to use their services either.


They don't need to be a bad actor. They just need to be big enough and follow their incentives.

Companies aren't binary good or bad. They go through a lifecycle. Today's young and scrappy startup fighting for the people and the CEO making house calls is tomorrow's big tech with AI chat support.

It's worth noting where a company is in its lifecycle, and what the world is likely to look like if it continues to grow.


> Cloudflare isn't the bad actor perpetrating these DDoS attacks

How do we know that? Who else benefits from most of them?


What makes you believe Cloudflare wouldn't do this? They may have state-actor employees or be compelled by a government to surveil users.


So now the government needs to compel a corporation to hand over some data, because they are no longer able to read it straight off the wire like they could before. That sounds like a significant improvement to privacy.


People trafficking drugs into Australia were using a secure, encrypted messaging service developed by a private third party provider.

They eventually found out that the third party provider was in fact the Australian Federal Police, reading all their messages in clear and in real time.

The government only needs to compel a corporation if that corporation has an adversarial relationship with them.

We have tried the centralized model pushed by Cloudflare before; it was called the Minitel.


Sounds like an interesting story. Do you have a link to a summary?


> The CDN can't give you content you're asking for without knowing which content you're asking for.

Maybe some PIR (private information retrieval) protocol can also eventually change this (if the users and Cloudflare don't mind the computational and network overhead!).


The latest Zstandard exposes several parameters which are useful for reducing time-to-first-byte latency in web compression. They make Zstandard cut the compressed data into smaller blocks, e.g. 4 KB, with the goal of fitting a compressed block within a small number of packets, so the browser can start to decompress without waiting for a full 128 KB block to be sent.

These parameters are described in the v1.5.6 release notes [0]. ZSTD_c_targetCBlockSize is the most notable, but ZSTD_c_maxBlockSize can also be used for a lower CPU cost but larger compressed size.
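
For reference, a minimal sketch of setting this up through libzstd's public API (zstd.h); the 4 KB target is just an example value and error handling is abbreviated:

    /* Minimal sketch: configure a compression context to target ~4 KB
     * compressed blocks for lower time-to-first-byte. */
    #include <stdio.h>
    #include <zstd.h>

    int main(void) {
        ZSTD_CCtx *cctx = ZSTD_createCCtx();
        if (cctx == NULL) return 1;

        /* Cut compressed data into ~4 KB blocks so each block fits in a
         * few packets and the browser can start decompressing early. */
        size_t ret = ZSTD_CCtx_setParameter(cctx, ZSTD_c_targetCBlockSize, 4096);
        if (ZSTD_isError(ret)) {
            fprintf(stderr, "%s\n", ZSTD_getErrorName(ret));
            return 1;
        }
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, 3);

        /* ... then stream the response through ZSTD_compressStream2(),
         * flushing (ZSTD_e_flush) at whatever granularity the server
         * sends data. */
        ZSTD_freeCCtx(cctx);
        return 0;
    }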

Are you using these features at Cloudflare? If you need any help using these, or have any questions, please open an issue on Zstandard's GitHub!

[0] https://github.com/facebook/zstd/releases/tag/v1.5.6


> Zstandard’s branchless design is a crucial innovation that enhances CPU efficiency

Given how branchless algorithms are helping optimize not just network transport (compression) but even OS system libraries (no citation for this one, but I've heard), I really wish colleges would begin teaching this alongside DS/Algo course material.


what’s DS stand for? discrete structures?


Data Structures probably

Data Structures & Algorithms is also sometimes abbreviated as DSA


As the sibling comment mentioned already: Data Structures & Algorithms.


New standards for easier TLS fingerprinting and user-agent discrimination.

Edit: just look at how many sites you're locked out of if you don't have JS enabled or run an uncommon configuration.


DPI systems in Turkey weren't even checking QUIC packets when I was there, let alone ECH. But browsers usually connect over TCP first and only then negotiate QUIC support, which prevented bypassing the web blocks. If you could force your browser to establish the connection directly over QUIC, you could bypass all the blocks. That was last year though. Not sure about the current situation.


They usually won't bother with blocking if the site owner just hosts the site under a different domain name. It could be like news.ycombinator1923.com or something. wink


The benchmark for Zstandard against Brotli seems to omit a key piece of information: the compression levels used for both algorithms, since both the compression ratio and the compression time depend on them. In fact this has been my long-standing suspicion about introducing Zstandard to the web standard, because the lower compression levels for Brotli are not that slow, and it was never publicly mentioned whether improving the lower Brotli levels was deemed infeasible or not. Given that Zstandard Content-Encoding was initially proposed by Meta, I'm not even sure they tried.

Given that we now have two strictly better algorithms than gzip, I also wonder about a hybrid scheme that starts with Zstandard but switches to Brotli once the compression time is no longer significant for a given request. We might even be able to cheaply convert the existing Zstandard stream into Brotli with some restrictions, as they are both really LZSS behind the scenes?


Meta drove the Zstandard content encoding, but Google drove the adoption of Zstandard in Chrome.

The faster Brotli levels could probably be made to match Zstandard’s compression speed. But we’ve invested a lot in optimizing these levels, so it would likely take significant investment to match. Google is also contributing to improving the speed of Zstandard compression.

A cheaper conversion from Zstandard to Brotli is possible, but I wouldn't really expect an improvement in compressed size. The encoding scheme impacts how efficient an LZ coding is, so for Brotli to beat Zstandard, it would likely want a different LZ than Zstandard uses. The same applies to a conversion from Brotli to Zstandard.


The conversion is meant to avoid the expensive backreference optimization when it has already been done once by Zstandard, because you can't prepend a Zstandard bitstream with a Brotli bitstream without turning one into the other. But well, I think such a hybrid scheme is hard to make when latency is at stake.


What will ECH mean for places like China or South Korea? Do governments have access to Cloudflare logs? Only with court orders?

ECH seems directly opposed to the Chinese government's control of the web.


I think you meant North Korea, not South.

It means nothing. Countries always ask nicely first for a domain to be blocked for IPs from their countries. Companies like Cloudflare or Akamai can either honor the request, or find their IP range blocked (yes, including all the other serviced domains). They usually take the first option.


> I think you meant North Korea, not South.

South Korea is infamous for their internet censorship.

https://en.wikipedia.org/wiki/Internet_censorship_in_South_K...


There are many more countries enforcing some limitations on the internet, and ECH will just turn passive DPI into active court orders, I believe. At least explicitness is better.


Cloudflare is happy to make it harder for anyone other than Cloudflare to see everything that you're doing on the internet.


Don't trust cloudflare with standards control.

They do not have anybody else's best interests at heart and are actively centralizing that which was explicitly intended to not be centralized.


I use Tor for privacy.

CF blocks Tor; you can't get past the captcha.


Pffff you think you're edgy and hardcore.

I use the elite hacking tool known as FIRE FOX and I get CF-gatekept all the time.


A very nice feature of zstd is that it is seekable. So you could map that to HTTP Range requests and go crazy about it.


I don't think so? It's only seekable with an additional index [1], just like most other compression schemes. Having an explicit standard for indices is definitely a plus though.

[1] https://github.com/facebook/zstd/blob/dev/contrib/seekable_f...


>just like most other compression schemes

Source?


Uh, I'm confused why you'd expect most compression schemes to be seekable in the first place.

I assume seeking implies a significantly faster way to skip the first N bytes of the decompressed output without doing the actual decompression. "Significant" is here defined as a time complexity strictly less than linear in the size of the compressed or decompressed data in that skip, so for an RLE sequence like `10a90b`, seeking to the 50th byte should be able to avoid processing `10a` at all; reading `10a` but not actually emitting 10 copies of `a` doesn't really count.

Virtually every compression algorithm has two main parts, modelling and coding. Modelling either transforms the input into a more compressible form, or estimates how likely a particular portion of the input is. Those results are then coded into one or more sets of symbols with different probabilities, resulting in a compressed bit sequence. Both parts can be made as complex as needed and will likely destroy any relation between input and output bytes. Many algorithms, including Zstandard, do come with some framing scheme to aid data verification, but such framing is generally not enough for actual seeking unless each frame declares both its uncompressed and compressed sizes and frames can be decompressed independently of each other. That's what Zstandard's seekable format actually does: both sizes for every frame, and an implicit promise that a new frame is generated every so often. (Zstandard frames are already defined to be independent.)
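
To make that concrete, here's a rough sketch (in the spirit of, but not, the contrib library's actual API) of how a decoder can use such a per-frame size table to locate the frame containing a given uncompressed offset:

    /* Sketch of seeking with a frame index: given per-frame (compressed
     * size, decompressed size) entries, find which frame holds the
     * uncompressed offset `target` and where that frame starts. */
    #include <stdio.h>
    #include <stdint.h>

    typedef struct {
        uint64_t csize; /* compressed size of the frame */
        uint64_t dsize; /* decompressed size of the frame */
    } FrameEntry;

    /* Returns the index of the frame containing `target`, and sets
     * *coff / *doff to that frame's starting offsets in the compressed
     * and decompressed streams respectively. */
    static size_t find_frame(const FrameEntry *idx, size_t n,
                             uint64_t target,
                             uint64_t *coff, uint64_t *doff) {
        uint64_t c = 0, d = 0;
        for (size_t i = 0; i < n; i++) {     /* linear scan; a real   */
            if (target < d + idx[i].dsize) { /* impl would use binary */
                *coff = c; *doff = d;        /* search on prefix sums */
                return i;
            }
            c += idx[i].csize;
            d += idx[i].dsize;
        }
        *coff = c; *doff = d;
        return n; /* past the end */
    }

    int main(void) {
        const FrameEntry idx[] = { {300, 1024}, {512, 1024}, {100, 1024} };
        uint64_t coff, doff;
        size_t f = find_frame(idx, 3, 2100, &coff, &doff);
        /* Seek to byte `coff` of the file, decompress frame `f` only,
         * then discard (2100 - doff) bytes of its output. */
        printf("frame %zu, compressed offset %llu, skip %llu bytes\n",
               f, (unsigned long long)coff,
               (unsigned long long)(2100 - doff));
        return 0;
    }

This is also where the HTTP Range idea upthread would plug in: with such an index you'd fetch only the byte range [coff, coff + csize) of the object.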

There do exist some compression schemes that retain seekability by design. Texture compression is a good example; the massively parallel nature of GPUs means that a fixed-rate scheme can be much more efficient than a variable-rate and thus inherently sequential scheme. There are also some hybrids that use two interconnected algorithms for CPU and GPU respectively. But such algorithms are not the norm.


>I'm confused why you expect most compression schemes to be necessarily seekable after all.

Wtf, you were the one who brought that up.


What? Rereading my comment just in case: I said most other compression schemes are "only seekable with an additional index", thus not seekable in general.


I never said compression schemes have to be seekable; you were the one arguing most other compression schemes are seekable.


Does this mean ECH works only with Cloudflare, since their example ECH contains an unencrypted outer-layer ClientHello?


No, it's an emerging standard. We are just pushing its adoption as fast as we can. Hence, we've rolled this out to all free customers.


And non-free customers can opt in to ECH via the dashboard.


Right now, basically yes. No other major public clouds seem to support ECH yet, and ECH basically only works in public clouds; it can't hide your IP address, so it only provides privacy if you share your IP address with lots of other tenants.


What is the overlap of people who are reading a blogpost about Cloudflare standards and people who need a metaphor to understand what compression is? You have 7 paragraphs of highly technical information then just in case, you need to explain how compression works? Just tell your reader you think they're a moron and save yourself the keystrokes.


After examining how scammers and phishers host their sites, I’ve realised that “private” for Cloudflare means protecting the privacy of criminals.

ECH makes it hard to block known scam sites at the network layer, for example.


Network layer blocking is almost never in the interest of the end user. It's typically used to block users from accessing sites they want to visit, like The Pirate Bay, or recently Russia Today and Sputnik News.

End users who want to protect themselves can easily install blacklists on their end. All major browsers support something like Google Safe Browsing out of the box, and these blacklists are more likely to be kept up-to-date than those of the average ISP.


Either it's easy to block sites or it isn't. There's no world in which it's easier for you to block scam sites than it is for others to block vital resources and information.


ECH is going to be huge for people in regressive countries. For example Iran.


Nah, they're just going to block the whole ECH handshake.

Idk about Iran, but Russia and China just block eSNI, QUIC and whatever their DPI firewalls can't really handle on the fly.


The idea is to make ECH too large a target for blocking it to be practical. If you block ECH, you end up blocking access to a large portion of the internet in that region. It's why some major browsers have chosen not to gracefully fall back to non-ECH handshakes upon connection failure.


Greetings, residents of Arstotzka! To access Arstotzkan government websites, please install this Ministry of Digits TLS root certificate on all your devices. Also, all new phones sold in Arstotzka must have the certificate preinstalled, starting from 2025.



I think the other poster was implying that the governments don’t care.


Disagree on this take. Blocking services does have an economic impact.

This, alongside people smuggling in Starlink, is making censorship useless.


Freedom of information is an existential threat to authoritarian states. There is no amount of money they're not willing to give up if it means they stay in power.

That said, it will not come to that. They'll just mandate spyware installation.


China blocks services all the time. I was one of the original 10 blocked by the Great Firewall of China.

And Starlink can be traced. It's only a matter of time before some people start getting arrested.


I’m not talking about China. China has well made internal alternatives to most western services.

Iran does not.


Yeah we shall see - we're monitoring closely


Many such countries already block traffic with ECH entirely. There are no technical solutions to a political problem.

I remember when you could just change your DNS provider to bypass censorship. Nowadays, browsers and OSes provide safe DNS by default, and censors have mostly switched to DPI-based methods. As this cat-and-mouse game continues, these governments will inevitably mandate spyware on every machine.

These privacy enhancements invented by Westerners only work for the threat model of Western citizens.


re: ECH

let the cat and mouse game between deep packet inspection (DPI) vendors and the rest of the encrypted internet continue. it'll be amusing to see what they come up with (inaccurate guessing-game ai/ml "statistical analysis" is about all they've got left, especially against the large umbrella that is cloudflare).

game on, grab your popcorn, it will be fun to watch.


There's a relatively simple and pain-free solution to legitimate DPI: blocking all requests that don't go through a proxy. Browsers will ignore some certificate restrictions if they detect manually installed TLS root certificates to make corporate networks work.

This approach won't work on apps like Facebook or Instagram, but I don't think there's a legitimate reason to permit-but-snoop on that sort of traffic anyway.


Passive DPI/web filtering is pretty much done at this point. There's no way to tell what domain you're connecting to with ECH without doing a MITM and breaking the PKI chain or adding private CAs everywhere.



> New standards for a faster and more private Internet

> Zstandard

I get "faster" but how does it make the internet "more private". The word "private" only shows up exactly once on that page, in the title.


I believe that the "more private" part is referencing the "Encrypted Client Hello (ECH)" section in the later part of the post.


It is about moving the trust.

> This means that whenever a user visits a website on Cloudflare that has ECH enabled, no one except for the user, Cloudflare, and the website owner will be able to determine which website was visited.

So you must use an entity which controls the DNS, and this entity makes the request onward to the actual website. Feels like just a worse VPN.


> It is about moving the trust.

Trust isn't being moved, though. Cloudflare could, by design, always see what website you were accessing. The difference is with ECH, there is one less party (someone listening in on your internet traffic) that can see which hostname you're accessing.


I was more comparing it to a VPN; my argument was poor. But at a high level, many people do not use VPNs, and these days they have a very negative impact on service UX anyway, so not a big point.


The title of something should reflect the content. This is an article about a new compression format, and thus the title should say that.


The first third of the article is indeed about that; maybe read the rest?


I read really, really far into the article.

The summary at the top made it look like the whole article was about compression.


>The word "private" only shows up exactly once on that page, in the title.

However, the word "privacy" shows up 10 times in the article.


They also talk about Encrypted Client Hello (ECH).


Let me just stress that the effect of Zstandard on individual end-user latency is a rounding error. No user will ever go: "That was a quick-loading web site. Must be Zstandard!" The effect is solely Cloudflare having to spend x% less bandwidth to deliver the content, saving on their network and server resources.


If it saves them money, great. That also means resources saved, and that also means it's better for the planet, thus better for humanity. I'm failing to see the disadvantage.


GP didn’t say it has a disadvantage. Just explaining what it means



