ECH - if I understand correctly, it's effective for sites hosted on big providers like Cloudflare, AWS, etc., but doesn't add much value for self-hosted domains or those on a dedicated server, since an observer would still see traffic going to that IP and be able to infer from it which domain the user's browser is talking to. I'm hoping someone can explain what I missed.
And while we're explaining things... ODoH (indirectly mentioned in the article via the Encrypted DNS link) comes with a big bold warning that it's based on the fundamental premise that the proxy and the target servers do not collude. When both are operated by the same company, how can you know they aren't colluding? Is there some mechanism in the protocol to help protect users from colluding servers?
> When both are operated by the same company, how can you know they aren't colluding?
You don't. At best the client can check domain names and IP addresses, but that's hardly a guarantee.
To solve that problem, you can combine multiple parties. For example, you can use https://odoh1.surfdomeinen.nl/proxy as a proxy (operated by SURF [1]) to use the Cloudflare servers for lookup.
I think for ODoH to work well, we need a variety of companies hosting forwarding services. That could be ISPs, Google/Microsoft/etc. or some kind of non-profit.
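To make that separation concrete, here is a rough sketch (mine, not from the article) of what the client side of an ODoH query through a third-party proxy looks like, following the RFC 9230 conventions: the client POSTs an HPKE-encrypted DNS message to the proxy and names the target in the targethost/targetpath query parameters, so the proxy sees your IP but not the query, and the target sees the query but not your IP. The target hostname and the placeholder request body below are assumptions for illustration only; a real client has to seal the query with the target's published HPKE key.

    /* Hedged sketch of an ODoH query via a proxy, using libcurl.
     * Build with: cc odoh_sketch.c -lcurl */
    #include <curl/curl.h>

    int main(void) {
        curl_global_init(CURL_GLOBAL_DEFAULT);
        CURL *curl = curl_easy_init();
        if (!curl) return 1;

        /* Proxy operated by SURF, target (assumed here) operated by Cloudflare,
         * so no single party sees both who you are and what you asked for. */
        const char *url = "https://odoh1.surfdomeinen.nl/proxy"
                          "?targethost=odoh.cloudflare-dns.com&targetpath=/dns-query";

        /* Placeholder: must really be a DNS query sealed with HPKE against the
         * target's public key, per RFC 9230. */
        static const unsigned char encrypted_query[] = { 0x00 };

        struct curl_slist *hdrs = NULL;
        hdrs = curl_slist_append(hdrs, "Content-Type: application/oblivious-dns-message");
        hdrs = curl_slist_append(hdrs, "Accept: application/oblivious-dns-message");

        curl_easy_setopt(curl, CURLOPT_URL, url);
        curl_easy_setopt(curl, CURLOPT_HTTPHEADER, hdrs);
        curl_easy_setopt(curl, CURLOPT_POSTFIELDS, (const char *)encrypted_query);
        curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE, (long)sizeof(encrypted_query));
        curl_easy_perform(curl);   /* the proxy relays the opaque body to the target */

        curl_slist_free_all(hdrs);
        curl_easy_cleanup(curl);
        curl_global_cleanup();
        return 0;
    }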
I don't know the implementation details, but it should be doable in a way that degrades back into encrypted DNS where at least you get rid of a MitM.
Someone else already mentioned that making sure that the 2 servers have different owners may help, but if people are after you it's probably not enough.
I'm thinking that maybe I'd like to avoid mentioning the server I'm interested in at all, and simply send a hash of it (you could truncate it to a prefix such that a handful of matches are found, but not too many).
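A minimal sketch of that prefix idea (purely hypothetical, in the spirit of the k-anonymity range queries some breach-checking APIs use): hash the name locally and send only the first few hex characters, so whoever answers can only narrow you down to the set of names sharing that prefix.

    /* Hypothetical prefix-query sketch: hash the name and keep a short prefix.
     * Build with: cc prefix_sketch.c -lcrypto */
    #include <stdio.h>
    #include <string.h>
    #include <openssl/sha.h>

    int main(void) {
        const char *name = "news.ycombinator.com";
        unsigned char digest[SHA256_DIGEST_LENGTH];
        SHA256((const unsigned char *)name, strlen(name), digest);

        /* Short enough that many unrelated names share it, long enough that the
         * answer set stays manageable; the cut-off needs tuning per deployment. */
        const int prefix_bytes = 3;  /* 6 hex characters */
        printf("query prefix: ");
        for (int i = 0; i < prefix_bytes; i++)
            printf("%02x", digest[i]);
        printf("\n");
        return 0;
    }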
Yes, that's correct about ECH. In general, there's no real way to conceal your browsing behavior if you are connecting to an IP address that isn't shared. So either you use ECH to something like Cloudflare or you connect to some proxy/VPN/etc. so that the local network can't see the final IP address.
> ECH - if I understand correctly it's effective for sites hosted on big providers like Cloudflare, AWS, etc, but doesn't add much value when it comes to self-hosted domains or those on a dedicated server
Yeah, and unfortunately it increases the moat such companies have. They can offer a privacy screen that smaller orgs just can't match.
"This means that whenever a user visits a website on Cloudflare that has ECH enabled, no one except for the user, Cloudflare, and the website owner will be able to determine which website was visited. Cloudflare is a big proponent of privacy for everyone and is excited about the prospects of bringing this technology to life.'
This isn't privacy. This is centralized snooping.
It's like Google's approach to third party cookies. Nobody other than Google can have tracking information.
The internet is moving towards a place where it might not be possible to self-host anything important without getting DDoS'd. Companies like Cloudflare provide a solution to this problem, but that also creates a crutch that means no effort is expended to solve the problem at the root, which means the day may come when you don't have any option left other than Cloudflare.
I think these are important issues and worth talking about.
Those issues are absolutely worth discussing, in a reasonable way.
Cloudflare isn't the bad actor perpetuating these DDoS attacks, and they aren't forcing website operators to use their services either.
They don't need to be a bad actor. They just need to be big enough and follow their incentives.
Companies aren't binary good or bad. They go through a lifecycle. Today's young and scrappy startup fighting for the people and the CEO making house calls is tomorrow's big tech with AI chat support.
It's worth noting where a company is in its lifecycle, and what the world is likely to look like if it continues to grow.
So now the government needs to compel a corporation to hand over some data, because they are no longer able to read it straight off the wire like they could before.
That sounds like a significant improvement to privacy.
People trafficking drugs into Australia were using a secure, encrypted messaging service developed by a private third party provider.
They eventually found out that the third party provider was in fact the Australian Federal Police, reading all their messages in clear and in real time.
The government only needs to compel a corporation if that corporation has an adversarial relationship with them.
We have tried the centralized model pushed by Cloudflare before; it was called the Minitel.
The latest Zstandard exposes several parameters which are useful for reducing time-to-first-byte latency in web compression. They make Zstandard cut the compressed data into smaller blocks, e.g. 4 KB, with the goal of fitting a compressed block within a small number of packets, so the browser can start to decompress without waiting for a full 128 KB block to be sent.
These parameters are described in the v1.5.6 release notes [0]. ZSTD_c_targetCBlockSize is the most notable, but ZSTD_c_maxBlockSize can also be used for a lower CPU cost but larger compressed size.
Are you using these features at Cloudflare? If you need any help using these, or have any questions, please open an issue on Zstandard's GitHub!
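For anyone who wants to try them, a minimal sketch of setting these parameters through the libzstd C API (assuming zstd >= v1.5.6; ZSTD_c_maxBlockSize may still require the advanced/experimental header, hence the ZSTD_STATIC_LINKING_ONLY define):

    /* Sketch: low-latency block parameters via libzstd (zstd >= 1.5.6 assumed).
     * Build with: cc zstd_blocks.c -lzstd */
    #define ZSTD_STATIC_LINKING_ONLY   /* for parameters still in the experimental API */
    #include <stdio.h>
    #include <string.h>
    #include <zstd.h>

    int main(void) {
        const char src[] = "example payload to compress";
        char dst[256];

        ZSTD_CCtx *cctx = ZSTD_createCCtx();
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, 3);
        /* Aim for ~4 KB compressed blocks so the browser can start decoding
         * after a few packets instead of waiting for a full 128 KB block. */
        ZSTD_CCtx_setParameter(cctx, ZSTD_c_targetCBlockSize, 4096);
        /* Alternative knob: cap the uncompressed block size instead, which is
         * cheaper on CPU but gives a slightly larger compressed size. */
        /* ZSTD_CCtx_setParameter(cctx, ZSTD_c_maxBlockSize, 4096); */

        size_t const csize = ZSTD_compress2(cctx, dst, sizeof(dst), src, sizeof(src));
        if (ZSTD_isError(csize)) {
            fprintf(stderr, "%s\n", ZSTD_getErrorName(csize));
            return 1;
        }
        printf("compressed %zu -> %zu bytes\n", sizeof(src), csize);
        ZSTD_freeCCtx(cctx);
        return 0;
    }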
> Zstandard’s branchless design is a crucial innovation that enhances CPU efficiency
Given how branchless algorithms are helping optimize not just network transport (compression) but even OS system libraries (no citation for this one, but I've heard), I really wish colleges would begin teaching this alongside DS/algorithms course material.
DPI systems in Turkey weren't even checking QUIC packets when I was there, let alone ECH. But, browsers usually start with TCP first to negotiate QUIC support, which prevented bypass of web blocks. If you could force your browser to establish connection directly using QUIC, you could bypass all the blocks. That was last year though. Not sure about the current situation.
They usually won't bother with blocking if the site owner just hosts the site under a different domain name. It could be like news.ycombinator1923.com or something. wink
The benchmark for Zstandard against Brotli seems to omit a key piece of information: the compression levels used for both algorithms, since both the compression ratio and the compression time depend on them. In fact this has been my long-standing suspicion about introducing Zstandard to the web standard, because lower compression levels for Brotli are not that slow, and it was never publicly stated whether improving the lower Brotli levels was deemed infeasible. Given that the Zstandard Content-Encoding was initially proposed by Meta, I'm not even sure they tried.
Given that we now have two algorithms strictly better than gzip, I also wonder about a hybrid scheme that starts with Zstandard but switches to Brotli once the compression time is no longer significant for a given request. We might even be able to cheaply convert an existing Zstandard stream into Brotli with some restrictions, as they are both really LZSS behind the scenes?
Meta drove the Zstandard content encoding, but Google drove the adoption of Zstandard in Chrome.
The faster Brotli levels could probably be made to match Zstandard’s compression speed. But we’ve invested a lot in optimizing these levels, so it would likely take significant investment to match. Google is also contributing to improving the speed of Zstandard compression.
A cheaper conversion from Zstandard to Brotli is possible, but I wouldn't really expect an improvement to compressed size. The encoding scheme impacts how efficient an LZ coding is, so for Brotli to beat Zstandard, it would likely want a different LZ than Zstandard uses. The same applies to a conversion from Brotli to Zstandard.
The conversion is meant to avoid the expensive backreference optimization when it has already been done once by Zstandard, because you can't prepend a Zstandard bitstream with a Brotli bitstream without turning one into the other. But well, I think such a hybrid scheme is hard to make when latency is at stake.
It means nothing. Countries always ask nicely first for a domain to be blocked for IPs from their countries. Companies like Cloudflare or Akamai can either honor the request, or find their IP range blocked (yes, including all the other serviced domains). They usually take the first option.
There are many more countries enforcing some limitations on the internet, and ECH will just turn passive DPI into active court orders, I believe. At least explicitness is better.
I don't think so? It's only seekable with an additional index [1], just like most other compression schemes. Having an explicit standard for indices is definitely a plus though.
Uh, I'm confused why you'd expect most compression schemes to be seekable in the first place.
I assume the seeking implies a significantly faster way to skip the first N bytes of the decompressed output without the actual decompression. "Significant" is here defined as a time complexity strictly less than linear time in the size of compressed or decompressed data in that skip, so for the RLE sequence like `10a90b`, seeking to the 50th byte should be able to avoid processing `10a` at all; reading `10a` but not actually emitting 10 copies of `a` doesn't really count.
Virtually every compression algorithm has two main parts, modelling and coding. Modelling either transforms the input into a more compressible form or estimates how likely a particular portion of the input is. Those results are then coded into one or more sets of symbols with different probabilities, resulting in a compressed bit sequence. Both parts can be made as complex as needed and will likely destroy any relation between input and output bytes. Many algorithms, including Zstandard, do come with some framing scheme to aid data verification, but such framing is generally not enough for actual seeking unless each frame declares both its uncompressed and compressed sizes and the frames can be decompressed independently of each other. That's what Zstandard's seekable format actually does: it records both sizes for every frame, with an implicit promise that a new frame is started every so often. (Zstandard frames are already defined to be independent.)
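To illustrate why both sizes matter, here's a rough sketch (not the official seekable-format API from contrib) that skips over whole frames of a multi-frame Zstandard buffer using only the stable libzstd helpers. It only works when every frame header declares its content size, and ZSTD_findFrameCompressedSize still has to walk each frame's block headers; the seekable format's index records both sizes up front precisely so that even this scan is unnecessary.

    /* Sketch: find the frame containing `target` decompressed bytes in, without
     * decompressing anything, for a buffer of independent Zstandard frames whose
     * headers declare their content size. Build with: cc seek_sketch.c -lzstd */
    #include <zstd.h>

    /* Returns the byte offset within `src` of the frame containing `target`,
     * and stores how far into that frame's output the target lies.
     * Returns (size_t)-1 if the buffer isn't seekable this way. */
    size_t find_frame(const void *src, size_t srcSize,
                      unsigned long long target,
                      unsigned long long *offset_in_frame)
    {
        const char *p = (const char *)src;
        size_t pos = 0;
        unsigned long long emitted = 0;   /* decompressed bytes before current frame */

        while (pos < srcSize) {
            unsigned long long content = ZSTD_getFrameContentSize(p + pos, srcSize - pos);
            size_t compressed = ZSTD_findFrameCompressedSize(p + pos, srcSize - pos);
            if (content == ZSTD_CONTENTSIZE_UNKNOWN ||
                content == ZSTD_CONTENTSIZE_ERROR ||
                ZSTD_isError(compressed))
                return (size_t)-1;         /* header lacks sizes: can't seek like this */
            if (emitted + content > target) {
                *offset_in_frame = target - emitted;
                return pos;                /* only this one frame needs decompressing */
            }
            emitted += content;            /* skip the whole frame */
            pos += compressed;             /* walks block headers, no decompression */
        }
        return (size_t)-1;                 /* target lies past the end of the data */
    }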
There do exist some compression schemes that retain seekability by design. Texture compression is a good example; the massively parallel nature of GPUs means that a fixed-rate scheme can be much more efficient than a variable-rate, and thus inherently sequential, scheme. There are also some hybrids that use two interconnected algorithms for the CPU and GPU respectively. But such algorithms are not the norm.
Right now, basically yes. No other major public clouds seem to support ECH yet, and ECH basically only works in public clouds; it can't hide your IP address, so it only provides privacy if you share your IP address with lots of other tenants.
What is the overlap of people who are reading a blogpost about Cloudflare standards and people who need a metaphor to understand what compression is? You have 7 paragraphs of highly technical information then just in case, you need to explain how compression works? Just tell your reader you think they're a moron and save yourself the keystrokes.
Network layer blocking is almost never in the interest of the end user. It's typically used to block users from accessing sites they want to visit, like The Pirate Bay, or recently Russia Today and Sputnik News.
End users who want to protect themselves can easily install blacklists on their end. All major browsers support something like Google Safe Browsing out of the box, and these blacklists are more likely to be kept up-to-date than those of the average ISP.
Either it's easy to block sites or it isn't. There's no world in which it's easier for you to block scam sites than it is for others to block vital resources and information.
The idea is to make ECH too large of a target for blocking it to be practical. If you block ECH you end up blocking access to a large portion of the internet in that region. It's why some major browsers have chosen not to gracefully fall back to non-ECH handshakes upon connection failure.
Greetings, residents of Arstotzka! To access Arstotzkan government websites, please install this Ministry of Digits TLS root certificate on all your devices. Also, all new phones sold in Arstotzka must have the certificate preinstalled, starting from 2025.
Freedom of information is an existential threat to authoritarian states. There is no amount of money they're not willing to give up if it means they stay in power.
That said, it will not come to that. They'll just mandate spyware installation.
Many such countries already block traffic with ECH entirely. There are no technical solutions to a political problem.
I remember when you could just change your DNS provider to bypass censorship. Nowadays, browsers and OSes provide secure DNS by default, and so censors have mostly switched to DPI-based methods. As this cat-and-mouse game continues, these governments will inevitably mandate spyware on every machine.
These privacy enhancements invented by Westerners only work for the Western citizen's threat model.
let the cat and mouse game between deep packet inspection (DPI) vendors and the rest of the encrypted internet continue. it’ll be amusing to see what they come up with (inaccurate guessing game ai/ml “statistical analysis” is about all they’ve got left, especially against the large umbrella that is cloudflare).
game on, grab your popcorn, it will be fun to watch.
There's a relatively simple and pain-free solution to legitimate DPI: blocking all requests that don't go through a proxy. Browsers will ignore some certificate restrictions if they detect manually installed TLS root certificates to make corporate networks work.
This approach won't work on apps like Facebook or Instagram, but I don't think there's a legitimate reason to permit-but-snoop on that sort of traffic anyway.
Passive DPI/web filtering is pretty much done at this point. There's no way to tell what domain you're connecting to with ECH without doing a MITM and breaking the PKI chain or adding private CAs everywhere.
> This means that whenever a user visits a website on Cloudflare that has ECH enabled, no one except for the user, Cloudflare, and the website owner will be able to determine which website was visited.
So you must use an entity which controls the DNS, and this entity then makes the request onward to the actual website. Feels like just a worse VPN.
Trust isn't being moved, though. Cloudflare could, by design, always see what website you were accessing. The difference is with ECH, there is one less party (someone listening in on your internet traffic) that can see which hostname you're accessing.
I was more comparing it to a VPN. My argument was poor. But at a high level, many people do not use VPNs, and these days they have a very negative impact on service UX anyway, so it's not a big point.
Let me just stress that the effect of Zstandard on individual end-user latency is a rounding error. No user will ever go: “That was a quick loading web site. Must be Zstandard!”. The effect is solely Cloudflare having to spend x% less bandwidth to deliver the content, saving on their network and server resources.
If it saves them money, great. That also means resources saved, and that also means it's better for the planet, thus better for humanity. I'm failing to see the disadvantage.