An interesting idea, but QUIC / HTTP/3 also avoids the extra RTT for TLS negotiation by bundling it with the connection handshake, and in a less janky way than this. I don't see a good reason for a server or browser developer to implement this when QUIC exists.
TLS is used for other protocols, e.g., SMTPS (SMTP + TLS). But there's an extra DNS query for this case, and I don't think TLS setup time is a significant cause of delays. So I don't know how useful this is.
The latency hit of the extra round trip will definitely be felt by end users if the endpoint is far enough away (e.g., ~100-200 ms). You can mitigate the initial setup in other ways, though, like the CDN approach: terminate TLS much closer to the client with a proxy and use a warm, pre-established backhaul connection to the origin.
Less round trips are always good, though, without any extra stuff to put in place.
Having debugged stuff for someone crazy who wanted less than 50 ms global latency for a private search engine, I can tell you TLS negotiation adds significant time the first time you initialize the connection.
I've designed my own private CDN, and unless you have edges near each geo, you're not getting that kind of latency. And even after TLS negotiation you have no idea what the MSS on a link along the way will be, causing fragmentation and retransmits. I've made do with some shenanigans with Noise and pre-shared certs, and have tried to keep payload sizes as small as possible.
Hah, gotcha. Hope you got paid well for that. I do it for my own pleasure, but it's a lot of work, as the internet just isn't designed for low-latency interconnect.
Emphasis on wanted. And it was quite attainable depending on geo location: in some locations 70 ms was the best we could do, but for most of the US east coast we were able to get around 40 ms.
Not sure about real-world statistics, but the current IETF position is that SMTP STARTTLS for mail submission (not transport) is to be phased out in favour of “implicit” SMTP-over-TLS with no cleartext portion, due in part to the former being an implementation minefield[1].
AFAIK, kTLS and hardware TLS offload don't solve the latency problem anyway. Those only handle the AEAD and record encapsulation/decapsulation in the critical path, where maximizing throughput is the concern. Control messages are not handled, so session establishment (client hello, cipher negotiation, key exchange, etc.) is still done entirely in userspace, and the handshake is where the latency issues arise.
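As a rough picture of that split (the setsockopt calls are the real Linux kTLS interface; the diagram itself is just illustrative):

```text
userspace:   ClientHello <-> ServerHello ... key exchange, cert verification
                 |            (handshake: NOT offloaded, full latency cost)
                 v   setsockopt(fd, SOL_TCP, TCP_ULP, "tls")
                     setsockopt(fd, SOL_TLS, TLS_TX/TLS_RX, &keys, ...)
kernel/NIC:  AEAD seal/open + TLS record framing of application data
                              (data path: offloaded, the throughput win)
```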
Oh yeah, QUIC is an improvement in terms of latency and even throughput over TCP/TLS. The claim that "QUIC is horribly inefficient" centers around CPU utilization and the cost of delivering each byte. That's where hardware offload shines but it doesn't exist for QUIC yet.
The only reason QUIC delivers throughput improvements over TCP is that it uses the same congestion control as BBR, so occasional packet loss doesn't kneecap the connection. If you use BBR TCP (or RACK TCP) on FreeBSD, you'll see the same improvement in throughput vs. older TCP congestion control.
The same server that will do 375Gb/s at close to 50% idle will maybe deliver 70-90Gb/s of QUIC with the CPU maxed.
TLS+TCP delivers that performance with a lot of optimizations that just don't exist for QUIC:
- inline TLS offload (Mellanox CX6DX)
- async sendfile
-- note, the above 2 mean that the kernel never even maps data from a file being sent to a client into memory, much less copies it to/from userspace like with QUIC. This cuts memory bandwidth to roughly 1/4 of what it would be with a traditional read/encrypt/write to socket server.
- TCP segmentation offload
- TCP & IP checksum offload
- TCP large receive offload
Some of these optimizations are slowly becoming available, but until all are present, QUIC will cost 2x - 4x as much to serve as TCP for a CDN workload.
There are a couple of "previously connected" bits with QUIC:
- The very first connection to a site is usually HTTP/2, which costs an additional RTT compared to QUIC, as the browser doesn't yet know whether the server supports QUIC. In the response, the server can advertise QUIC support with the Alt-Svc header. This support flag can also be present in an HTTPS DNS record for the domain, but that isn't queried by all browsers yet. Once the browser is aware of support, future connections to the same server will default to QUIC, saving an RTT.
- Once connected to a QUIC server, the encryption keys can be cached for future connections, allowing the client to send data with the initial connection request (so-called 0-RTT). This is only safe for idempotent requests, though, as this part of the protocol could be replayed by an attacker.
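Concretely, the two advertisement paths look roughly like this (example.com, the port, and the TTLs are placeholders):

```text
# 1. Alt-Svc response header on the first (HTTP/2) response,
#    telling the client h3 is available on UDP 443 for 86400 seconds:
Alt-Svc: h3=":443"; ma=86400

# 2. HTTPS DNS record, discoverable before first contact:
example.com. 3600 IN HTTPS 1 . alpn="h2,h3"
```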
You're right that HTTP/3 requires Alt-Svc at the moment. QUIC itself doesn't require a pre-established connection (1-RTT), which is notable for non-HTTP/3 protocols and WebTransport.
HTTP/3 seems to offer all these benefits already... And seems to be simpler and more compatible... And doesn't require a new DNS field which will surely trip up plenty of middleboxes...
Without taking a position on TurboTLS versus H3...
There are actually two types of middlebox problems:
1. DNS resolvers which don't carry new record types.
2. Middleboxes which don't properly handle UDP-based protocols
(2) applies to both H3 and to TurboTLS, and in both cases you need some kind of fallback in case things fail.
(1) applies to TurboTLS as specified. However, it's worth noting that H3 also has a DNS-based mechanism for advertising support via the HTTPS record (which is also used for ECH). That said, you can also advertise H3 support via Alt-Svc, so presumably you could do the same with TurboTLS.
In general, any new transport like H3 or TurboTLS has to be offered on a best-effort basis with a fallback, otherwise you'll have a lot of hard failures.
> However, you can also advertise H3 support via Alt-Svc, so presumably you could do the same with TurboTLS.
Alt-Svc is an HTTP header; by the time you get HTTP you already have TLS set up. In an Alt-Svc workflow you generally advertise h2 via TLS ALPN, then use Alt-Svc to advertise h3 capability in the headers of the first response; the client then establishes an h3 connection and closes the h2 connection once the h3 one is ready (and the h2 connection has no queued requests). At least that's my understanding.
Yes, that's correct. Basically the only good way to have a "first contact" fast-track setup like this is by priming in DNS. My point is that the situation is the same for TurboTLS and QUIC/H3 in this respect.
I just meant that QUIC/H3 is beneficial even after you already have a connection, so upgrading an existing connection is a valid strategy. TurboTLS is only beneficial for establishing the connection, so it needs to have prior knowledge about the support to be useful at all.
HTTP/3 is based on QUIC, and these perf properties are due to QUIC, not H3.
With that said, the unusual performance dynamics of the Web make connection setup latency particularly important by contrast to, for instance, SMTP, where people don't really get bent out of shape if a message takes another 200ms to be delivered.
Not really, provided some data is already known, which is the case with TLS as well (either you know the server's public key, or you trust a set of CAs).
"Secure communication in a single round-trip" implies securely transmitting to a specific audience without any correspondence beforehand and securely receiving information from them afterwards - which seems impossible - probably because it is.
If you relax the constraints to allow for shared state beforehand (which could only arise from prior communication-trips of some sort), you're just at RSA: cool to be sure, but coming up on 50 years old at this point.
As you say, if you want to have 0-RTT data, you must have some prior information about the server. This information can come in two main forms.
1. Have the client know the server's public key in advance. You can then do an RSA or ephemeral-static key exchange, as in gQUIC, OPTLS, or TLS Fasttrack.
2. Have the client and the server do an initial handshake and then reuse the symmetric key for future connections, as in TLS 1.3 and IETF QUIC (which uses TLS 1.3).
The public-key version has a number of superficially compelling properties, in particular that you can publish the server's data somewhere like DNS ("0-RTT Priming") and thus have 0-RTT on the very first connection. By contrast, with the symmetric version you have to have had a connection first (the public-key mode works in that setting too). However, making the public-key version work in practice has a number of challenges around authentication, anti-replay, anti-amplification, etc. Initially TLS 1.3 had both of these modes, but we ultimately removed the public-key-based one in favor of the symmetric-key-based mode.
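A rough sketch of the two flows described above (illustrative, not wire-accurate):

```text
Symmetric-key mode (TLS 1.3 / IETF QUIC):
  conn 1:  full 1-RTT handshake ........ server issues a session ticket (PSK)
  conn 2:  ClientHello + ticket + early data encrypted under the PSK -> 0-RTT

Public-key mode (gQUIC / OPTLS style, "0-RTT priming"):
  setup:   client learns the server's semi-static public key (e.g. via DNS)
  conn 1:  ClientHello + early data under static-ephemeral DH ....... -> 0-RTT
           (with the authentication / anti-replay / anti-amplification
            caveats noted above)
```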
I’m not sure what the relation with RSA is, but you’re right that it’s simple. And that’s what Noise has been doing for years. There isn’t much else to look at here; unless you specifically need TLS, you should be using Noise.
TLS 1.3 and QUIC both support 0-RTT modes as well (as noted in the original paper). Note that anything that runs over TCP of course first has to absorb the TCP round-trip (modulo TFO) whatever crypto it uses. That's why QUIC uses UDP, and the motivation for the use of UDP in TurboTLS.
1. See the exceptions section. Less is preferred in this construction.
2. Please do not misconstrue the opinion of one writer who lived 200 years ago as a proper grammar rule (see the history section). Worse, please do not be dogmatic about it when it has nothing to do with the topic. It is, in essence, the equivalent of an ad hominem attack.
Interesting. I wasn't aware of that. I had a colleague who constantly "corrected" us (with some humour), and even long after he left the team we all wince when we see/hear "less" where it should be (or so we believed) "fewer". I preferred it before, when I blindly said "less" and didn't always hear a voice in my head commenting on everyone else's usage.