As someone who needed high throughput and looked to QUIC because it gives the application control over its buffers, I recommend against it at this time. Performance varies a lot between implementations, and the API is quite different from the socket API you'd use with TCP.
I don't think QUIC is bad, or even overengineered, really. It delivers useful features, in theory, that are quite well designed for the modern web-centric world. Instead I came away with a much greater appreciation for TCP, and how well it works everywhere: commodity hardware, middleboxes, autotuning, NIC offloading, and so on. Never underestimate battle-tested tech.
In that sense, TCP_NODELAY being off by default is an exception to the rule that TCP performs well out of the box (Go already enables it by default). As such, I think it's time to change the default. Not buffering your writes correctly is a programming error, imo, and can be patched in the application.
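To make that concrete, here is a minimal Go sketch (the address 127.0.0.1:9000 is just a placeholder): Go's net package already turns on TCP_NODELAY for new connections, so small writes go out immediately, and if you want coalescing you do it yourself in the application rather than leaning on Nagle.

    package main

    import (
        "bufio"
        "log"
        "net"
    )

    func main() {
        // Placeholder address; point this at whatever TCP service you're testing against.
        conn, err := net.Dial("tcp", "127.0.0.1:9000")
        if err != nil {
            log.Fatal(err)
        }
        defer conn.Close()

        // Redundant but explicit: Go already sets TCP_NODELAY on new TCP
        // connections, so Nagle's algorithm is off by default.
        if tcp, ok := conn.(*net.TCPConn); ok {
            tcp.SetNoDelay(true)
        }

        // "Buffering your writes correctly": coalesce small writes in the
        // application with bufio and flush once, instead of relying on the
        // kernel to batch them for you.
        w := bufio.NewWriter(conn)
        for i := 0; i < 100; i++ {
            w.WriteString("small message\n")
        }
        if err := w.Flush(); err != nil {
            log.Fatal(err)
        }
    }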
Yes. That issue is confusingly titled, but consists solely of a quote from the author of the code talking about what they were thinking at the time they did it.
The specific issues that this article discusses (e.g., Nagle's algorithm) will be present in most packet-switched transport protocols, especially ones that rely on acknowledgements for reliability. The QUIC RFC mentions this: https://datatracker.ietf.org/doc/html/rfc9000#section-13
Packet overhead, ack frequency, etc. are the tip of the iceberg, though. QUIC addresses some of the biggest issues with TCP, such as head-of-line blocking, but still shares the more finicky ones, such as different flow and congestion control algorithms interacting poorly.
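For anyone who hasn't hit it, the classic failure mode the article is about is the write-write-read pattern on TCP: with Nagle's algorithm on and the peer delaying ACKs, the second small write can sit in the kernel for tens to hundreds of milliseconds. A rough Go sketch of the pattern (the address is a placeholder, and Nagle is re-enabled explicitly because Go turns it off by default):

    package main

    import (
        "log"
        "net"
    )

    func main() {
        conn, err := net.Dial("tcp", "example.internal:9000") // placeholder address
        if err != nil {
            log.Fatal(err)
        }
        defer conn.Close()

        // Re-enable Nagle's algorithm to reproduce the stall;
        // Go sets TCP_NODELAY (Nagle off) by default.
        conn.(*net.TCPConn).SetNoDelay(false)

        // Write-write-read: the first small write is sent immediately, but Nagle
        // holds the second one back until the first is ACKed. If the server waits
        // for the full request before replying, and its delayed-ACK timer is what
        // finally acknowledges the first segment, this round trip stalls.
        conn.Write([]byte("HEADER "))     // small write #1
        conn.Write([]byte("tiny body\n")) // small write #2, gets delayed

        buf := make([]byte, 1024)
        if _, err := conn.Read(buf); err != nil {
            log.Fatal(err)
        }
    }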
QUIC is mostly used between a client and the data center, not between two machines inside the data center. TCP is the better choice once inside the data center.
Reasons:
Security Updates
Phones run old kernels and new apps, so it makes a lot of sense to put something that needs frequent updates, like the network stack, into user space, and QUIC does well here.
Data center computers run older apps on newer kernels, so it makes sense to keep the network stack in the kernel, where updates and operational tweaks can happen independently of the app release cycle.
Encryption Overhead
The overhead of TLS is not always needed inside a data center, whereas it is always needed on a phone.
Head of Line Blocking
Super important on a throttled or flaky phone connection, not a big deal when all of your data center servers have 10G connections to everything else (see the sketch after this list for what independent QUIC streams look like).
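For reference, here is a rough sketch of multiplexed streams using the quic-go package. The exact signatures vary between quic-go releases, and the address and ALPN string are placeholders, so treat this as illustrative rather than authoritative:

    package main

    import (
        "context"
        "crypto/tls"
        "fmt"
        "log"

        quic "github.com/quic-go/quic-go"
    )

    func main() {
        tlsConf := &tls.Config{
            InsecureSkipVerify: true,                      // demo only
            NextProtos:         []string{"example-proto"}, // placeholder ALPN
        }

        // One QUIC connection: one UDP 4-tuple, one handshake.
        conn, err := quic.DialAddr(context.Background(), "server.example:4242", tlsConf, nil)
        if err != nil {
            log.Fatal(err)
        }

        // Several independent streams on that connection. A packet lost on one
        // stream only stalls that stream; the others keep delivering, which is
        // the head-of-line-blocking advantage over multiplexing everything onto
        // a single TCP connection.
        for i := 0; i < 3; i++ {
            stream, err := conn.OpenStreamSync(context.Background())
            if err != nil {
                log.Fatal(err)
            }
            fmt.Fprintf(stream, "request %d\n", i)
            stream.Close()
        }
    }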
In my opinion TCP is a battle-hardened technology that just works even when things go bad. That it contains a setting with perhaps a poor default is a small thing compared to its good record for stability in most situations. It's also comforting to know I can tweak kernel parameters if I need something special for my particular use case.
Many performance-sensitive in-datacenter applications have moved away from TCP to reliable datagram protocols. Here's what that looks like at AWS: https://ieeexplore.ieee.org/document/9167399