I was in the IETF meeting. It was actually a very civil discussion, and I think almost everyone in the room could see both sides on the issue. On the one hand, we've seen increasing problems with middleboxes making assumptions about end-to-end protocols, which then makes it very hard to deploy new end-to-end functionality. The problem is real; we measured this effect in this paper:
http://www0.cs.ucl.ac.uk/staff/M.Handley/papers/extend-tcp.p...
Indeed, one of the changes the IETF group has made since taking on QUIC is to also encrypt the QUIC sequence numbers, so middleboxes can't play silly games by observing them (they were already integrity protected).
However, the flip side is that network operators do use observations of network round trip time gained from passive observation of traffic so as to discover if traffic is seeing excessive queuing somewhere. An inability to do this with QUIC may either lead to worse network behaviour, or to them using other methods to gain insight. If you're in a location where you can observe both directions of a flow, you can easily do this by, for example, delaying a bunch of packets by 200ms, then letting them go. When you see a burst of (encrypted) ack packets return, you can deduce the RTT from your observation point to the destination. I'd really like to avoid operators thinking they need to do such hacks just to measure RTT, and the spin bit lets a passive observer see the RTT.
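For the curious, here is a minimal sketch (my own illustration, not taken from any draft or tool) of what a passive observer could do with the spin bit; it assumes a single observation point watching one direction of a flow with continuous traffic in both directions:

    import time

    class SpinBitObserver:
        # Watches the spin bit in one direction of a QUIC flow. With
        # continuous traffic, the bit flips roughly once per round trip,
        # so the time between observed flips approximates the RTT. The
        # estimate degrades when traffic pauses or packets are lost or
        # reordered.

        def __init__(self):
            self.last_bit = None
            self.last_flip = None

        def observe(self, spin_bit, now=None):
            # Feed in the spin bit of every packet seen; returns an RTT
            # sample (in seconds) each time the bit flips.
            now = time.monotonic() if now is None else now
            sample = None
            if self.last_bit is not None and spin_bit != self.last_bit:
                if self.last_flip is not None:
                    sample = now - self.last_flip
                self.last_flip = now
            self.last_bit = spin_bit
            return sample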
In the end, I was not convinced entirely by either argument, and neither was the consensus in the room. It's not a clear-cut decision; there are reasonable arguments either way. Such is engineering.
You may have done this already, but I encourage you to copy/paste this as a comment on the article at apnic.net so it has more visibility. The opening line about "It was actually a very civil discussion, and I think almost everyone in the room could see both sides on the issue." needs to be shared.
The thing I don't understand is why these network operators care about RTT outside their network. Shouldn't they only care about getting packets through their own network as quickly as possible, which you don't need this spin bit for?
The global optimum might not be the combination of locally optimal routes. The shortest route through a network segment to a peer may end up being globally slower than a locally slower route to another peer which can handle the traffic faster.
That's fine. If global optimization turns out to be significant, we can do it at the edge with overlay networks. No need for network operators to be involved.
Realistically, are network operators going to add artificial delay to packets going through their network? If they don't get their bit, and they start slowing down communication, it would seem like market correction would solve it.
Assume there are a minimum of three network operators involved in every flow: source, destination, and transport. It is highly likely that either the source or destination operator will have opportunity and motive to do said inspection/detection.
Sarcastically: who will even notice a 200ms hiccup, nowadays, given the enormous waste of time that everybody shoves into their Javascript ad networks.
Personally, I don't want them to expose anything. The RTT on my packets is yet another piece of metadata capable of being abused.
In addition, I hope that the whole QUIC thing is a nice reboot of actually getting end-to-end connectivity back on the Internet so we can get some protocol experimentation moving again.
Sounds like AMP. Improve performance to fit more bloat instead of not adding superfluous elements in the first place. What else would you expect from an advertising company?
Aren't network operators already adding artificial delay? I pay for X Mbps download speed, and my ISP artificially limits my download speed to that, even though their network is perfectly capable of providing me with more.
No. Bandwidth limits and delay aren't the same thing.
CPE (the WiFi router or similar in your home) often adds delay because a few megabytes of RAM is cheap and the people who built it don't understand what they're doing. This is called "buffer bloat". But inside the network core this rarely comes up.
You need buffers to maintain high bandwidth. There is no way around that. The problem is lack of AQM in end user routers - where a single high bandwidth flow can fill up and hog those buffers, disrupting latency sensitive flows. We're slowly seeing some adoption of things like fq_codel, but it's not perfect since the user has to go and manually enter their upload/download speeds (which again are not easy to determine, especially for ISPs with "boost" limiting). Ideally home routers would dynamically adjust based on observed latencies.
Codel is knobless, there literally aren't parameters you _can_ set let alone ones you need to set according to your "upload/download speeds". So, I have no idea what you're tinkering with that you think needs to know "upload/download speeds" or why, but it's nothing to do with Codel.
fq_codel alone doesn't solve buffer bloat. There are still huge buffers upstream. You need to combine it with local rate limiting so you can control the buffer in your local home router. fq_codel/cake work by looking at the time a packet spent in a local queue. If you don't rate limit locally, then your local queue is always empty and everything queues upstream. Modern routers with things like "Dynamic QOS" that use fq_codel all require providing downstream/upstream values. This is way simpler than more traditional QoS, but it's still a barrier to widespread adoption.
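To illustrate the "time a packet spent in a local queue" point, here is a heavily simplified, illustrative sketch of CoDel-style dequeue logic (the constants are CoDel's published defaults; the drop pacing and the per-flow queues of fq_codel are omitted). If the bottleneck queue builds upstream instead of in this box, sojourn times here stay near zero and the algorithm never engages, which is exactly why the local rate limit matters:

    import time
    from collections import deque

    TARGET = 0.005     # 5 ms sojourn-time target (CoDel default)
    INTERVAL = 0.100   # 100 ms observation window (CoDel default)

    class SimplifiedCoDel:
        def __init__(self):
            self.q = deque()              # holds (enqueue_time, packet)
            self.first_above_time = None  # when sojourn first exceeded TARGET

        def enqueue(self, packet):
            self.q.append((time.monotonic(), packet))

        def dequeue(self):
            while self.q:
                enq_time, packet = self.q.popleft()
                sojourn = time.monotonic() - enq_time
                if sojourn < TARGET:
                    self.first_above_time = None   # queue is draining fine
                    return packet
                if self.first_above_time is None:
                    self.first_above_time = time.monotonic()
                    return packet
                if time.monotonic() - self.first_above_time < INTERVAL:
                    return packet
                # Sojourn has stayed above TARGET for a full INTERVAL:
                # drop this packet and look at the next one. (Real CoDel
                # paces drops at INTERVAL/sqrt(count) rather than dropping
                # everything that follows.)
            return None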
The only _sensible_ thing you can do if the transmitter won't stop is to ignore them, and thus drop packets, yes.
Queueing them up instead makes some artificial benchmark numbers look good but is a horrible end user experience, so you should never do this, but lots of crap home WiFi type gear does.
So, as I said, bandwidth limits and delay are different. The canonical "station wagon full of tapes" is illustrative: it has _tremendous_ bandwidth but _enormous_ delays. In contrast, a mid-century POTS telephone call from London to Glasgow has almost no delay (barely worse than the speed of light), but bandwidth is tightly constrained.
I would far rather my udp packet is delayed by a millisecond than dropped.
I have an RTP stream running from India to Europe at the moment, about 3000 packets per second, so 0.33ms between packets. Typical maximum interpacket delay is under 1.5ms; looking at the last 400,000 seconds of logs, 200k are 1-2ms, 170k are 0-1ms, and about 4-5k on the 2-3ms gap, 3-4ms gap, etc. In less than 1% of cases does the interpacket delay increase past 10ms.
Standard SMPTE FEC doesn't allow more than 20 columns of error correction; at 30mbit that's about 7ms of drop, at the best of times (assuming the required FEC packets aren't lost as well). At low bitrates it works better, and the majority of our international vision circuits rely on single streams with FEC. We had one ISP in Ukraine with an intermittent fault where, regardless of the bitrate, they would occasionally drop 170ms of traffic.
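For anyone checking that 7ms figure, a rough back-of-the-envelope version, assuming standard 1316-byte RTP payloads (7 x 188-byte TS packets) and column FEC that can recover a burst of up to one lost packet per column:

    payload_bits = 1316 * 8          # 7 x 188-byte TS packets per RTP packet
    stream_bps = 30 * 1_000_000      # 30 Mbit/s stream
    max_columns = 20                 # the 20-column limit mentioned above

    packets_per_second = stream_bps / payload_bits        # ~2850 pkt/s
    burst_tolerance_ms = 1000 * max_columns / packets_per_second
    print(f"recoverable burst: ~{burst_tolerance_ms:.1f} ms")   # ~7.0 ms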
I currently have a difficult provision that for a variety of reasons I can't use ARQ on. To keep the service working I have 4 streams going, over two routes, with timeshifting on the streams to cope with route-change outages that tend to sit in the 20ms range. FEC is meaningless at these bitrates.
RTP is fine for delay and re-orders, but it doesn't cope with drops. I was at a manufacturer's earlier this week and said that I've experienced dual streaming skew of over 250ms (we had one circuit presumably reroute via the US), and I laugh at their 150ms buffer. Dual streaming can still fail when you have both streams running on the same submarine cable though. Trust me, intercontinental low latency interoperable broadcast IP on a budget isn't trivial.
If you are running over UDP, that is so that packets get dropped while preserving the overall "stream". That is actually the purpose of the design of RTP over UDP.
You realise that streams can't cope if packets are lost? How badly they are affected depends on how many packets are lost and which packets they are; in some cases a single lost packet can cause actual on-air glitches.
The need to protect against this type of problem is why Forward Error Correction is included in almost all the codecs.
Moreover, since your specific use case is not interactive conferencing but IPTV, there would be no problem in increasing the FEC ratio even further, at the cost of a small additional decoding latency.
Standard SMPTE 2022-1 FEC does not cope with real world network problems, that's why we have things like 2022-7, but even then that struggles in the real world.
And yes, this use case is interactive conferencing, where we aim to keep round trip delay from Europe to Australia down below 1 second.
Bufferbloat is potentially almost unlimited: assuming that the people who built it are idiots (a safe assumption for most consumer gear), it basically just depends on how much they were willing to spend on RAM.
For example, let's say we can move 10Mbps, and we've decided to use 10 megabytes of buffers to make our new WiFi router super-duper fast. Do a big download and the buffer fills with ten megabytes of data; that's eight whole seconds of transmission, so now the latency of packets is eight seconds, which is 8000 times larger than your "I assume less then 1ms".
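The arithmetic, with the same assumed figures, for anyone who wants to plug in their own numbers:

    buffer_bytes = 10 * 1_000_000    # 10 MB of buffer in the router
    link_bps = 10 * 1_000_000        # 10 Mbps bottleneck link

    delay_s = buffer_bytes * 8 / link_bps
    print(f"worst-case added latency: {delay_s:.0f} s")   # 8 s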
Back around 2006, I had a RAZR that I figured out how to use to tether my computer onto Verizon’s 1xRTT service. I’m not sure which part of the system was to blame, but something had massive buffers and a strong aversion to dropping packets. If the connection got saturated I could easily see ping times of two minutes or more.
Why would a single big download fill the buffers? Router → LAN is typically an order of magnitude more bandwidth than internet → router, so shouldn't the buffer be emptied faster than filled?
The bottleneck is at your ISP's CMTS or DSLAM and your modem. e.g. The DSLAM has 1 Gbps in and only 40 Mbps down the line to your VDSL modem. Or your cable modem has access to 600 Mbps of capacity but your plan is only 100 Mbps so the modem limits. So there's a quick stepdown: 1 Gbps, 600 Mbps, 100 Mbps.
For downloads it's buffers in your ISP's hardware that matter. For uploads it's your router's egress buffer.
e.g. You are syncing gigabytes to Dropbox. A poorly designed router will continue to accept packets far past upstream capacity. Now that there's 2000 ms of bulk traffic in the router's queue, any real-time traffic has to wait a minimum of 2 seconds before getting out.
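As a rough illustration of how little data that takes (the 5 Mbps upload rate here is an assumed figure, not one from the comment above):

    upload_bps = 5 * 1_000_000        # assumed 5 Mbps upstream link
    queue_delay_s = 2.0               # 2000 ms of queued bulk traffic

    queued_bytes = queue_delay_s * upload_bps / 8
    print(f"{queued_bytes / 1e6:.2f} MB sitting in the queue")   # 1.25 MB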
Depends on the router, and the load. If you have a 100mbit uplink and 4x1 gig on the downlink side, you could easily have 10mbit of packets arrive in 3ms (70 packets per millisecond), but would take 30ms to send those packets. You can either
1) Drop -- despite total traffic being only 10mbit a second
2) Queue -- introducing a delay of 30ms.
In reality you'd put latency-critical applications (VoIP etc.) at the top of the queue so those packets get transmitted without the delay, and your Facebook packets get delayed by 31ms rather than 30ms.
As much as 2000 ms when saturating the downlink with a long file transfer. Web browsing stops working and nothing loads in as every little resource request takes two seconds.
I've no idea what techniques will actually be applied by operators - I never expected middleboxes to go to the extremes they currently do either. But the technique I mentioned, of briefly delaying a short burst of packets, if done only occasionally, isn't really going to be noticed during a large transfer. I'm not claiming it's a good idea though!
Large transfers wouldn't be the place where the effect would be noticeable; it would be JSON payloads that can fit in a few packets. If operators did this seldom enough not to be noticed, why do they need the bit at all? It feels like a blank (albeit single digit) check.
To me, excessive queueing seems indicative of network operators making assumptions about traffic and their solutions ultimately lead to worse outcomes, buffer bloat etc. Isn't QUIC meant to prevent the network from making these assumptions, forcing a more agnostic network, and letting QUIC manage congestion, latency, etc?
I think the point is that network operators want to use QUIC to measure network segments and make routing decisions. They are not necessarily trying to improve QUIC performance.
Middleboxes have been tampering with packets and headers for way too long. It's most definitely time to move back control to the two ends of the connection, to avoid fossilisation and the same mistakes made in the past. IPv6 and QUIC are a cardinal step in this direction, and we cannot afford to let myopic decisions screw up the opportunity to fix what has been broken for so long.
There are still environments where active policies applied by network operators to manage traffic flows are _required_ for users to make any productive use of the network ... I used to work on research ships: 50 people at sea for months at a time, with satellite internet of only 256kbps (and 800ms latency).
It was abysmal to watch as, over time, everyone became completely dependent on internet connectivity for their computing needs as all services moved to the cloud. Trying to provide a usable internet experience in that context involved blocking as much as we could of what users' machines were doing in the background (usually without their awareness) that simply couldn't happen from that network. Making the trade-offs to try and select for the things that _could_ work was a huge challenge.
It seems like QUIC offers better fundamental performance primitives but no way to solve for protecting users from themselves in this kind of environment ...
The"correct" answer there is probably to make the network limits explicit, by explicitly requiring a proxy server or organization CA installed in the client. QUIC should still work with middleware that actually terminates the connection instead of meddling from outside.
My takeaway from the meeting was that it isn't clear that the spin bit is actually useful. Leaving aside the arguments that RTT shouldn't be exposed (it is anyway, with the handshake): does the spin bit actually provide useful information?
It seems to be a fairweather metric: OK resolution when the network is operating normally, but providing no useful information when something has gone wrong (which is when you'd want it most).
Nice. I especially like this bit: "Sometimes difficult choices admit to no sensible compromise between them and the process simply has to make a contentious decision one way or the other. While individuals make such decisions all the time, the collective process that enlists a large group of diverse perspectives, interests and motivations finds such decision making extraordinarily challenging."
> The client echoes the complement of the last seen bit when sending packets to the server.
Since neither the client nor the server cares about this bit, there doesn't appear to be anything that forces the client to actually implement this behavior. The client could always set it to 0 or even set it (pseudo-)randomly on each packet.
The bit shouldn't exist, but if the IETF did add it to the standard, would hostile middleware boxen actually start dropping packets if they don't see the "spin bit" change?
This was my first thought as well. I would love to see the reaction from the pro-spin-bit camp if Google announced that Chrome will never set the spin bit.
As someone only superficially familiar with networking, I think it would be interesting to know what impact the current packet inspection practices by the middleboxes have on network performance. After all, the article mentions that those middleboxes want to control the message flow and use that as justification. The assumption behind QUIC seems to be that there's no "global" benefit in that specific kind of network management and that it's mostly selfish interests of local network operators that motivate messing with the packets?
Basically, what are the incentives for the middleboxes to inspect packets and what do they really stand to lose?
I probably expressed that wrong, I was more wondering if the packet sniffing had any beneficial impact on the performance in the sense of QoS or congestion control or something like that. After all, they have to do it for a reason.
But your second point mentions it's for billing and such, so I guess that's my answer.
I don’t think it’s for billing most of the time. The middleboxes usually provide some kind of immediate benefit to the network, while breaking core networking assumptions and thus contributing to a weird form of technical debt.
>"Many network operators use the IP and transport packet headers to perform traffic engineering functions, packet interception and forced proxy caching."
I'm guessing this is referring to transparent hardware caches looking at Layer 7/HTTP headers?
And then:
"Various forms of middleware may reach into the TCP control fields and manipulate these values to modify session flow rates"
Does the author mean TCP flow control here? So there exists middleware which changes the ACK and RWND values in TCP headers? Does anyone know which middleware vendors do this? I'm guessing this might be done as part of network "accelerator" hardware devices like Riverbed. Is this correct?
I am not sure I understood the article to be honest.
Does QUIC encrypt the actual UDP packet (meaning from IP down, including the UDP header)? If that is the case it is not UDP anymore and there will be "problems" in getting anywhere in the first place.
If it only encrypts the UDP payload, where does NAT come in? How would that be different from what TLS does over TCP?
The article only mentions NAT as a showcase of what IP packet parsing is used for. QUIC does not encrypt the UDP header.
TLS 1.3 actually ran into a lot of issues due to network ossification. QUIC has been developed to bypass TCP's ossification problem and enable quicker iteration and deployment to improve latency, congestion mitigation etc.
You could argue that if QUIC's principles were in place for TCP/IP 20 years ago, middleboxes would not have made protocol revamps as difficult as they are today. Perhaps NAT would have never been developed and we'd all be on IPv6 already.
> If it does not tamper with UDP I fail to recognise the issue though to be honest.
The problem is that if you expose any kind of information in addition to just the IP and UDP headers, then future middleboxes will start to use this information and will start dropping packets they can't parse.
If a QUIC packet is just random noise (aside of the UDP and IP headers), then a middlebox can't make assumptions about the inner workings of QUIC. It's a very common (but unfortunate) practice by "security" appliances to drop everything they don't understand because they treat that as potentially malicious.
Let me give a (hypothetical) example: let's say QUIC packets had some publicly available "version" field. The implementation as it's used now sets that to 1.
Now middleboxes start "gaining" support for QUIC and as "everybody" knows, the only widely deployed version is 1, so these middleboxes start to treat every other value of that field as a possible attack and drop such packets.
Now, years later, we want a new version of QUIC, but unfortunately, the most widely deployed (and never updated) middleboxes out there assume any version but 1 to be malicious.
Which leaves us with a "version" field that practically has to be set to 1, so now we need another way to flag the new packets. Maybe a "real_version" field? Who knows? We'll have to try various things until the majority of the middle-boxes currently deployed are fooled.
Also, it will likely be impossible to fool them all, so even when we get around the majority of the boxes, we'll still exclude some people from being able to reach QUIC 2 servers. Sure - it will be a very small amount, but it won't be zero.
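To make the hypothetical concrete, the filter logic in such a box ends up looking roughly like this (the field offset, name and version values are all invented for illustration; real QUIC deliberately doesn't expose a plaintext version byte on every packet):

    # Hypothetical middlebox filter, invented purely to illustrate ossification.
    KNOWN_GOOD_VERSIONS = {1}          # "everybody knows" only version 1 exists

    def should_forward(quic_payload: bytes) -> bool:
        version = quic_payload[1]      # imaginary plaintext version field
        # Anything the box doesn't recognise gets treated as possibly malicious.
        return version in KNOWN_GOOD_VERSIONS

    # Years later, every packet from a hypothetical "QUIC 2" endpoint
    # (version byte 2) is silently dropped by boxes like this, so the new
    # version can only deploy by disguising itself as version 1.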
This isn't just theoretical. We had this problem just now with TLS 1.3. From the beginning, SSL and then TLS had version fields so that clients and servers could negotiate which protocol version to use.
Unfortunately, because of precisely this problem, that field stopped being usable years ago: even the 1.2 negotiation had to happen using a workaround, which then promptly also stopped working for 1.3.
By not exposing anything but random noise as part of a QUIC packet, the protocol designers aim to prevent this from happening. If all a "transparent" proxy sees is random noise, they can decide to not support QUIC at all or to support all of it. They can't decide on a "safe" subset and burn that into the internet for all eternity.
The discussion about exposing the bit was to allow network administrators to detect retransmissions. The bit is supposed to flip constantly; if it doesn't, retransmissions are happening and thus something might be wrong with the network.
Because it's just one bit and it can have two values and both are valid and both values are seen with about the same frequency, a middlebox won't be able to just drop a packet if the value is either 0 or 1.
This is why there is even a discussion happening. It's felt to be reasonably safe to include to provide some actual value to tools.
The debate, though, is whether it's really safe and/or whether it provides enough value to be worth the trouble.
If you ask me personally, in my professional life, I have been bitten by protocol ossification way more than by not being able to make sense out of a packet stream, so I personally would absolutely not expose anything.
But then again, I'm an application level developer and not a network administrator.
You do know that someone somewhere is going to make a middlebox that just drops a packet unless that bit flipped in the precise sequence that the middlebox developer believed was the correct one, right?
Then someone proposes an enhancement to QUIC which happens to change the sequence of the flips (perhaps some multipath thing, or an enhancement in the way it treats reordered packets), and it breaks...
The article only mentions NAT as the most prominent example of middle boxes. And no, the UDP header obviously is not encrypted, if it were it wouldn't work with existing devices and OSes.
UDP doesn't give you much information to mess with, which is part of why it's used for QUIC: it works through existing networks, and they can protect the deeper protocol layers against development of new middle-boxes attempting to look into them by encrypting all of it.
The problem is that if a protocol has visible parts, middleboxes will try to do stuff with those and not handle them correctly in corner cases, or when the protocol changes. (For example, TLS had issues with middleboxes attempting to verify details of the handshake and breaking or downgrading a connection when they saw options they didn't know, e.g. because a newer TLS version was used. MPTCP extended TCP and had lots of problems with things that expected specific behavior of TCP flags and parameters, where MPTCP would have liked to use them differently. With HTTP there have been issues with proxies not understanding new headers, or things like websockets. ...)
QUIC tries to prevent this by making as little as possible visible outside the encryption: it should look to a middlebox as much as possible like an opaque data stream, and not reveal any details about what's going on inside. Now there is a proposal to add something that is explicitly visible to the network, and people are worried that will come back to bite them in some way if they make an exception now.
This proposal is a compromise. Wouldn't a specific metadata/heartbeat "packet", routinely dispatched by the involved parties, be a better solution?
My personal stance is to prohibit exposing any metadata on the connection whatsoever, but given the counterarguments, I don't think the proposed solution is ideal.
I disable QUIC in my network because I'm doing QoS at my router and utilize TCP flow control to slow things down to where I want them to be. QUIC makes the incorrect assumption that the network never wants to slow traffic down, or even to allow for a configurable speed. Don't be too clever.
Why use flow control? Why not just a traffic shaper like altq/dummynet/whatever? Do you really only care about TCP traffic and not torrents, games, VoIP etc.?
QUIC uses TLS, so it's pretty much the same situation as any other TLS connection.
Of course in a corporate environment you can trust the middlebox's CA on all the work machines. That's actually fine. What everyone seems to hate here is middleboxes that don't terminate connections and try to mess with them in other ways.
A nice thing about the I-D system is that we don't have to waste RFC numbers on this sort of crank nonsense. Once upon a time the IETF would end up publishing this sort of thing as an RFC and then everybody would just ignore it, but that used up the numbers (and short, memorable numbers are nice), which was a shame. Now we can publish them as "drafts" which just quietly expire once their authors' interest moves on to investigating which metal foils can be made into hats that best resist government mind control rays or whatever.
>to investigating which metal foils can be made into hats that best resist government mind control rays or whatever.
You joke, but Allan H. Frey already solved that in the 1960s, just add a 2"x2" wire mesh near the temples, over the temporal lobes. Blocks all RF interaction with the brain, no tin foil hat required.
...of course, this doesn't block 'mind control rays', the only thing it actually blocks consists of the Microwave auditory effect, and the only target audience for that consists of Radar techs walking around in front of hugeass antennas. ;)
Knowing the RTT seen by actual user traffic is useful because, for example, it lets you observe whether that traffic is encountering excessive queuing. It's hard to know this by other means; for example, you may think you can just ping the end systems, but these days ICMP is often filtered, and even if it wasn't, the test probes may be queued differently due to weighted fair queuing, or even take a different path due to equal-cost multipath routing. In the end, the only true measure is what the traffic itself experiences.
Now, whether it is necessary for operators to know if user traffic is seeing excessive queuing is open to debate. But the fact is they do currently use this as one network health measure.
How, though? Surely it includes the time outside the observer's network? And if they only want to monitor performance inside their network, nothing stops them from using an out-of-band system, or even adding their own timing footers to packets.
- Google has been spearheading a protocol called 'QUIC'
- It bypasses traditional TCP by using UDP to create its own version of TCP, but encrypts the data it sends
- However it could be (is?) sending a bit and header that can make tracking the message path easier.
I would guess that if that is true, and removing it would not affect performance, then the bit and header should be removed. I don't see what the fuss is about. Can't two different implementations exist alongside each other and the public choose the better one?
Correction: I misread the article and thought the spin bit was for an encryption key. Let this be a lesson to everyone: if you are going to comment at 2 am, reread the article first.
It doesn't make decrypting the message easier. It just exposes a bit of unencrypted information that makes it possible to infer things like the connection's round-trip time. And the argument is over whether exposing that information has value, and whether routers and other boxes in the middle will abuse that information.
IMO the article completely fails to describe the actual issue.
- "Then there is the NAT function, where the 5-tuple of protocol, source and destination addresses and the source and destination port numbers is used as a lookup vector into a translation table, and both the IP and the inner transport packet headers are altered by the NAT before passing the packet onward."
Nice try, but if the outer layer is UDP, then NAT alters the UDP packet headers. The QUIC payload is never touched.
- "Many network operators use the IP and transport packet headers to perform traffic engineering functions, packet interception and forced proxy caching. Various forms of middleware may reach into the TCP control fields and manipulate these values to modify session flow rates. All of these activities are commonplace, and some network operators see this as an essential part of their service."
I don't see why the people tasked with standardizing Internet protocols now have to make provisions for the jerks who literally break the Internet.
- "This bit, the “spin bit” is intended to be used by passive observers on the network path to expose the round trip time of the connection. The management of the bit’s value is simple: the server simply echoes the last seen value of the bit in all packets sent in this connection. The client echoes the complement of the last seen bit when sending packets to the server. The result is that when there is a continuous sequence of packets in each direction this spin bit is flipped between 0 and 1 in time intervals of one Round Trip Time (RTT). Not only is this RTT time signature visible at each end, but it is visible to any on-path observer as well."
I have literally no idea why a protocol should actively leak unnecessary information to a passive observer, and I have no idea what passive observers would do with the information they can deduce from this "spin bit". The payload is still encrypted, so you still can't do all the "traffic engineering, packet interception and forced proxy caching". As a passive observer you sit somewhere in the middle of the whole stream, and if you can even fish out the necessary packets (they might take different routes in each direction), you still don't know how far you are from the other end. Not even talking about the random processing delays at the server end. You're measuring garbage. How is this bit even useful for anything?
It seems you were the one who completely failed to understand the article. Your first point implies you think the author was talking about QUIC in that section, when instead the author was merely using NAT as an example of packet inspection to provide context to the topic. Your second point is in fact exactly the point of the entire article, which itself is a commentary on the decision making process and factors in that process that IETF is presently using to decide exactly how much to "break" the protocols, and moreover on the observation that many people in the IETF do not consider this to be "breaking" at all. Your third point is a commentary on the utility of the bit itself, which is not really the focus of the article anyway.
You claim:
"IMO the article completely fails to describe the actual issue."
The point is that QUIC is supposed to be fully encrypted below the UDP layer. The question is why add bits in the clear on top of this to water it down and open an avenue for potential abuse.