I'm surprised the article does not point out more directly that it's usually pretty simple to mitigate this attack vector by switching to mergesort. The worst case is O(n log n) so there's no real "killer input".
I think it's a good rule of thumb to say "if you need to sort large quantities of untrusted data you should probably use mergesort".
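To make the point concrete, here's a minimal mergesort sketch (in Python for clarity, though the thread is mostly about C libraries). The split is always down the middle, so the comparison count is O(n log n) for every input ordering, adversarial or not:

```python
def merge_sort(a):
    # Split in half, sort each half, merge. The recursion depth is
    # always ceil(log2 n) regardless of the input's ordering, so there
    # is no "killer" input like quicksort's worst-case pivot sequence.
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:          # <= keeps the sort stable
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])                 # at most one of these is
    out.extend(right[j:])                # non-empty after the loop
    return out
```

The trade-off, as noted below, is the O(n) auxiliary space for the merged output.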
Well, sometimes you have space constraints. The O(n) extra space required by mergesort isn't always available. But if you don't have space constraints, mergesort is a very good choice, because you can easily write a highly scalable parallel version of the algorithm.
The algorithm looks more complicated, but it's certainly implementable. On most modern architectures, the key to performance is how well you exploit caching behavior. Quicksort is very cache-friendly, as is the common implementation of mergesort. Not so for heapsort.
No, but mergesort can easily fall back to dumping its sorted runs to disk for later merging; this disk-based merge sort was invented by von Neumann in the 1940s, and is how the unix utility sort works (a dozen-way merge of sorted shards of a file).
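The run-then-merge idea can be sketched in a few lines of Python (the helper name and run size are my own, not from any real sort implementation): sort fixed-size runs that fit in memory, spill each run to a temp file, then stream a single k-way merge over all of them.

```python
import heapq
import itertools
import tempfile

def external_sort(lines, run_size=100_000):
    """Sort an iterable of newline-terminated lines that may be far
    larger than memory: sort fixed-size runs, spill each run to disk,
    then k-way merge the sorted runs lazily."""
    runs = []
    it = iter(lines)
    while True:
        # Pull the next run_size lines and sort them in memory.
        run = sorted(itertools.islice(it, run_size))
        if not run:
            break
        f = tempfile.TemporaryFile(mode="w+")
        f.writelines(run)
        f.seek(0)
        runs.append(f)
    # heapq.merge streams the merged output, keeping only one line
    # per run in memory at a time -- the same shape as unix sort.
    return heapq.merge(*runs)
```

Only the current run plus one buffered line per run ever lives in memory, which is why this scales to inputs much bigger than RAM.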
Mergesort would be a poor choice for qsort() due to the linear space complexity. With N bytes of RAM, you'd only be able to sort (a bit less than) N/2 bytes of data. An in-place algorithm is preferable.
Glibc is the only libc I'm aware of that implements mergesort. It still falls back to quicksort for large inputs though.
If you're ever handling input data from basically anything outside the program, it's better to use an algorithm with a guaranteed worst case, like mergesort, precisely because of attacks like this. I'm really surprised algorithmic-complexity attacks like this haven't come to the forefront recently.
Application-level DDoSes leave the kernel/network/socket layers exposed for exploitation, since their tasks will take priority over the user's applications.
> Application-level DDoSes leave the kernel/network/socket layers exposed for exploitation, since their tasks will take priority over the user's applications.
So what's the benefit of that, just that you can maximize the chaos you cause by attacking in two different ways?
Use one attack to obfuscate the other. If your server is at 90% CPU and the vast majority of that is the server application, you won't consider anything strange to be happening. You can keep attacking the system with algorithmic exploits, or stop. Nobody will suspect an attack on the system, because, well, last time I checked the box was at 90% CPU and it was the server process. I'd check again, but maybe we just have a lot of traffic today or something.

Basically it's the old can't-see-the-forest-for-the-trees routine.
I think heapsort is a better replacement - in-place, guaranteed O(n log n) time (so also quite resistant to information leakage via timing disclosure, which could be important in higher security applications.)
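For reference, here's a heapsort sketch (again in Python, for readability) showing the in-place property: it sifts within the input array itself, so it needs only O(1) auxiliary space, and both the heap-build and extraction phases are bounded at O(n log n) for any input:

```python
def heapsort(a):
    """In-place heapsort: O(1) extra space, O(n log n) worst case."""
    def sift_down(root, end):
        # Push a[root] down until the max-heap property holds on a[root..end].
        while 2 * root + 1 <= end:
            child = 2 * root + 1
            if child + 1 <= end and a[child] < a[child + 1]:
                child += 1               # pick the larger child
            if a[root] < a[child]:
                a[root], a[child] = a[child], a[root]
                root = child
            else:
                return

    n = len(a)
    # Phase 1: build a max-heap bottom-up.
    for start in range(n // 2 - 1, -1, -1):
        sift_down(start, n - 1)
    # Phase 2: repeatedly move the max to the back and re-heapify.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(0, end - 1)
    return a
```

Note the trade-off raised elsewhere in the thread: the sift-down jumps around the array (index 2i+1), so heapsort's cache behavior is much worse than mergesort's sequential merges.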