Hacker News new | past | comments | ask | show | jobs | submit login

I'm surprised the article does not point out more directly that it's usually pretty simple to mitigate this attack vector by switching to mergesort. The worst case is O(n log n) so there's no real "killer input".

I think it's a good rule of thumb to say "if you need to sort large quantities of untrusted data you should probably use mergesort".




I've always liked mergesort for this reason. It's easier to understand and there's no wondering about complexity.


Well, sometimes you have space constraints. The O(n) extra-space required by mergesort isn't always available. But if you don't have space constraints, mergesort is a very good choice because you can easily write a highly scalable parallel version of the algorithm.


You can implement merge sort in place too:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.22....

Looks like the algo is more complicated, but surely implementable. In most modern architectures, the key for performance is how one exploits the caching behavior. Quicksort is very cache friendly, as well as the common implementation of merge sort. Not so for heapsort.


"The O(n) extra-space required by mergesort isn't always available."

Heapsort


No, but mergesort can easily fall back to dumping its sorted contents to disk, for later merging; this disk-based merge sort was invented by van Neumann in the 1940s, and is how the unix utility sort works (dozen-way merge of sorted shards of a file).


Mergesort gets pretty damn efficient, even with moderate buffer space.


Mergesort would be a poor choice for qsort() due to the linear space complexity. With N bytes of RAM, you'd only be able to sort (a bit less than) N/2 bytes of data. An in-place algorithm is preferable.

Glibc is the only libc I'm aware of that implements mergesort. It still falls back to quicksort for large inputs though.


I just heard about smoothsort: http://code.google.com/p/combsortcs2p-and-other-sorting-algo...

Also, nice little write-up, thanks. I also dig the extracted sort implementations you dug-out: https://github.com/matslina/qsort


All BSD derived libcs contains mergesort(), and heapsort() as well.


True, but parent meant used in the qsort and qsort_r implementations.


Oh, of course. reading fail.


If you are ever handling input data from basically anything outside of the program. Its better to use a constant speed algorithm like merge sort. Just because of attacks like this, I'm really surprised algorithm exploits attacks like this haven't come to the forefront of attacks recently.

Application level DDoS's leave the kernel/network/sockets layers exposed for exploitation. Since their tasks will take priority over user's applications.


> Application level DDoS's leave the kernel/network/sockets layers exposed for exploitation. Since their tasks will take priority over user's applications.

So what's the benefit of that, just that you can maximize the chaos you cause by attacking in two different ways?


One attack and obfuscate the other. Like if you server is at 90% cpu use, and the vast majority of it is server application. You won't consider anything strange happening. You can keep attacking the system with strange algorithm exploits or stop. Nobody will expect an attack in the system since well last time I check the box was at 90% CPU, and it was the server. I'd check again, but maybe we just have a lot of traffic today or something similar.

Basically its the old, can't see the forest past the trees routine.

Its more gaming the operators then the system.


Ah okay, I wondered if it was that simple.


I think heapsort is a better replacement - in-place, guaranteed O(n log n) time (so also quite resistant to information leakage via timing disclosure, which could be important in higher security applications.)




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: