Hacker News new | past | comments | ask | show | jobs | submit login

"For the sake of speed and simplicity, our algorithm is not stable." There's your answer.

In their defense, the authors do include a footnote to another of their papers adding stability and preserving the asymptotic properties, but admit that it's significantly slower. The answer then becomes, "Because mandating the lower complexity bound would increase the real world costs." It's the same reason the standard only requires nth_element to be average case linear, rather than worst-case linear; there exist worst-case linear algorithms (like CLR's median of medians algorithm) but their real world performance is far worse than quickselect.




Ah, I didn't catch the lack of stability. Thanks.

OTOH, that does not completely address my difficulties. Because there are fall-back algorithms to consider. And so it seems to me that the slower version of Mergesort that is both stable and in-place, still has practical value.

You mention that fast selection typically uses Quickselect, since the known linear-time selection algorithms have much worse average-case performance. That's true; however, e.g., in Musser's 1997 paper where he introduces Introsort (log-linear-time Quicksort variant), he also recommends doing selection by starting with Quickselect, then switching to a linear-time algorithm if the recursion depth exceeds some threshold. The result is a linear-time algorithm with the same average-case performance as Quickselect.

As far as I know, Musser's idea is the standard selection algorithm in use today. Or if it isn't, then it probably should be. [1]

Similarly, I have no problem with C++ std::stable_sort using linear additional space. But if it can't, then why not fall back to another log-linear-time algorithm, instead of the O(n log n log n) that is allowed for in the standard? This stuff has been known for decades. It wasn't taken into account in the 1998 standard, and it still wasn't taken into account in the 2011 standard. Why not?

I kinda feel like I'm beating a dead horse here ... but in any case, there are still unanswered questions.

[1] EDIT: Checked the C++11 standard again. The spec for std::nth_element has not been updated to reflect Musser's idea; it's still "Complexity: Linear on average." OTOH, the spec for std::sort has been updated. In 1998 it was "Approximately N log N ... comparisons on the average," while in 2011 it was simply "O(N log (N)) ... comparisons."




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: