A beautiful algorithm that only a very few know of: In-Place Merge in O(n) time (acm.org)
102 points by dhruvbird on Feb 27, 2012 | hide | past | favorite | 30 comments



This link got me the paper in PDF form with absolutely no hassle: http://web.eecs.utk.edu/~langston/courses/cs594-fall2003/HL....

(Appears to be the faculty web site of one of the paper's authors.)


Thanks for the link. For me, the article really didn't even exist since it wasn't freely available from the ACM site.


Perhaps very few know of it because it's hidden behind a ridiculous pay wall.


Sorry about the pay-wall link guys :-( I'm in university so I didn't realize it was this way.

I've posted 3 other links which should allow free access to the paper. I think these things should be open (and free) to start off with.


This is precisely what is wrong with the paywall: people at the university don't even know it is there.

Donald Knuth embarrassed himself a few years ago when he told people that the library at the university was worthless because he could just sit at his computer and view all the literature for free.

Academics generally don't realize that the library at their university cuts a million-dollar check to Elsevier every year, in addition to checks to a number of other publishers. As a result, they don't see that there is any problem.


The other problem is that it cuts academics off from the rest of the world. Because paywalls effectively prevent non-academics from seeing what academics are doing, they end up being underappreciated, and their contributions end up being underutilized. And, since we might as well bring it back to money, I suspect it ultimately has a downward influence on the amount of money legislatures earmark for universities and research.


you took the words right out of my mouth!

Unfortunately, when academics are indifferent to marketing themselves to the wider world, they are the losers in the long run.


I came across this paper about five years back. At the time I sent an e-mail to Professor Langston thanking him for his insights and clarity in explaining them. Here's his reply:

    Thanks for your kind note Brendan.  It's been over
    twenty years since we wrote that paper.  One never
    knows which works will survive the test of time.


Some source - with a lot of comments - that claims to implement the algorithm:

http://www.keithschwarz.com/interesting/code/?dir=inplace-me...


That's the same guy that did the fantastic writeup of Dijkstra's Smoothsort: http://www.keithschwarz.com/smoothsort/


There's an issue here that I really don't understand: why is this never used?

For example, the original C++ STL generally did a heck of a good job of exposing an interface to the 1997 state of the art in general-purpose algorithms. So the spec for std::stable_sort was aimed at Mergesort: O(n log n) comparisons and O(n) additional space, due to the space required by the stable merge. HOWEVER, if only limited space is available, it is allowed to use O(n log n log n) comparisons, because a slower in-place algorithm is used instead. [1]

Why? With the algorithm from this article, we can do an O(n log n) in-place Mergesort. I imagine it's not as fast as the standard algorithm that requires O(n) additional space -- although the abstract of this article does call the in-place merge algorithm "reasonably competitive". But it certainly sounds like it would be a better fall-back option.
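To make the question concrete, here's a rough sketch (mine, not anything from the standard or the paper) of the shape I mean: a top-down Mergesort whose merge step works in place. std::inplace_merge is only a stand-in here; when it can't grab a temporary buffer, its fallback merge is O(n log n), not the paper's linear-time in-place merge, which is exactly the gap I'm asking about.

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    // Top-down Mergesort over v[lo, hi) with an in-place, stable merge step.
    // With a scratch buffer std::inplace_merge does O(n) comparisons per
    // merge; without one it falls back to a slower O(n log n) merge, which
    // is where the paper's O(n) in-place merge would slot in.
    void inPlaceMergeSort(std::vector<int>& v, std::size_t lo, std::size_t hi) {
        if (hi - lo < 2) return;
        std::size_t mid = lo + (hi - lo) / 2;
        inPlaceMergeSort(v, lo, mid);
        inPlaceMergeSort(v, mid, hi);
        std::inplace_merge(v.begin() + lo, v.begin() + mid, v.begin() + hi);
    }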

Similarly, lots of languages have moved toward some version of Mergesort for their standard sorting algorithm: Perl, Python ("Timsort" is a Mergesort variant). Even some versions of the "C" Standard Library's qsort are implemented with Mergesort these days. I never hear about any of these being in-place. But the work in this article isn't exactly a secret.

So: I don't get it.

[1] EDIT: I checked the C++11 spec. It is the same. std::stable_sort does O(n log n) comparisons if sufficient memory is available, and O(n log n log n) if not.


"For the sake of speed and simplicity, our algorithm is not stable." There's your answer.

In their defense, the authors do include a footnote to another of their papers that adds stability and preserves the asymptotic properties, but admit that it's significantly slower. The answer then becomes, "Because mandating the lower complexity bound would increase the real-world costs." It's the same reason the standard only requires nth_element to be average-case linear rather than worst-case linear; worst-case linear algorithms exist (like CLR's median-of-medians algorithm), but their real-world performance is far worse than quickselect's.
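For reference, here's a bare-bones quickselect sketch (my own, not any particular library's nth_element) showing the average-case-linear approach being contrasted with median-of-medians:

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    // Bare-bones quickselect: returns the k-th smallest element (0-based).
    // Average-case linear, but a run of bad pivots makes it quadratic,
    // which is the trade-off behind nth_element's "linear on average" spec.
    // Precondition: !v.empty() and k < v.size().
    int quickselect(std::vector<int> v, std::size_t k) {
        std::size_t lo = 0, hi = v.size();
        while (hi - lo > 1) {
            int pivot = v[lo + (hi - lo) / 2];
            // Three-way partition of v[lo, hi) around the pivot value.
            std::size_t mid = std::partition(v.begin() + lo, v.begin() + hi,
                                  [&](int x) { return x < pivot; }) - v.begin();
            std::size_t high = std::partition(v.begin() + mid, v.begin() + hi,
                                  [&](int x) { return x == pivot; }) - v.begin();
            if (k < mid)       hi = mid;      // answer lies among "< pivot"
            else if (k < high) return pivot;  // answer equals the pivot
            else               lo = high;     // answer lies among "> pivot"
        }
        return v[lo];
    }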


Ah, I didn't catch the lack of stability. Thanks.

OTOH, that does not completely address my difficulties, because there are fall-back algorithms to consider. It still seems to me that the slower Mergesort variant that is both stable and in-place has practical value.

You mention that fast selection typically uses Quickselect, since the known worst-case linear-time selection algorithms have much worse average-case performance. That's true; however, in Musser's 1997 paper, where he introduces Introsort (a Quicksort variant with an O(n log n) worst case), he also recommends doing selection by starting with Quickselect and switching to a worst-case linear-time algorithm if the recursion depth exceeds some threshold. The result is a worst-case linear-time algorithm with the same average-case performance as Quickselect.
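A rough sketch of that shape (with a deliberately simplified fallback: Musser uses median-of-medians, which makes the worst case truly linear, whereas sorting the leftover range as done here only caps it at O(n log n)):

    #include <algorithm>
    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Quickselect with a depth budget proportional to log2(n): if too many
    // partitioning rounds go by, switch to a guaranteed fallback. Musser's
    // fallback is median-of-medians (worst-case linear); sorting the
    // remaining sub-range, as done here for brevity, only gives O(n log n).
    // Precondition: !v.empty() and k < v.size().
    int depthCappedSelect(std::vector<int> v, std::size_t k) {
        std::size_t lo = 0, hi = v.size();
        int budget = 2 * (static_cast<int>(std::log2(double(v.size()))) + 1);
        while (hi - lo > 1) {
            if (budget-- == 0) {                            // too many bad pivots
                std::sort(v.begin() + lo, v.begin() + hi);  // guaranteed fallback
                return v[k];
            }
            int pivot = v[lo + (hi - lo) / 2];
            std::size_t mid = std::partition(v.begin() + lo, v.begin() + hi,
                                  [&](int x) { return x < pivot; }) - v.begin();
            std::size_t high = std::partition(v.begin() + mid, v.begin() + hi,
                                  [&](int x) { return x == pivot; }) - v.begin();
            if (k < mid)       hi = mid;
            else if (k < high) return pivot;
            else               lo = high;
        }
        return v[lo];
    }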

As far as I know, Musser's idea is the standard selection algorithm in use today. Or if it isn't, then it probably should be. [1]

Similarly, I have no problem with C++ std::stable_sort using linear additional space. But if it can't, then why not fall back to another log-linear-time algorithm, instead of the O(n log n log n) that is allowed for in the standard? This stuff has been known for decades. It wasn't taken into account in the 1998 standard, and it still wasn't taken into account in the 2011 standard. Why not?

I kinda feel like I'm beating a dead horse here ... but in any case, there are still unanswered questions.

[1] EDIT: Checked the C++11 standard again. The spec for std::nth_element has not been updated to reflect Musser's idea; it's still "Complexity: Linear on average." OTOH, the spec for std::sort has been updated. In 1998 it was "Approximately N log N ... comparisons on the average," while in 2011 it was simply "O(N log (N)) ... comparisons."


I agree. I too keep wondering "why?"


I kind of stopped reading after I learned the algorithm requires O(sqrt(n)) blocks each of O(sqrt(n)) size. Since sqrt(n)*sqrt(n) = n, how is this an improvement over the standard technique that simply uses an additional O(n) space?


You misunderstand. The algorithm chunks the existing input into O(sqrt(n)) blocks of O(sqrt(n)) size. That's why they call it an "in place" merge.


I kind of stopped reading

Seldom is a comment beginning with this insightful. Why stop or "kinda stop", and then criticize? Only criticize if you have read FULLY.

Hint: "in place".


I deserved that. Thanks also to jemfinch for more constructively pointing out my error. It happens from time to time: one's mind gets stuck on a detail, in this case a misunderstanding, and starts rejecting, which stops learning dead in its tracks. This all swims in a sea of "half as fast" and "not order-preserving", which further reinforces the rejection; after all, if one wanted an in-place sort that isn't order-preserving, there are faster ones. None of this is in any way a criticism of the paper, more a reflection on the state of mind that sometimes leads to being less than insightful.


Please do not submit articles behind paywalls.


Being behind a paywall isn't helping with its publicity. On the plus side, this reminds me I really do need to sign up for the ACM! :)



I get a 404 at this URL, but the Google cache is available.



My bad. I actually clicked the "Quick View" link; the possibility of it being unavailable didn't cross my mind.



[deleted]


Maybe, but it's just another paywall, selling a PDF for $39.95.



My usual way around this is to search on scholar.google.com. This usually turns up some free versions. It does in this case.


Can someone write up a gist of the algorithm?


Given two sorted lists A and B concatenated as a single list AB:

1. Divide the list into sqrt(n) pieces, each of size sqrt(n). Form one piece from the sqrt(n) largest elements of the combined list and move it to the left-most position (call this the buffer piece); the remaining pieces are simply formed from the remaining elements. Note that the remaining pieces retain their ordering, since A and B were originally sorted (as long as care is taken not to cross the A/B boundary), while the buffer piece need not be sorted at this point.

2. Sort (stably? I didn't see it in the paper, but I think it's necessary) the sqrt(n)-1 non-leftmost pieces in non-decreasing order by their tail elements.

3. Repeatedly merge (details left out as to when to stop each merge) the longest sequence of pieces, taken from the beginning of the un-merged list, such that the tail element of each piece is not greater than the head element of its downstream piece, with that sequence's downstream piece. Use the left-most piece as the buffer, swapping the min element of the two lists into the min buffer position instead of overwriting the buffer element. After a merge, the buffer piece remains a single piece and may be re-used as the buffer for the next merge.

4. In the end, simply sort the buffer piece.

Notice that we may use an O(n^2) sort for the two sorts we need to perform without breaking the merge's linear-time property, since in both cases we're only sorting sqrt(n) items.

The min element of the remaining pieces will always make its way into the min buffer position, since a sorted list's pieces will not be re-ordered by the tail-element sort.

Anyhow, many details left off, but you asked for a gist and I believe that this qualifies as a gist.
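Since the buffer trick is the heart of it, here's a small sketch of one buffer-swap merge step as I read it (my code, not the paper's): two sorted runs are merged into the buffer region by swapping instead of copying, so the displaced buffer elements land in the vacated input slots and nothing is lost.

    #include <cstddef>
    #include <utility>
    #include <vector>

    // Merge the sorted runs a[aBegin, aEnd) and a[bBegin, bEnd) into the
    // region starting at `out`, which holds buffer ("junk") elements.
    // Every write is a swap, so no element is ever overwritten. In the
    // setting described above, the buffer piece sits immediately to the
    // left of the first run and is at least as long as the second run,
    // which keeps the output pointer from overtaking unread elements.
    void bufferSwapMerge(std::vector<int>& a,
                         std::size_t aBegin, std::size_t aEnd,
                         std::size_t bBegin, std::size_t bEnd,
                         std::size_t out) {
        std::size_t i = aBegin, j = bBegin;
        while (i < aEnd && j < bEnd) {
            if (a[i] <= a[j]) std::swap(a[out++], a[i++]);
            else              std::swap(a[out++], a[j++]);
        }
        while (i < aEnd) std::swap(a[out++], a[i++]);
        while (j < bEnd) std::swap(a[out++], a[j++]);
    }

For example, on {J, J, J, J, 1, 3, 5, 7, 2, 4, 6, 8} with out = 0, the merged run ends up in the first eight slots and the four junk values end up together in the last four, ready to serve as the buffer for the next merge.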



