Hacker News new | past | comments | ask | show | jobs | submit login
Sleep sort (2011) (4chan.org)
326 points by olalonde on Dec 26, 2014 | hide | past | favorite | 68 comments



Just to be 100% clear, in case anyone who is new to the field stumbles on this thread, this is NOT breaking math or anything. This is a fun, but totally normal result.

First of all, there are quite a few linear time sorting algorithms. Radix, pigeonhole, and counting sort are all linear time sorting algorithms. The popular result that any comparison-based sorting algorithm works in O(n log n) applies _specifically_ to comparison-based sorting algorithms, and not those like the above. So, even if you ignore the underlying mechanics of the OS scheduler and assume sleep works "perfectly", the result would not that unusual.

Second off, in the comments there appears to be considerable time worrying about the technicalities of the scheduler, the nondeterministic nature of sleep, and so on, and whether this really implies linear time bound. IMHO these are not worth worrying about because we already have linear time sorting algorithms. It's fine to assume the scheduler adds no significant asymptotic cost here, even if we know differently.

Third off, remember that all of these sorting bounds assume machines of the Von Neumann architecture. In particular, this model assumes constant time memory and constant time comparison operations. In cases where you're comparing really big numbers, these bounds get worse. This is easy to forget, but worth remembering just since we're on the subject anyway.


> there appears to be considerable time worrying about the technicalities of the scheduler […] whether this really implies linear time bound. IMHO these are not worth worrying about

But it is not linear time, we shouldn’t let the existence of other algorithms make us confused about proper analysis of this one.

The call to `sleep(n)` is not a trivial operation, it inserts our task into a priority queue, so we are piggybacking on the OS’ support for sorting tasks based on when they need to be woken up, and thus hide the actual complexity behind the call to `sleep` (which we call `n` times, so if it’s `lg(n)` then the time complexity of the entire algorithm is `O(n lg n)`).


It's also a good example of why O() isn't the entire story when it comes to how long an algorithm takes to run.


This is not "linear time" in any decent sense--we shouldn't conflate the number of elements with the size of the elements (in this case, O(max element) is actually exponential in the input size!).


This isn't really right. If the numbers can be arbitrary in size, then you can't compare them in constant time, which means that comparison-based sorts are quadratic.

This is directly related to my third point. Traditional analysis of sorting algorithms assumes the Von Neumann architecture, and in particular, it assumes constant-time comparisons.


The computational model will define what you can and cannot compare in constant time. It is perfectly fine to say that comparing elements of arbitrary size is a constant time operation - that's precisely what a computational model is: an enumeration of the operations that are considered basic.


Yeah, I know.

Go read the parent again. The author claims that sorting in this case takes time proportional to the size of largest element, and I'm saying, if it takes time proportional to the space consumption of in the largest element (presumably where the cost of the sort comes from), you can't define a computational model where comparison takes constant time -- it still has to read the digits, which we know takes linear time.

If your point is that you can define a model of computation with contradictions in it, then you should rethink whether what you are saying here is even relevant to the thread at all.


You can shurt circuit comparisons but that's not always going to happen. Consider it's completly reasonable to assume long int is sorted as fast as int on a 128 bit CPU. If nothing else it's useful to have a fairly long cycle even if you could in theory use a 20GHz CPU that has some comparisons finish in 4 cycles and others just 1. Sure, it might be slightly faster than a 5GHz CPU with constant time comparisons, but the added complexity is not free.


I am not sure I understand your point here.

Short circuiting comparisons is still linear for integers of unbounded size. But besides that, it's incredibly common to assume that 64-bit primitives are done in constant time, because to the asymptotic analysis, what matters is that the integers are _bounded_ in size. As long as it's bounded, the asymptotic analysis here doesn't care -- you can pick any size you want, as long as it's bounded.

One thing to keep in mind here (which it seems you are maybe confused about) is that these bounds are _asymptotics_ of the input size generally, and have very little to say about any particular choice of input size. 64-bit, 32-bit, whatever, it doesn't matter, if the size of the operands is bounded, the they're the same thing to the asymptotic analysis. As long as they're bounded, you can say that the operations on those operands is proportional to a constant times the (bounded) size of the operands.

The consequence of this is that when you deal with operands of _unbounded_ size, all of this asymptotic analysis goes right out the window. You can't possibly have a meaningful comparison operator, for example, that operates in constant time on unbounded integers.


With parrell analysis comparisons are log N operations. P1 compares N bits, P2 compares N bits, if use P1 unless it's equal then use P2.

Short circuting cuts that to log(log(N)) on average and log(N) worst case.

It's actually generally faster to compare X bytes of Vary large numbers than X Bytes of long int's. Degenerate case being 2 numbers of x/2 bytes.

Anyway, the point is treating log(log(N)) as constant is generally reasonable especially when N is small.


The discussion is about big O, yes? This isn't really relevant to our discussion. Parallel analysis and average case, for the purposes of the discussion here, are fancy games you can try to play to get nice bounds, but they do not really have any impact on the big O of these algorithms.


"You can't possibly have a meaningful comparison operator, for example, that operates in constant time on unbounded integers."

We don't really deal with unbound integers worst case is something like 2^(2^1,000,000,000,000,000,000)) before we can't actually store them.

Even then log of log of N makes random numbers of that size practically constant time to compare. Put another way there is less than 1 in 2^64 you need to do more than 3 comparisons and less than one in 2^128 you need more than 4. One for length, one for the chunk which might be just one bit and one more for the next 64 bits. Sure that might happen, but reolistically random input is unlikely to need many comparisons.

Stings on the other hand are more of an issue.


Retric, look.

Do you realize that the whole point of asymptotic analysis is that _we do not care_ what integers we "really" deal with? These bounds specifically deal with sorting arbitrary data of unbounded size, and _no other assumptions_.

Do you understand? No assumption that we can use parallelism. No games with integers you see in "real life". None of that is relevant. You can't play games with practicalities to get a better bound. This is the general bound, and if you fiddle with it to get something else, you are changing the scope of the question to be something else entirely.

If you want to have a discussion about those other models, fine, but that's not the discussion we're having.


What your talking about is a type of overly simplified model that's irrelevant for actual computing. My point is there are other models out there. If you actually want the fastest algorithm out there you need to design it for actual hardware with finite data, cache, limited registers, limited ram, and sometime aproximating a real instruction set etc.

However, by changing your assumptions you can still use O notation with more complex models.


Yes, everyone knows this. This is not helpful.


My point is indeed that "you can't compare [arbitrary sized numbers] in constant time" is a statement about some model of computation, and not all of them.


Again, yes, I know. I understand what the model is. I'm saying it's not useful to point out that you can choose contradicting axioms to base your model on. And make no mistake, the two axioms that you are proposing do contradict each other.

So again, the issue is not that you can't compare numbers of arbitrary magnitude in constant time -- I never said that. The issue is that it is a contradiction to say that it takes linear time to inspect all the digits of a number AND that it only takes constant time to compare O(n) digits of two numbers.

You're essentially saying that you can axiomatize your mathematical universe with nonsense axioms, which yes, you could do, but that's not useful to point out.


Of course the models I mentioned are not self-contradictory, I've no idea what you mean by that.

Your statement was "If the numbers can be arbitrary in size, then you can't compare them in constant time". I was merely pointing out that this is not a true statement.


flebron, you do not seem to be reading my posts and I am getting a bit annoyed.

Again, for the third time: you cannot "assume" that reading the digits of a number takes O(n), but reading the digits of two numbers to compare them takes O(1). That is absolutely, obviously true. Your response, that you can "assume" the latter is true as part of the model of computation is just wrong, plain and simple. There's nothing else to say about it.

The original claim directly implies this, and if you can't address that point, then just save us both the time and don't respond.


Just worthy of note here, we can do comparisons in linear time with respect to the number of bits, however OP's algorithm is exponential in the number of bits.


I thought that too, but couldn't you make element size irrelevant by renormalizing the list? One pass to find the max, one pass to divide everything by it.


Yea, it's linear time for n where n is the largest number—pretty sure it's unbounded in terms of numbers of elements.


True, but writing a linear sort with bounded number of distinct elements is not really a challenge.

E.g.

Intput, T: array of n elements, m: number of distinct elements.

    for i in 1..m
       for j in 1..n 
           if( T[j] == i ) print T[j]; 
This is O(n * m) which is O(n) since m is a constant.


the max size of the elements is usually constant when comparing sorting algorithms. The complexity of this depends on the algorithm of the scheduler. With the right scheduler this could be O(nlog n).


This is not a linear time sort. This is just sorting using a priority queue (aka heap sort) with arbitrary pauses thrown in. Here is a simplified model of what is going on (http://ideone.com/azLsTc). Note that line 31 (the sleep until) is totally irrelevant to the sorting.


No, sorry, that's not correct, it's not a decided issue whether it's linear time or not. It depends on what assumptions we _decide_ to make about the scheduler and the OS, what the API for sleep is, and so on.

One reasonable assumption is of course to assume that sleep works as it does on virtually all modern OSs, in which case, you're correct, but that is certainly not the only way it can be. You could easily imagine a specialized SortOS that executes one thread on one process and outputs messages at a specifically scheduled time after a program begins executing, in which case the result would be radix sort with extraneous waiting, rather than priority heap sort with extraneous waiting.


We don't have to make any assumptions, we know how schedulers work, and the radix sort counterpart of a priority queue would probably be a radix tree, which would be impractical for use in a scheduler.


It's completely false to say that "we know how schedulers work". The _only_ reason there is confusion in this thread and the 4chan thread is because so few people actually know anything about schedulers. If they did, no one would be confused. The fact that there is confusion at all is basically a huge point for explicitly stating your assumptions.

Moreover, that is essentially the whole point of the field of algorithms. Your job as an algorist is to abstract the algorithm away from the implementation details of wholly separate systems, and make all of your assumptions explicit. If you don't specify it, you can't analyze it. And, if an algorithm has to assume an entire, specific OS and scheduler is implemented under it, then you have gone about your job as an algorist completely wrong.

That is the appeal to practicality. But as if to prove my point, you _do_ end up making assumptions. You are assuming that the scheduler is backed by a priority queue, which is certainly not universally true. You don't get to a hard mathematical bound by waving away a detail as important as the data structure that backs the core sort mechanic. If your approach is to say "y'know, like a _normal_ scheduler", then you should stop and rethink your approach to the problem.


It's more like O(max(input)), not O(n) (see the first response)


You can always scan through the array, rescale everything by 1/max(input). Then the sleep part is O(1) and the scan part is O(n) -> O(n) total.


Division isn't O(1) for arbitrary values (potentially larger than a machine word).


The same is true for comparison though, isn't it?


Sleep maxes out at 2147483647 = 2^31 seconds. Generally these problems assume the transdichotomous model, where the machine word size matches the problem size.


Well. Let's distinguish between computational complexity and wall-clock time. The code you write yourself here performs operations with O(n) asymptotic complexity where n is the number of elements, but the system scheduler may be able to find additional work to do while the algorithm is waiting to resume, or even put the CPU into a lower-energy state.

Of course, making lots of system calls to create processes and store them in your operating system's process table almost certainly invokes operations with greater than O(n) asymptotic complexity, but this complexity is still unrelated to max(input).


Sadly, the somewhat lengthy Wikipedia entry for Sleep sort was deleted. Here's archive.org's version:

https://web.archive.org/web/20110622073615/http://en.wikiped...

Reading through Wikipedia's existing pages for other esoteric sorts makes me laugh as I did when reading Hitchhiker's Guide to the Galaxy's various entries for scientific theories (e.g. Bistromathics)...There's Bogosort, Stooge sort, American flag sort, and my favorite, "Gnome sort", named thusly because "that is 'how a gnome sorts a line of flower pots'" http://en.wikipedia.org/wiki/Gnome_sort


I like the comparison between the incredibly inefficient Bogosort:

    1) Shuffle the list.
    2) If it's sorted, halt.
    3) If it's not sorted, go to step 1.
and the incredibly efficient Quantum Bogosort:

    1) Shuffle the list.
    2) If it's sorted, halt.
    3) If it's not sorted, destroy the Universe.
According to the Many Worlds interpretation, step (1) creates multiple Universes, each with a different ordering of the list. Step (3) ensures that, by the anthropic principle, we must be in a Universe where the list was sorted correctly (otherwise we wouldn't be around to observe anything).


how does it destroy the universe though? I am really not very familiar with things in the quantum world and that kind of stuff.


That's an implementation detail ;)


Of course, why wouldn't WP delete it? <sarcasm>

I just found out that the WP page for the hornet archive, one of the most important focal points of the international demoscene for a decade, was deleted because it was't "notable".

So frustrating.


Shame, isn't it?


Unlike most esoteric algorithms, American flag sort is actually pretty useful in its niche of string sorting.


Gnome sort used in a practical application (for natural language parsing): http://stp.lingfil.uu.se/~nivre/docs/gotal.pdf


But why. I would think the gnome is the perfect size to pick up exactly one flower pot (the next smallest) and would also be able to recall a dividing line (because the gnome is tiny and prefers traveling less).


Didn't saw this post before

I heartily disagree with all the attempts to downplay the brilliance of the sleep sort algorithm. Many of you have missed the important point that while traditional sorting algorithms can only utilize one core, sleep sort has the capacity to use the full power of a massively parallel execution environment.

Given that you need nearly no computing in each of the threads, you can implement them using low-power CPUs, so this is in fact a GREEN COMPUTING algorithm.

Oh, and did I mention that the algorithm can also run inside a cloud...?

Sure, you're a genius!


One should know that it is the timer scheduler of the kernel which ends up doing the sorting. With Linux, this is probably a red-black tree[1], so performance of sleep sort (when it works) is O(n *log(n)).

[1] https://www.kernel.org/doc/Documentation/timers/hrtimers.txt


With fluxcapacitor it will run in fraction of a second!

https://github.com/majek/fluxcapacitor#basic-examples

$ ./fluxcapacitor examples/sleep_sort.sh 1 4 20 3 55


I remember a few years ago, I hacked up the Wine source code so some skiing game would run at half speed. It totally worked :) It was even simpler than fluxcapacitor since you don't need to hook anything but just change the implementation of some time related functions.


Pardon the shameless plug, but I've recently used a similarly unorthodox approach for replacing the heap inside Dijkstra's algorithm, and got a speedup - I use an array indexed by the vertices' current distances, and scan it from start to finish.

A paper is currently making the rounds, would love to hear feedback:

https://www.dropbox.com/s/zy1gx1azfvvqm6i/NetSciComm%20versi...


For the record, in that thread, there are implementations in Bash, Erlang, Haskell (I think), Perl, C, and Racket. It also is interspersed with entertaining banter and unentertaining racism. So, basically, a more caustic more technical less orderly version of HN. Think about that.

There's a reason 4chan should be considered a national...maybe not treasure, but it's pretty awesome.


Previous discussion: https://news.ycombinator.com/item?id=2657277

Pretty sure it's been posted a couple other times, but I'm on a tablet at the moment.


Recently I wrote about sleepsort for the Perl 6 advent calendar, mostly as a fun topic to talk about threads, promises etc.

http://perl6advent.wordpress.com/2014/12/23/webscale-sorting...


C implementation for Windows:

  #include <stdio.h>
  #include <windows.h>
  #include <process.h>
  
  void sleep_on_it(void* param)
  {
      int n = *(int*)param;
      Sleep(n);
      printf("%d ", n);
  }
  
  int main(int argc, char **argv)
  {
      int ar[argc];
      HANDLE threads[argc-1];
      for (int i = 1; i < argc; i++){
          ar[i] = atoi(argv[i]);
          threads[i-1] = (HANDLE)_beginthread(&sleep_on_it, 0,  &(ar[i]));
      }
      WaitForMultipleObjects(argc-1, threads, TRUE, INFINITE);
  }
Compile with gcc -std=c99 (because of VLA's it won't work in Visual Studio).

Example:

D:>sleepsort 1000 9 2 150 1 7 17 624

1 2 7 9 17 150 624 1000


Radix sort, with time as the buckets...stylish.


This guy makes the rounds every once in awhile it seems. Here's a /r/dailyprogrammer thread with solutions in various languages: http://www.reddit.com/r/dailyprogrammer/comments/yqydh/82420...


javascript version (faster):

    var input = [5, 3, 6, 3, 6, 3, 1, 4, 7];

    var timeAxis = [];
    // Put every element to the corresponding time point.
    // Time point would hold an array, as we can have
    // equal elements.
    for (var i = 0; i < input.length; i++) {
        elem = input[i];
        timeAxis[elem] = [elem].concat(timeAxis[elem] || []);
    }

    // skip all empty (UNDEFINED) time points
    // and contact all the arrays:
    var sorted = timeAxis.reduce(function (accum, cur) {
                                     return cur ? accum.concat(cur) : accum},
                                []);
    console.log(sorted)


The speedup is from usage of _virtual time_, which we don't need to wait for.


This is my new favorite after bogosort.

I'm sad nobody has implemented my idea, rock, paper, scissors sort.


My favorite is spaghetti sort. Make one spaghetti straw per item that is as long as the value of the item. Then take all spaghettis in your hand and put them against a flat surface. Remove the longest one and keep doing that. :)


And what, exactly, is rock paper scissors sort?


The comparison function is a game of Rock Paper Scissors with the winner being given the 'greater than' position. Games proceed until all items are in sorted order :)


lmao i actually looked that up, it even has a wikipedia entry. what a good read.


Care to share the look link? Couldn't find it.


I believe they were referring to bogosort: http://en.wikipedia.org/wiki/Bogosort


Do you mind if I submit a javascript version!?

http://js.do/code/48519

http://js.do/code/48526 (Optimized milliseconds). :)


if you'd like to trade space for time, here is another sort with no comparisons: http://codepad.org/0k2Ou2Db


I don't know Perl, so I rewrote it in Python in an attempt to understand what's going on: [link redacted]

I'm not storing key,value pairs for elements that don't exist (not sure what the Perl code's doing), so it shouldn't be too bad in terms of space used.


I'm not entirely sure about this, but it looks more like Counting sort [1], than Pigeonhole sort.

[1]: https://en.wikipedia.org/wiki/Counting_sort


(impure) ruby implementation:

ruby -e '[3,1,2].each{|n| system("sleep #{n} && echo #{n} &")}'




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: