During college junior year (1993) as a physics major I took a class in digital electronics (which included 68000 assembler programming). We had a lab contest to create the fastest sorting routine for a set of random unique numbers between x and y.
I won the contest by setting to 0 a y-x register range of memory and then inserting each number into the range based on the number itself ("27" into register 27, "18" into register 18, etc.). Then I printed out all non-zero numbers in the range.
The other 20 or so students did versions of bubble-sorting and called my solution a cheat. The professor defended my victory as I had not broken any of the rules of the contest...
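In code, the trick looks roughly like this (a Python sketch, assuming the inputs really are unique integers in a known range x..y; the names are illustrative):

    def presence_sort(nums, x, y):
        """Sort unique integers known to lie in [x, y] by marking presence."""
        present = [False] * (y - x + 1)   # the "zeroed range of memory"
        for n in nums:
            present[n - x] = True         # "27" goes into slot 27, and so on
        return [x + i for i, seen in enumerate(present) if seen]

    print(presence_sort([27, 18, 3, 99], 0, 100))  # [3, 18, 27, 99]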
Haha. Reminds me of a lab where we had to program a 6502 to play some sort of game, rendered on an oscilloscope. My genius friend stayed up all night writing code that rendered the snake by dynamically writing code, rather than the obvious approach of reading the “snake” coordinates and rendering those to the screen.
Our snake code ran at something obscene like 1,000 Hz. So much faster than anyone else’s. The professor couldn’t understand how we did it, so gave a low mark.
In high school (2004) I made a license plate detection system for our IT class project. It was not perfect, granted, but it was able to locate license plates in a photograph pretty reliably and do OCR on them.
Teacher was like, nah, that's not possible, and clearly hadn't even read it. Gave me an A just to stop me complaining, still not understanding the project.
In a teaching lab in 1995, I had to make a digital thermometer using an MCU. I wrote code that dynamically read the voltage drop on the sensor and calculated the temperature. It was fast enough and easy to calibrate. The code was small and easy to understand.
I didn’t pass because the point of the assignment was to use lookup tables, even though that was a more complex approach in this case :(
Are you sure about "easy to calibrate"? 8) Did your solution have the same level of accuracy across the range as the MCU would provide?
Bizarrely, you sometimes have to follow the standards because that is what everyone does. I can't think of an example right now, but there have been, and will be, cases in engineering where the wrong answer is the right one because that is what is done. I know of mensuration devices that "model" some curves with a linear approximation and paper over the errors with cough error estimates and/or keeping within the nearly linear range. That too is fine if everyone understands what is going on.
Accuracy is a funny old thing.
However, I like your approach and it shows you probably understand the principles involved. Ideally, if you are going to be a smart arse like that, then submit two solutions - the prescribed one and your clever one.
Like meta programming, creating a string to define a query and then executing the string. In the case above, you dynamically create the program you need to solve the problem and then execute it. This is different than "solving the problem". You are writing a program to "write the program to solve the problem".
Doesn't sound like this kind of complicated string eval would be able to do 1000Hz but then again I also have no clue where you could possibly need that for a snake game. On an oscilloscope you'd need to produce vector lines to move from/to right?
More likely, I would wager that they were just overwriting the instructions to draw with new instructions that just inline everything so it didn't need to fetch out to anything.
That’s more like heapsort, because the OS will use a heap to implement a priority queue that contains times for when the next process needs to be scheduled.
The "set of random unique numbers between x and y" part stands out so strongly that I would be shocked if the whole point wasn't to encourage a non-comparative sort.
> I won the contest by setting to 0 a y-x register range of memory and then inserting each number into the range based on the number itself ("27" into register 27, "18" into register 18, etc.). Then I printed out all non-zero numbers in the range.
That's really smart. The others were just angry because they couldn't think outside of the box. Nice work!
That's not silly, that's the right way to solve that problem, unless y - x is large enough that you'd benefit from using one bit per number instead of one word. Well, usually calling qsort() would be fast enough, and less code.
That brings up memories! I did the same thing. We had to sort a million integers in the range 0-1000 (exclusive range). I did a counting sort, and beat every other solution handily.
In my case, though, the teacher disqualified my solution since the resulting lists didn't contain the same fixnums as the one I sorted, which I argued was stupid.
In the end I implemented a merge sort with some tricks and won anyway.
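A counting sort along those lines might look like this (a Python sketch, assuming integer keys in a known range; not the original submission):

    def counting_sort(nums, lo, hi):
        """Sort integers in [lo, hi) by counting occurrences of each value."""
        counts = [0] * (hi - lo)
        for n in nums:
            counts[n - lo] += 1
        out = []
        for value, count in enumerate(counts):
            out.extend([lo + value] * count)
        return out

    print(counting_sort([3, 999, 0, 3, 42], 0, 1000))  # [0, 3, 3, 42, 999]

Note that the output list is rebuilt from the counts rather than rearranged from the input, which is what the "same fixnums" complaint was about.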
The same fixnums!? Fixnums are distinct from bignums in that they're small enough to fit in a machine word and therefore don't need to be objects allocated in the heap... Was this an odd implementation wherein fixnums were put into the heap anyway?
Nope. This was in PLT scheme. Fixnums were eq? to other fixnums, meaning they satisfied the most basic equality.
Arguing was futile. When we started benchmarking very large lists, the counting sort was the only one that did it in reasonable time. Despite being disqualified.
The observation that "random numbers can be sorted in linear time" is actually often useful in algorithm design. A typical application is storing a sorted list of hash values.
Basically pigeon hole or bucket sorting https://en.wikipedia.org/wiki/Pigeonhole_sort which, as another commenter mentioned, are types of non-comparison sorting so you can get near linear times
I think the mainstream sorting algorithms tend to cater towards "mainstream" ordering which tends to be somewhat sorted versus purely random. An example is https://en.wikipedia.org/wiki/Timsort
This is not the same as radix sort but you're right that radix is a non-comparative sorting algorithm just like the one the parent commenter described.
Is it really? Building the hash can be done in linear time, but iterating the values in order, especially when there might be gaps in the interval? Could be messy.
The inner loop takes an array and an index i in the array, and rearranges the array to put the largest value at index i. This is done in such a way that if the sub-array up to but not including i is sorted, then afterwards the sort will be extended such that the sub-array up to and including i will be sorted.
The outer loop says, just call in to the inner loop with i from 1 to n. This will result in a sorted array, because the initial array of a single element is trivially sorted, and then we'll have by the sort-extension property of the inner loop that the progressively larger sub-arrays of 2, 3, ... up to N elements become sorted.
You can see that the inner loop has the sort extension property by thinking about two cases.
Case 1: If the i-th element is already larger than all previous elements, it won't touch any of the elements before i, and will only potentially make the i-th element bigger, thus clearly extending the sort through the i-th element.
Case 2: If the i-th element is smaller than one of the previous elements, it won't touch any elements until reaching that smaller element; then it will start a series of swaps until it reaches index i, at which point the original value of the i-th element is left inserted in sorted position into the sub-array up to and including i. So at this moment we have the sort extension property. From there, it may increase the value of the i-th element and change later elements, but both of those will preserve the sub-array sort.
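In code, the loops being described are just the algorithm from the paper; here is a Python rendition (a sketch):

    def cant_believe_sort(a):
        n = len(a)
        for i in range(n):          # outer loop: i = 1..n in the paper
            for j in range(n):      # inner loop: compares a[i] with every position
                if a[i] < a[j]:
                    a[i], a[j] = a[j], a[i]
        return a

    print(cant_believe_sort([5, 2, 4, 1, 3]))  # [1, 2, 3, 4, 5]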
It is being likened to bubblesort in various places but your description (which I quite like) makes it seem like quicksort with an intentionally bad (worst possible?) bifurcation.
Well, it's not really dividing, sorting, and merging, so I wouldn't say it's too similar to quicksort.
It's basically insertion sort, but with the curious property that the last value of the sorted subarray immediately and always has the correct value, and we then "gently" insert new values into the sorted subarray.
Tap the "Insertion Sort" checkbox on the right, and then the "Start" button below that.
You can see how both sorts are slowly constructing a sorted subarray, but this sort has the interesting property that the tail of the sorted subarray immediately obtains its maximum brightness, and new values are gently incorporated into the array, while the insertion sort tail is just the brightest value encountered so far, and the latest value is "abruptly" sifted down into its appropriate place.
It feels like this sort might minimize some kind of "sort entropy" metric? I.e., visually it looks very smooth. Each step along the way to sorted is very small.
[Edit: I took a quick glance at the paper and got carried away; the rest of this comment is about a different algorithm apparently known as "exchange sort", and an interesting property of it. The algorithm in the paper is actually different, and indeed more surprising.]
Hey! That's my sorting algorithm! (Well, I'm sure it has been independently discovered many times, but one of those times was by me, before I had learned of O(n log n) sorting; I used it many times, including in the INOI 2004 programming contest.) Ten years ago, in 2011, I asked the following question on math.SE about this algorithm and Catalan numbers, a truly surprising property that is not mentioned in this paper: https://math.stackexchange.com/questions/36022/how-do-the-ca...
> For situations where a quadratic-time sorting algorithm is fast enough, I usually use the following:
//Given array a[1], ... a[n]
for i = 1 to n:
for j = i+1 to n:
if a[i] > a[j]:
swap(a[i],a[j])
> It looks like bubble sort, but is closer to selection sort. It is easy to see why it works: in each iteration of the outer loop, `a[i]` is set to be the smallest element of `a[i…n]`.
----
Also: A few years ago I emailed Prof. Richard Stanley of Enumerative Combinatorics fame asking whether it was related to any of the hundreds of problems in his Catalan Numbers list, and he said he didn't know a direct connection at the time (so maybe it's an addition to his list?):
> > permutations equal to the product of all their inversions (in lexicographic order) [multiplying on the left]
> Example: The permutation 1423 has two inversions (24) and (34), and the product (34)(24) is the same as 1423.
> Non-example: The permutation 3412 has four inversions (13), (14), (23) and (24), and the product (24)(23)(14)(13) is 4321 which is not the same as 3412.
> Among the permutations of {1, 2, 3, 4}, there are 14 examples (and 10 non-examples).
Oh yes, good point: I got carried away; there is a difference! Simplicity is subjective (I think mine is simpler of course :P), but the algorithm in the post is indeed more surprising.
I actually did the opposite! I was taught this sort in high school (I think the teacher called it shuttle interchange?), and I independently came up with the bubble sort on my own as a more-efficient in-place implementation on disk.
Back in the 90’s for my high-school computer class, we were saving records on disk using TurboPascal and I was trying to sort my file in-place.
The sort was super slow, which I assumed was the disk head having to jump back and forth over larger distances of the array of records when comparing and swapping.
I reasoned if I could keep the compares and swaps between I and I+1, and then sweep those I’s forward, it’s the same number of overall compares but less head jumping with each compare/swap being adjacent records.
My implementation did speed things up quite a bit. I have no idea if my reasoning was right, or if I had some other inadvertent bottleneck. But I was proud of the speedier result.
I never explicitly pointed it out to the teacher, though, and didn’t get full marks on that homework because he said my overall program wasn’t user friendly enough.
Back in the day I was sorting records on disk using an Atari ST and (probably) Omikron BASIC. I applied radix sort (or a variant) that I saw in a magazine. It worked like a charm on a RAM disk, but when I tried it on a floppy disk, it just took forever.
// Given array a[1], ..., a[n]
for i = 1 to n - 1:
swapped = false
for j = i + 1 to n:
if a[i] > a[j]:
swapped = true
swap(a[i], a[j])
if swapped == false:
break
This should reduce time complexity to O( (n - 1)^2 ) and stop sorting after the remainder of the array is sorted and tested one time.
I see where you're going, but your implementation broke the sort.
As for time complexity, it changes from ϴ(n²) to O(n²). Alternatively, you can drop the O-notation and calculate precisely so as to include constant factor speedups, though your algorithm may end up slower in the worst case.
And it was answered successfully by pointing out the same invariant they point out in the paper.
I remember finding this program via automatic program search, so the algorithm and the question of why it works have probably been discovered many times.
If only we had something that would allow us to surface information from the vast amount of data available...
Yes, but you asked a computer science question with computer code instead of "pseudo code and arguments of correctness.", as Rafael kindly reminded you. :)
More like, GistNoesis used 0-based arrays and the paper uses 1-based arrays. “Arguments of correctness” sounds like Rafael basically told GistNoesis to answer themselves, which is of course a widely spread attitude on SE.
Wait a minute! What the hell is ‘automatic program search’? Is it like with Malbolge where the first working programs had to be written through brute-force search?
You define some unit-test cases: input-output pairs.
You define some elementary blocks, like in Scratch (Blockly), and you try to randomly assemble the blocks so that they respect the type constraints. Then you score your program based on the number of test cases it gets right.
For this simple problem, naive random search was all that was needed (fewer than 100k program evaluations).
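To make that concrete, here is a toy version of the idea (a sketch; the block set, tests, and scoring are illustrative choices, not the original setup): candidate programs are random sequences of compare-and-swap blocks, scored by how many test cases they get right.

    import random

    def run(program, data):
        """A program is a list of (i, j) compare-and-swap blocks."""
        a = list(data)
        for i, j in program:
            if a[i] > a[j]:
                a[i], a[j] = a[j], a[i]
        return a

    def random_program(length, n):
        return [(random.randrange(n), random.randrange(n)) for _ in range(length)]

    # Test cases: input-output pairs the candidate program must satisfy.
    tests = [(p, sorted(p)) for p in
             [[2, 1, 0], [1, 2, 0], [2, 0, 1], [0, 2, 1], [1, 0, 2]]]

    found = None
    for _ in range(100_000):                      # naive random search
        prog = random_program(4, 3)
        if all(run(prog, inp) == out for inp, out in tests):
            found = prog
            break

    # prints a program such as [(0, 1), (1, 2), (0, 1), (2, 2)] (None if unlucky)
    print("found:", found)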
But for more advanced cases you could use something like Genetic Programming with Koza's Automatically Defined Functions where you build an intermediate library of useful blocks.
Now one would use deep learning and produce source code directly with some tool like Copilot and run it until it passes all tests. Or if you want to get your hands dirty, you can re-implement the old school methods and use a neural network to make the pick decisions instead of a random pick.
Once you have programs that pass all test cases, you can try to prove correctness (that's more something like HOL-Light would do) and whether or not they are equivalent (and equivalent to a formal specification) and optimize them further by transforming the program.
I think we should start recognizing more alternate information sources, such as blogs and stack exchange (i.e. attribute scientific credit and recognition to those posts).
Of course traditional scientific publishing has its place (including the care for methodology and source citation)... but many people will search directly those websites instead, so clearly it's an important emerging source of information and discovery.
All today. I have been trying to carefully craft searches on Google and Stack Overflow. I wish Google Code Search were still around, or that Github code search were better.
Apparently this was mentioned in a problem set for MIT edX course 6.00.1x in 2014. Maybe someone then tried to have the internet do their homework for them (this may also be the case for some of the posts in my parent comment):
Shuffling is even less tricky if you're not trying to do it in place, if you're not trying to do it in linear(ithmic) time, or if you're not trying to induce a uniform distribution over all permutations (but rather just want some positive probability for all of them). Relaxing any combination of these conditions works just fine for Bogosort.
My gut says that combining that with other sort cycles in some magic ratio could do magical things. I'm quite wrong quite often ofc but the adventure is out there!
Fun fact: Fisher-Yates samples a uniform distribution over permutations and is therefore capable of producing any permutation, so you can sort any array in linear time using Fisher-Yates with its RNG replaced by an appropriate oracle. (So the complexity of sorting an array mostly reduces to the complexity of determining which permutation it's in relative to a sorted version.)
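For reference, the Fisher-Yates shuffle being referred to (a Python sketch):

    import random

    def fisher_yates(a, rng=random.randrange):
        """In-place uniform shuffle: every permutation is equally likely."""
        for i in range(len(a) - 1, 0, -1):
            j = rng(i + 1)            # pick j uniformly from 0..i
            a[i], a[j] = a[j], a[i]
        return a

    print(fisher_yates([1, 2, 3, 4, 5]))

Replacing `rng` with an oracle that, at each step, returns the index of the element that belongs at position i is exactly the linear-time "sort" described above.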
You can turn a sort into a shuffle by replacing the comparator with a random coin flip, and you already have defined a sort algorithm, so there's no added code complexity.
fn bogosort(a, comparator):
while not is_sorted(a, comparator) do
bogosort(a, rand)
You can't turn all sorts (not even all comparison sorts!) into a shuffle by replacing the comparator with a random coin flip. In fact your code post is a demonstration of that fact.
Assuming GP means General Purpose, yes, everyone who took an intro to algorithms should know that no sorting algorithm that relies on comparisons can do better than O(n*log(n)).
But non-comparison algorithms [0] like radix sort [1] that rely on simplifying assumptions are not a joke and can be used in practice in many real-world cases.
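A minimal LSD radix sort over non-negative integers, one byte per pass, as a sketch of the kind of non-comparison sort being referred to (names mine):

    def radix_sort(nums):
        """LSD radix sort on non-negative ints, 8 bits per pass:
        O(n * w) where w is the number of bytes in the largest key."""
        if not nums:
            return nums
        shift, biggest = 0, max(nums)
        while shift == 0 or (biggest >> shift):
            buckets = [[] for _ in range(256)]
            for n in nums:
                buckets[(n >> shift) & 0xFF].append(n)   # stable within each digit
            nums = [n for bucket in buckets for n in bucket]
            shift += 8
        return nums

    print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
    # [2, 24, 45, 66, 75, 90, 170, 802]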
No general purpose sorting algorithm, period, can do better than O(n * log(n)), regardless of comparison or otherwise. For a radix sort with a complexity of O(n * w) where w is the key length, w converges to log(n) in the general case since the number of bits needed to represent n values grows logarithmically with n, hence you arrive back at O(n * log(n)). The O(n * log(n)) is an information-theoretic result that applies even to radix sort; to be more precise, the optimal bound is Θ(log(n!)).
If you fix w to a constant, then certainly one can claim that radix sort is O(n), but then so are comparison sorts as well.
That's not to say that radix sort isn't useful in practice, just because two sorting algorithms have the same asymptotic bounds doesn't mean they are equal, but it is to say that its performance is a matter of evaluating the constants rather than how it scales.
> since the number of bits needed to represent n values grows logarithmically with n
What you say only applies to sorting n unique/distinct keys. With no prohibition on duplicate keys, n & w vary quite independently. In this more general setting, radix sorting is indeed "really" O(nw) and w could be like 8 bits for some giant data set (EDIT and order is absolutely a useful property even with dups since there can be satellite data with the keys - also possibly bounded!). This is a common talking past each other problem in this area. To your credit you made this assumption explicit. (EDIT: "General purpose" may have some semantic unclarity, I suppose.)
In terms of relevance, everyone has their own, but my personal experience is that most (but not all) real world data allow both duplicate keys and bounding key width. In particular, the largest n data sets have been the least likely to have a prohibition on duplicate keys, and also the most likely to have small w (with a few exceptions). The random access pattern of memory writes in the passes of a radix sort may (or may not) be a problem, and constant factors matter, of course (as you also say). So tying w to n has always seemed a very special and somewhat artificial case to me.
I also have never heard of any practical O(n) comparison sort for fixed w before. Can you perhaps give a reference to some such algorithm? Thanks!
At some level, even comparing ‘a<b’ has a complexity related to the log of the magnitude of a and b. So if you’re doing a comparison sort over n unique values surely you’re going to run into a similar log n factor that the gp threw at the radix sort?
It depends how you count. There can be O(n log(n)^2) as you say, but I believe there are some comparison based algos where you only have to compare "fewer bits" to decide each comparison. So bit-wise cmps might still be O(n log n). However, you do have to move whole keys, which are w bits wide. So, in full cost (cmp + space moves), unless you can do tricks to avoid the whole move, the comparison sorts do scale worse than radix-like digital sorts. And on general purpose CPUs people certainly compare batches of 8/16/32/64 bits at a time in HW. Personally, I think radix really does scale better, but maybe @Kranar can clarify his claim. (EDIT: there is often a random write pattern to digital sorts that can slow them down at massive scales, but that is definitely an {important} constant factor/different kind of "real machine" analysis.)
Maybe it has more to do with the actual distribution/form of the data? From my understanding radix sort can be incredibly fast when the numbers have a fixed number of digits and are evenly distributed in a predefined range (like phone numbers), but will be ineffective for floating point numbers or uneven distributions with extreme outliers.
That is why I specified "General Purpose". The moment you know something more about the data there might exist an opportunity to exploit that information to improve the algorithm.
Which is an interesting topic in itself. Many times I have met people who said with absolute authority "no, it is not possible to implement this" only to be confronted with a counterexample. And that is because real life is rarely about spherical cows in a vacuum.
I have once implemented a transactional, log-based database for credit card terminal (20MHz ARM, 2MB total unified memory).
The read (get value for key) operation was O(n^3) which the reviewers decided is "ABSOLUTELY FUCKING UNACCEPTABLE" (their words).
The fun started when I asked them to suggest a better implementation. They could not come up with anything even remotely as fast.
And the basic reason was that n was guaranteed to be small, as there wasn't even much space on the device for any more data. Other than that, it exploited the fantastic read-ahead capability of the device, plus it was so simple (basically a couple of nested loops) that the CPU cache could be used very effectively.
I remember their O(n log(n)) version was never faster, and was actually about 50 times slower in most operations.
I think the main reason is our flawed education. The tasks you are given at school are usually constructed so that they fit the idealized knowledge you just learned.
What this teaches people is to shortcut the critical path of reasoning, which is to first make sure you actually understand the problem and that the problem actually fits the assumptions. People are not taught that even a small divergence from the assumptions can make a dramatic difference to the actual solution, or to the applicability of what they think is a solution. They are also not taught what the actual assumptions are or how to spot them. They talk about CS algorithms as if they worked on idealized computers, but then run them on real ones.
At school this is not taught, because when it seems the problem fits the material it almost always does. People are actually rewarded for jumping to conclusions, as this helps them get through the material faster.
If you don't learn anything about the world, you bring your learning with you and try to apply it to the real world, and that's how a lot of developers operate.
Especially in tech, you can usually look up a solution to any problem easily. But for some reason this is the part everybody emphasizes, rather than recognizing that knowing when to apply the solution is far more valuable.
I also think it is part of the wisdom behind "premature optimization is the root of all evil".
I once watched someone implement a heap in PHP in order to sort 4 domain names in order of priority and be able to re-sort them. The idea was, I guess, that at some point there may be 5 or even 6 of these and then we would really need the heap…
> There is nothing good about this algorithm. It is slow – the algorithm obviously runs in Θ(n²) time, whether worst-case, average-case or best-case. It unnecessarily compares all pairs of positions, twice (but see Section 3). There seems to be no intuition behind it, and its correctness is not entirely obvious. You certainly do not want to use it as a first example to introduce students to sorting algorithms. It is not stable, does not work well for external sorting, cannot sort inputs arriving online, and does not benefit from partially sorted inputs. Its only appeal may be its simplicity, in terms of lines of code and the “symmetry” of the two loops.
Do you really use this algorithm? Look more closely: it compares every element against every other element, and the comparison appears to be in the wrong order.
This is not the Bubble Sort[0] as defined in Wikipedia. For one thing, it always has n^2 comparisons, is not always comparing adjacent elements, and does not terminate early.
[0] There are some who disagree and claim that it is.
It finds the largest number and places it at `i`. In the next iteration, the largest number from 0 to `i` is placed at `i+1`, right up until `j` reaches `i`. Leaving a trail of orderliness.
As the value of `i` increases we approach and eventually equal ascending order.
What’s non-intuitive or what makes the comparison in the wrong order?
It’s much easier to explain if we just stop `j` at `i` each time.
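Concretely, the variant with the inner loop stopped at `i` looks like this (a Python sketch); each outer iteration cycles `a[i]` into its place in the sorted prefix:

    def truncated_variant(a):
        """Same comparison and swap, but the inner loop stops at i.
        Each outer iteration inserts a[i] into the sorted prefix a[0..i-1]."""
        n = len(a)
        for i in range(n):
            for j in range(i + 1):
                if a[i] < a[j]:
                    a[i], a[j] = a[j], a[i]
        return a

    print(truncated_variant([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]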
Do you have any (publicly-available) course materials that mention this algorithm? The OP mentions they were "unable to find any references to it", and I've been collecting references I find in this comment: https://news.ycombinator.com/item?id=28760986#28762275
> What’s non-intuitive or what makes the comparison in the wrong order?
In a sibling comment, 'ColinWright compares this algorithm to bubble sort, which has the comparison in the other direction. Another similar algorithm is this selection sort which differs by one character:
for i = 1 to n do
for j = i to n do
if A[i] < A[j] then
swap A[i] and A[j]
Well, for me what's non-intuitive is that if we started with a fairly-well ordered array:
[ 1, 2, 3, 5, 8, 4, 9 ]
The very first thing the algorithm would do would be to swap 1 and 2, even though they are already in the right order. The next thing it would do would be to swap 2 and 3, making it even worse. And so on, all the way down the line. It takes six swaps to put 1 finally back to where it belongs, and the rest of the array is hopelessly out of order.
It does work, yes, and you can eventually grok it, but I feel like you're not being honest to call this a completely intuitive algorithm to new CS students.
Multiple comments elsewhere have been made by people who initially thought it was a Bubble Sort, and every other course and tutorial I know of starts with the Bubble Sort. So I admit to being surprised that you use this to teach algorithms. Personally, I find the algorithm given here completely unobvious, but perhaps that's just because I derived the Bubble Sort for myself back in the mid 70s, then read about O(n.log n) algorithms, and was hooked.
For me, this algorithm is very misleading, but if you find it a useful teaching tool then I've learned something, and I find it intriguing. I'd be interested to know if it's used in any other courses. Do you know of any?
snip description
> What’s non-intuitive or what makes the comparison in the wrong order?
It works, so obviously the comparison is right. But compare it with a classical Bubble Sort:
procedure bubbleSort(
A : list of sortable items
)
n := length(A)
repeat
swapped := false
for i := 1 to n-1 inclusive do
/* if these are out of order */
if A[i-1] > A[i] then
/* swap them and remember
something changed */
swap(A[i-1], A[i])
swapped := true
end if
end for
until not swapped
end procedure
It's looking to see if A[i-1] > A[i], because then they are the wrong way round. Compare that with the code in this submission:
if A[i] < A[j] then
...
It's the other direction, which pulls me up short.
And it's comparing every place with every other place, whereas Bubble Sort only ever compares adjacent items.
Being as familiar as I am with sorting algorithms in general, this, to me, feels unintuitive.
I see now that people seem to be equating intuition with making steady, obvious progress.
If you think about it, the point of all the adjacent comparisons in the inner loop of bubble sort is to find the largest number and move it to the last place in the array. It keeps doing that in a loop; the next time, the second largest is found and placed at the second-to-last place, etc.
This algorithm doesn’t particularly do anything different, it just finds the largest and places it at `i` instead of the last place in the array. You’ll find that as it does its thing the array from 0 to `i` is always in ascending order.
It’s very easy to teach this algorithm, the next variation that stops the inner loop at `i` instead of `n`, and the next variation that doesn’t do unnecessary swaps, which ends up being insertion sort I think? I would have to check.
> This algorithm ... finds the largest and places it at `i` instead of the last place in the array.
This is immediately where my intuition for the correctness of this algorithm falls down. When `i` is 1 (in a 1-based array) this algorithm starts by putting the largest element in the 1st location.
Then it proceeds by putting the largest element in the 2nd location, then the 3rd, then the 4th, and so on, so I can see that the largest element ends up in the last place. What is harder to see is that in all the swapping about, it's guaranteed that the smallest ends up in the 1st location.
> *You’ll find that as it does its thing the array from 0 to `i` is always in ascending order.*
This requires (a) noticing that it's the invariant you want, and (b) showing that it's true. It feels like a lot more work than proving Bubble Sort works, and to me, it's a lot less "obvious" (for want of a better term). But in particular, you refer to "making steady, obvious progress." It's not obvious to me that in the mid-phases we are making progress. The largest element continues trudging upwards, yes, but it's not at all clear to me that what it leaves behind is ordered, and not chaotic.
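For what it's worth, the invariant is easy to check empirically (a quick Python sketch, not a proof):

    import random

    a = [random.randrange(100) for _ in range(20)]
    n = len(a)
    for i in range(n):
        for j in range(n):
            if a[i] < a[j]:
                a[i], a[j] = a[j], a[i]
        # the prefix a[0..i] is sorted after every outer iteration
        assert a[:i + 1] == sorted(a[:i + 1])
    print(a)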
I recognise that I'm not you, and I'm not in the same position as your students, but I still find this algorithm requiring a lot more work to understand than the Bubble Sort.
Is there any particular reason you use this algorithm as your starting point for teaching sorting?
The algorithm is essentially just inserting the i + 1 th element into the sorted subarray from 0 to i, and it inserts in the correct place. I’ll be teaching this next week or the one after, I’ll try to record a video and share if I remember. It’s harder to explain with just words.
I'd be fascinated to see this ... if you can do that I'd be grateful ... thank you.
(Copying from your sibling comment)
> It’s a lot more work to massage bubble sort into the other sorting algorithms they need to learn without scrapping the code and starting from scratch.
An interesting observation. When I teach equivalent things I tend to present them separately and individually, so the students recognise that they are different in important ways, and then cross-connect them with transformations or other bridging techniques. Then they can see that they are different, but related.
I'll think about your approach next time I do something like this.
Meanwhile, if you would be willing to share a video of teaching this topic I'd be pleased to see it ... contact details are in my profile if you'd prefer direct contact, although I'm sure others would also be interested.
Your original comment does not show that you are one of the few able to just read and understand a simple algorithm. Look around at the many comments here where people misread this simple algorithm and readily jump to premature conclusions. In that light I don't think you can say "there is an obvious intuition".
There are others in this discussion who are claiming that it is Bubble Sort. Some have recanted, others are sticking to their guns.
I am in email conversation with someone who is claiming that it is just a more efficient version of the original Bubble Sort, and I'm trying to get the time to track down their references.
So:
I don't think it is Bubble Sort, I think it's clearly different, I have said so elsewhere, not everyone agrees.
Hmm, come to think about it, it does have a lot going for it as an illustrative algorithm!
- It’s really simple
- It’s a bit counterintuitive, so makes a cute puzzle
- It’s a good exercise for a first correctness proof
- And it’s obviously a pretty pessimal algorithm, so students shouldn’t be tempted to use it in real code (hopefully...) Compared to bubble sort, which seems somewhat plausible, so people end up actually using it for real even though it’s a terrible algorithm.
Right. Isn't the intuition that the comparison and swap rule is applied to every possible combination? Put another way: How can you rearrange everything based on order and expect the net result to not be ordered at the end?
This is fun but I do wish papers would stop using "obvious" or "not difficult" so frequently. This paper has five uses of "obvious," 3 uses of "clear" or "clearly," a "should not be difficult," a "what can possibly be simpler?," and a "It is difficult to imagine that this algorithm was not discovered before."
Is this a particularly complex algorithm or proof? No. But it reads like the author was terrified that someone would be able to understand their paper simply by reading it once and worried that would make the author look dumb. That's backwards, though. Easy to understand concepts are a good thing.
Of the five "obvious" words, two of them are used to say that something is not obvious and your last two examples are something entirely different. The uses here seem quite reasonable. But these words should be helpful, not insulting! If it's not clear, as the author claims, then you've likely misunderstood something. No point pondering it for twenty minutes, better go back and reread the statement first.
But of course they do get misused and there's not much more demoralizing than getting stuck on something 'obvious.'
Plus, I completely dislike the title. These kinds of papers treat arXiv as if it were a discussion group. It is not. It is a repository for papers that have been written with a lot of care and thought.
This is getting more and more common in science, as a researcher's reputation and funding is more and more connected to their ability to get media/internet attention.
This comment made me weep internally. It's soul-crushing how relentless self-promotion is now as necessary as good science for a "successful" academic career. (We can define that as getting paid anything at all by a university.)
I never really had the intellectual talent or salesmanship for that; my gripe is that academia has this unsettling undercurrent of boasting and bluffing and backstabbing, and it makes it just an unpleasant community in which to forge a livelihood.
Surely a healthy measure of these things has always existed, but the incentives have become perverse, and it really drives away people with a genuine aptitude and interest from areas where they can make genuine discoveries about nature and contribute to the sum of human knowledge.
Oh yeah. I'm miserable. I have a 'dream career' but the reality of it is so unsettling. I work in a laboratory where collaboration only happens if it directly benefits someone's career in an "I'm famous" way... not an "I contributed to an important work" way. This is not unique. It's systemic.
My job largely revolves around seeming important, not doing good work.
>> My job largely revolves around seeming important, not doing good work.
That happens in companies too. It's really strange. OTOH we also know that any objective measures will be gamed, which I guess is the same thing. Perceptions are more important than reality sometimes.
Yeah, this doesn't have anything to do with academic "self-promoting", it's just a fun little paper on Arxiv. (If the author really wanted self-promotion, they should have used Twitter instead of that boring red site...) Don't really understand why people have issues with this one.
Is this really a new algorithm? I'm fairly sure my "Algorithms" professor* taught us this sorting algorithm twenty years ago. To be fair I don't think it was taught as a serious contender, not something to be implemented in production in the future, but instead as the baseline to measure the other algorithms, and also as teaching tool to introduce the fact that computers can't "instinctively look over" the whole set (like a human sorting a small set would do) but instead have to purposefully compare everything to everything.
*: (or maybe it was my "Data Structures" professor, I don't remember which of those basic courses covered the basics of sorting algorithms)
I obviously wasn't in your class, but what you're describing sounds more like the role 'bubble sort' plays in comp sci education. And while this looks similar to bubble sort it is very different in behavior.
I distinctly remember that bubble sort was presented later. We started with this naive double loop (and the naive simple sort into a different copy as an alternative) and incrementally added optimisations to reach the various named algorithms.
Pretty much everything you, I, or someone else is likely to come up with, someone else will have thought of before. That's why we usually attribute inventions and discoveries to the first people to bring to market or publish.
Given the simplicity of this, obviously other people have found this algorithm before. But this might be the first publication mentioning it.
Slightly OT, but it always slightly bugs me that sorting algorithms are taught *only* in the context of computers, which obey different physics than sorting physical things. Has anyone done work on sorting algorithms under different cost functions than just "minimise comparisons"? I'm thinking something like:
1. Allocation of new space is free, at least up to 2x the input size.
2. Moving a contiguous set of elements is as cheap as moving one element.
3. Moving an element, or set of elements, has a cost proportional to the distance moved.
To sort ~500 student tests by name, my favorite algorithm is
* Each person grabs a bunch of tests. Split the tests into 4 piles, using the first letter: A-F, G-L, M-Q, R-Z. The same 4 piles for everyone. (I don't remember the exact borders. It may depend on your language. Perhaps 5 piles is better.)
* Each person grabs a pile. Split each pile into subpiles by the first letter again. If some letter is too popular, split it using the second letter.
* Sort each small subpile, which has only 10-20 tests, probably using insertion sort.
* Make a giant pile.
This can be classified as a combination of radix sort and insertion sort. It's easy to parallelize across 5-10 people.
If you later realize there are ~50 more tests, just sort them using your favorite method and then merge them in, like in mergesort (or timsort?).
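In code, the same procedure (minus the parallelism) looks roughly like this (a Python sketch; bucket boundaries and names are illustrative):

    from collections import defaultdict

    def sort_tests(names):
        """Two-level bucketing by leading letter, then sort each small pile."""
        piles = defaultdict(lambda: defaultdict(list))
        for name in names:
            first = "A-F" if name[0] <= "F" else "G-L" if name[0] <= "L" \
                    else "M-Q" if name[0] <= "Q" else "R-Z"
            piles[first][name[0]].append(name)   # second split, by first letter
        result = []
        for big in ["A-F", "G-L", "M-Q", "R-Z"]:
            for letter in sorted(piles[big]):
                # each subpile is small; insertion sort in real life,
                # the built-in sort is a stand-in here
                result.extend(sorted(piles[big][letter]))
        return result

    print(sort_tests(["Garcia", "Adams", "Zhou", "Alvarez", "Miller", "Lee"]))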
Yeah it’s a little different because the human mind isn’t procedural — you can perceive multiple cards at once, at a fuzzy resolution. You look at a hand and see “okay an ace, 3 face cards, some other cards” then scan to resolve. So an optimal algorithm needs to take that cognition into account.
I'd say both weird and coincidental :-). I'm the author of that StackOverflow question and have nothing to do with that paper. As far as I understand arXiv (I'm not familiar with it), the paper was submitted to arXiv before the StackOverflow answer that inspired my question, and published by arXiv after my StackOverflow question: https://arxiv.org/list/cs.DS/recent
Yes, and the paper makes no mention of it, hm. Sorry to be a grumpy gatekeeper, but this really belongs as a blog post, not an arxiv paper, unless it were to cite more sources at least.
It now serves as a reference for this algorithm, something that was missing before and not well served by a blog post. It does not claim to have invented the algorithm.
Back in the 80's and 90's, working on consoles in very tight space constraints, there were times we’d go with a horribly inefficient implementation of something like sort, because we only needed to sort 8 things and the savings in bytes of code was more important than the savings in time from a “better” algorithm.
Those abominations could be tweaked throughout the development of the game, and along the way we'd sometimes "rediscover" a more common (sane) algorithm. The refinement in this paper reminded me a lot of that.
I think I saw something similar many years ago in a side competition of the national Math Olympiad where you could use computers. [spoiler alert] IIRC one student used a weird sorting algorithm, and after looking at it for a long time and testing, we called it a very inefficient insertion sort, and IIRC it was in the "wrong" direction. So my guess is that the student used this algorithm or something very similar.
I turned in a similar algorithm for a college homework assignment, as a practical joke. (I'd been programming for 10 years already before I was able to go to college, so I was extremely bored in my first year Computer Science classes.)
I'm pretty sure I used the double-loop technique to drive it, but mine wasn't quite as simple because I layered the joke by also using multiple XOR, instead of swap, to propagate the values through the array.
Well, many people who tried to implement bubble sort for the first time “invented” this algo; it's just an unoptimized bubble sort rather than insertion sort. Is it worth an article?
If you read elsewhere in this thread you'll see that there really is enough discussion to suggest that yes, this is worth an article.
And it really, really isn't Bubble Sort. One of the defining features of Bubble Sort is that it only ever compares elements in adjacent locations. Others are claiming that if you optimise this then it becomes Bubble Sort, but someone else observes that if you "optimise" differently you end up with Insertion Sort.
That suggests that this is not Bubble Sort. It's something different that can be mutated into Bubble Sort, but it is, of itself, not Bubble Sort.
I recall some kind of sort that required constructing a linked list for storing index numbers of the original array.
Only the final linked list (of index numbers) was in sorted order, which you then summarily write out into the final array.
That would have consumed some n*(row-size) + n*(link-ptr-size + value-size) memory outside of its original array.
It probably did O(n^1.5) for the read operation, which is WAY better than O(n^2) for bubble sort.
Best-in-its-class algorithm for swap at O(2), worst case?
Delving into the swap breakdown, its write operation might be O(3n+1): writing a value, then updating a linked-list pointer for each entry and its link value into that final array; +1 for that NULL pointer at the end.
It was arguably the lowest O (way back then) but at an extremely high cost in memory.
I never did learn which sort class that variant falls into.
Now I remember, it was a college attempt to improve the read operation of an insertion-class sort algorithm.
One guy knocked down the time order considerably by doing a tiny pre-bubblesort in blocks before entering the link insertion, thereby making fewer passes over its million link-list chains: but only because he leveraged fitting within the CPU's L2/L3 cache. The professor said that the order got worse, but not by a huge logarithmic scale, despite the speed-up in completion time.
Dang, the O order just got complicated for multi-level caches.
QuantumBogoSort is a quantum sorting algorithm which can sort any list in O(1), using the "many worlds" interpretation of quantum mechanics.
It works as follows:
1. Quantumly randomise the list, such that there is no way of knowing what order the list is in until it is observed. This will divide the universe into O(n!) universes; however, the division has no cost, as it happens constantly anyway.
2. If the list is not sorted, destroy the universe. (This operation is left as an exercise to the reader.)
3. All remaining universes contain lists which are sorted.
Holy shit. I independently invented... well, not quite "invented", more like "stumbled into" this algorithm when I was in elementary school learning Applesoft BASIC, and wanted to sort an array of numbers. I had no clue how to do it and just intuited there had to be two loops, one inside the other. I got the comparison wrong, saw that it sorted backwards, and just changed > to < without asking myself why it worked.
I suggest you read elsewhere in this discussion, where many people have claimed it is a Bubble Sort, and most have then changed their minds.
Consider, a Bubble Sort (as documented in TAoCP, for one) only ever compares elements in adjacent locations, whereas this routine compares every location to every other location.
Then think twice. Selection sort gets its name from doing work to find, select, the overall smallest element. This algorithm just takes the element at position i and inserts it at the right place (so far) in the sorted subarray to the left of i.
It's an inefficient (and somewhat obscure) variant of insertion sort.
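For comparison, plain insertion sort looks like this (a Python sketch):

    def insertion_sort(a):
        """Standard insertion sort: grow a sorted prefix, inserting a[i] into it."""
        for i in range(1, len(a)):
            item, j = a[i], i - 1
            while j >= 0 and a[j] > item:
                a[j + 1] = a[j]      # shift larger prefix elements right
                j -= 1
            a[j + 1] = item
        return a

    print(insertion_sort([5, 2, 4, 1, 3]))  # [1, 2, 3, 4, 5]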
I had a similar experience in college. The assignment was to print out a tree structure with any given initialization values. The professor and everyone else in the class printed the tree top down, like tree diagrams in textbooks. This required a bunch of fiddly spacing considerations. I realized that printing it out left to right required far simpler spacing. I think the final code was less than 20 lines with the core logic being 7 lines. The professor was very surprised it worked.
"It is difficult to imagine that this algorithm was not discovered before, but we are unable to find any references to it." -- I've always maintained that for academics, if somebody hasn't written a paper for it, it doesn't exist. I can say I've seen this in actual code, in the wild, and I'm sure many others have.
It's more like: if you can't find it, it doesn't exist. Seeing something published is probably the most common way for academics to "find" something, but if this algorithm was present in a lot of different code bases, made its way into the Linux kernel, etc., that would have been good enough, too.
I've been using the same program for years (since 2015) for interviews, intentionally adding some issues and asking what the code does, how to fix the bug(s), what to name the function, what are some properties that are problematic, etc.
Kind of a fun interview, but it actually explores people's knowledge.
This is a great example of why state is so hard. The algorithm is so simple it could easily come out of a program designed to spit out all toy Algol programs, and yet what it does is very unclear.
Not entirely unrelated, learning how merge sort worked and why it was more efficient than the naive first guess of insertion sort was the light bulb moment in my understanding of computer science.
The algorithm from the paper is indeed an astonishingly simple sorting algorithm, and it is indeed surprising that it works:
void
cantbelievesort(int *p, size_t n)
{
int tmp;
for (size_t i = 0; i < n; i++) {
for (size_t j = 0; j < n; j++) {
if (p[i] < p[j]) tmp = p[i], p[i] = p[j], p[j] = tmp;
}
}
}
That's 10 lines of code according to David A. Wheeler's 'SLOCCount,' though arguably 9 would be fairer. Its cyclomatic complexity is 3, with a deepest statement nesting level of 4, and it has 5 variables: three locals plus two arguments. It compiles to these 17 amd64 instructions, occupying 44 bytes of code, although I could probably shave off a third of that if I wrote it by hand:
However, arguably, this sort routine is even simpler. It may or may not be surprising that it works:
void
dumbsort(int *p, size_t n)
{
int tmp;
for (size_t i = 1; i < n; i++) {
if (p[i] < p[i-1]) tmp = p[i], p[i] = p[i-1], p[i-1] = tmp, i = 0;
}
}
By the same metric, that's only 8 lines of code (also only 8 by my "fairer" metric), its cyclomatic complexity is only 2, its deepest statement nesting level is only 3, and it has only 4 variables (the same two arguments, but two locals instead of three). It compiles to only 15 amd64 instructions, occupying only 43 bytes:
On every complexity metric that was ready to hand, dumbsort is simpler than cantbelievesort. Except computational complexity: dumbsort is O(N³) while cantbelievesort is only O(N²).
Dumbsort is similar to gnome sort, but arguably somewhat simpler: whenever it finds an out-of-order pair, after reversing it, it throws up its hands and starts over (i = 0, xor %eax, %eax).
I don't think I invented it, but I can't remember who did. I'd be grateful for pointers.
Fun algorithm! But I think translating to C and counting lines of code ends up overestimating the complexity of looping from 1 to n. And using a temporary to swap is definitely overestimating the complexity, though that doesn't affect the comparison.
If we treat looping up to n as one line of code, then both algorithms have the same count. And I think it's pretty fair to do that. Fortran, BASIC, APL, Perl, and so many others have syntax that's basically "for i = x to y". Let alone mathy languages that can loop over an array with a single character. Honestly, C is kind of weird for not having syntax for that.
For cyclomatic complexity, I don't know, setting i to 0 seems like cheating.
You definitely win on variable use.
Separate from the analysis, I have to wonder if there's a clever way to work in a saturating i-=2 instead of i=0. Then you can have N² efficiency without really changing the complexity.
Any measurement of "code complexity" is relative to some particular computational basis; which one do we choose?
In APL sorting the array P in ascending order is just
P[⍋P]
which result you can optionally assign back to P by prepending "P←". So, by the APL measure, none of these algorithms is "as simple as" invoking the builtin sort operator. (Which is probably also faster.) APL doesn't actually have any 'syntax that's basically "for i = x to y".'
If we take Perl golf as our measure of "code complexity" you can't use a for-loop for dumbsort, because zeroing the ostensible loop counter doesn't reset the loop; there's a hidden loop counter you can't alter. You have to use an explicit while loop, which makes the cantbelievesort much simpler. Smalltalk and Python are the same way. But of course the shortest way to sort in all three languages is to invoke the built-in sort function: sort @p, p asSortedCollection, p.sort().
(You're right about FORTRAN and BASIC, though. You can futz with loop counters in those languages, just like in C.)
That's why I chose C and machine code as my metric. Smalltalk and Python maintain this hidden loop counter in the implementation, and also do garbage collection behind the scenes (which you might think was irrelevant, but the write barrier can significantly slow down heavily mutating algorithms like this one) and bounds checking. Numpy and APL additionally do implicit array broadcasting. All of these languages (except FORTRAN and BASIC) have built-in sort routines that are simpler to invoke than writing your own sort function. So I was looking for the language that would keep me as honest as possible.
But of course in a different machine language the answer might be different. For example, I haven't seen an instruction set with dedicated looping instructions that support nested loops, but make it inconvenient to clobber the loop counter in the middle of the loop, but it would be easy to design one, and there might be some advantage to doing so. (Code density, maybe, or unrolling iterations of the loop into a pipeline.) Then dumbsort would lose, because as in Python, it would have to be written as a regular while loop, which usually costs four instructions: an increment, a compare, a conditional jump, and an unconditional jump.
I think the i-=2 tweak you suggest converts dumbsort into gnomesort, which is just a sort of pessimized insertion sort. As your "saturating" comment suggests, gnomesort needs an additional operation to avoid running off the beginning of the array; some instruction sets do provide saturating arithmetic (it's often useful for DSP and graphics, even if it's not as often useful for indexing arrays), and many others provide a binary "max" operation (SSE's PMAXSW and friends, as well as, say, APL ⌈) which could convert an ordinary subtraction into a saturating subtraction.
I don't have a rigorous method here, but I'd say a builtin sort is cheating a complexity count while "for each" is fine.
> If we take Perl golf as our measure of "code complexity" you can't use a for-loop for dumbsort, because zeroing the ostensible loop counter doesn't reset the loop; there's a hidden loop counter you can't alter. You have to use an explicit while loop, which makes the cantbelievesort much simpler. Smalltalk and Python are the same way.
I think it can be reasonable to say that manipulating a loop counter is more complex than a strict loop over elements. Though you can get around that issue by using tail recursion for dumbsort instead of setting i to 0.
> But of course in a different machine language the answer might be different. For example, I haven't seen an instruction set with dedicated looping instructions that support nested loops, but make it inconvenient to clobber the loop counter in the middle of the loop, but it would be easy to design one, and there might be some advantage to doing so.
Maybe not nestable ones, but that's just a quirk of trying to have an especially compact instruction encoding. If the common case can be a single opcode like LOOP, or only a couple opcodes in a RISC architecture, I think it's reasonable to call it "one line of code".
> I think the i-=2 tweak you suggest converts dumbsort into gnomesort, which is just a sort of pessimized insertion sort. As your "saturating" comment suggests, gnomesort needs an additional operation to avoid running off the beginning of the array; some instruction sets do provide saturating arithmetic (it's often useful for DSP and graphics, even if it's not as often useful for indexing arrays), and many others provide a binary "max" operation (SSE's PMAXSW and friends, as well as, say, APL ⌈) which could convert an ordinary subtraction into a saturating subtraction.
Is there a way to write gnomesort that only has two simple conditionals? You're right that my suggestion brings it a lot closer to gnomesort, but I'm definitely not suggesting anything that checks if you're at zero.
I hadn't thought of writing a tail-recursive dumbsort, though I'm not sure this is really simpler (untested):
void
dumbsort(int *p, size_t n)
{
dumbsort2(p, n, 1);
}
int
dumbsort2(int *p, size_t n, size_t i)
{
int tmp;
return i == n ? 0
: p[i] < p[i-1] ? (tmp = p[i],
p[i] = p[i-1],
p[i-1] = tmp,
dumbsort2(p, n, 1))
: dumbsort2(p, n, i+1);
}
> If the common case can be a single opcode like LOOP
The thing about the 8086 LOOP instruction is that it does expose the loop counter conveniently; it's just CX (or ECX on i386, RCX on amd64). So restarting it is just a MOV (not an XOR, because it counts down to 0).
Forth offers an instructive example of a set of operations (usually implemented in software rather than hardware) that support nested loops. Forth DO pushes both the loop counter and the upper bound on the "return stack" as what the ANSI standard calls a "do-sys"; there are words I, J, and K which get the innermost, second-innermost, and third-innermost loop counters, but cannot be used to set them. You can even save the contents of the return stack and restore it, but ANSI doesn't give you a way to restart the loop counter (which is certainly bad practice).
But that's just a matter of portability; any concrete Forth necessarily has to have a reliable way to set the loop counter so that LOOP can increment it. For CPUs, that circuitry wouldn't have to be accessible as an instruction, but general-purpose CPUs have a different concern: they need to have some way to save and restore all their architectural registers for preemptive context switching, which would include that loop counter — even if it wasn't exposed as a GPR like the i386 ECX. You could imagine providing that ability only via something like i386 PUSHAD, which would be a pretty inconvenient and slow way to restart the loop counter. But I'm not familiar with any instruction sets that work that way.
> Is there a way to write gnomesort that only has two simple conditionals? You're right that my suggestion brings it a lot closer to gnomesort, but I'm definitely not suggesting anything that checks if you're at zero.
I think that if you have a signed max() instruction that doesn't count as a conditional, this produces gnomesort with only two simple conditionals (untested):
#include <stddef.h>
/* max() stands in for the signed max discussed above; as a portable C macro
   it needs a conditional, but hardware (PMAXSW and friends) can provide it without one. */
#define max(a, b) ((a) > (b) ? (a) : (b))
void
browniesort(int *p, size_t n)
{
    int tmp;
    /* i is signed so that i - 2 can go negative before max() clamps it to 0 */
    for (long i = 1; i < (long)n; i++) {
        if (p[i] < p[i-1])
            tmp = p[i], p[i] = p[i-1], p[i-1] = tmp, i = max(i - 2, 0);
    }
}
I was thinking that maybe the "standard" gnome sort avoids redundantly comparing the first two entries again after swapping them, instead immediately stepping forward, but in fact both Dick Grune's C at https://dickgrune.com/Programs/gnomesort.html and the pseudocode in https://en.wikipedia.org/wiki/Gnome_sort do the same redundant comparison of the two items they just swapped. (Hamid Sarbazi-Azad's original "stupid sort" does not do this, but Grune considers it "the same algorithm".)
I think not counting max() as a conditional may be reasonable because, as I said, a number of CPU instruction sets provide it as a fundamental operation. At the hardware level it's no more conditional than the multiplexers used to read an operand from a source register in the register file.
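For what it's worth, here's one way a signed max() can avoid a conditional branch even in plain C. This is just an illustrative sketch of my own (assuming two's-complement int, an arithmetic right shift of negative values, and no overflow in a - b), in the spirit of the PMAXSW-style instructions mentioned above:

#include <limits.h>

/* Branchless signed max: the selection is done with arithmetic and a mask
   rather than a jump, which is roughly what a hardware max instruction does. */
int max_branchless(int a, int b)
{
    int diff = a - b;
    int mask = diff >> (sizeof(int) * CHAR_BIT - 1);  /* -1 if a < b, else 0 */
    return a - (diff & mask);                         /* b if a < b, else a */
}

Whether that still "counts" as unconditional is exactly the kind of judgement call above; the comparison has just been pushed into the sign bit.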
I am not clear why the proof is so long. By induction, the first i elements are already smaller than A[i], so it's essentially an unoptimised version of insertion sort.
It would make for a simple hardware implementation, and it's more general than fixed-size sorting networks. If it all happens in parallel, you trade gate area for speed.
It is simply an inefficient bubble sort. A bubble sort makes an improvement by not comparing values that are already sorted. Eons ago this example was a common way to introduce making algorithms more efficient. Naive Implementation -> Better Implementation.
// A helper to swap two integers (needed by bubbleSort below)
void swap(int *a, int *b)
{
    int t = *a;
    *a = *b;
    *b = t;
}

// A function to implement bubble sort
void bubbleSort(int arr[], int n)
{
    int i, j;
    for (i = 0; i < n-1; i++)
        // Last i elements are already in place
        for (j = 0; j < n-i-1; j++)
            if (arr[j] > arr[j+1])
                swap(&arr[j], &arr[j+1]);
}
If you read elsewhere in this discussion you will see many people giving reasons why this is not simply an inefficient Bubble Sort. You may disagree with them, as others do, but the situation seems rather more complicated than just saying "Oh, it's just a Bubble Sort."
The only thing surprising about this is how bad many people's education is. If you don't learn a slow way to do something, how can you know if a fast way is fast? If you don't try to solve basic problems for yourself, how can you expect to solve hard problems when you get to them? Entirely self-taught people will likely have learned this. Children before the age of computers "invented" this algorithm, and many others, while sorting sticks on the beach.
I think you misread the code: the comparison is the opposite of what you might expect, so a[i] is always the largest element, and it is not inserted into a[1..i-1].
No, GP is correct. At the beginning of an outer loop iteration, a[1..i-1] is sorted with a[i-1] the maximum element. The inner loop inserts a[i] into a[1..i-1] so that at the end of the iteration, a[1..i] is sorted with a[i] the max.
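To make that invariant concrete, here is a small sketch of my own (0-based, with the comparison written as in the Python snippet further down, not code from the paper): it runs the double loop under discussion and asserts that after the i-th outer iteration, a[0..i] is sorted, so a[i] is the maximum of that prefix.

#include <assert.h>

/* The double loop under discussion, with an assertion checking the
   invariant described above after every outer iteration. */
void icb_sort_checked(int *a, int n)
{
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++)
            if (a[i] < a[j]) { int t = a[i]; a[i] = a[j]; a[j] = t; }
        for (int k = 1; k <= i; k++)
            assert(a[k-1] <= a[k]);  /* a[0..i] stays sorted */
    }
}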
> Let me re-interpret what the i-th iteration of the algorithm does:
> 1. for j = 1..i-1, we're trying to "insert" a[i] into a[1..i-1]. Try simulating it yourself. I believe you will find out it's insertion sort.
> 2. for j = i..n, we're replacing a[i] with the smallest element among a[i..n].
That is, they're saying in (2) that after an iteration of the outer loop, the ith value will be the smallest of all the values that succeed it in the sequence. This is demonstrably false. It is always the maximum value of the sequence itself, regardless of its initial position relative to i.
The following Python code prints the result of running the inner loop on an arbitrary sequence without altering the original list (so it won't actually be sorted at the end). The point is to demonstrate that the maximum value always ends up in the ith position:
def inner(a, i):
    for j in range(len(a)):
        if a[i] < a[j]:
            a[i], a[j] = a[j], a[i]
    print(a)

def outer(a):
    for i in range(len(a)):
        inner(a.copy(), i)

a = [5,4,3,2,1,6]
outer(a)
[6, 4, 3, 2, 1, 5]
[4, 6, 3, 2, 1, 5]
[3, 4, 6, 2, 1, 5]
[2, 4, 3, 6, 1, 5]
[1, 4, 3, 2, 6, 5]
[5, 4, 3, 2, 1, 6]
Note that 6 ends up in the ith position after finishing the inner loop. If we let it sort (so remove .copy()), after each iteration the maximum value is, again, always at the ith position:
[6, 4, 3, 2, 1, 5]
[4, 6, 3, 2, 1, 5]
[3, 4, 6, 2, 1, 5]
[2, 3, 4, 6, 1, 5]
[1, 2, 3, 4, 6, 5]
[1, 2, 3, 4, 5, 6]
GGP is correct that it's basically insertion sort otherwise, i.e. point (1). To illustrate that the behavior is the same as insertion sort even if you don't compare every pair:
Break the inner loop into three parts, j < i, j == i, and j > i, and you can see it's insertion sort with extra steps (a C sketch of just the j < i part follows the pseudocode).
for i = 1 to n do
    - Inserts A[i] into A[1 to i-1], using the A[i]
    - spot as a temp variable.
    for j = 1 to i-1 do
        if A[i] < A[j] then
            swap A[i] and A[j]

    - Does nothing.
    for j = i do
        if A[i] < A[j] then
            swap A[i] and A[j]

    - Permutes A[i to n], placing the maximum into
    - A[i]. Since A[i] already has the max after the
    - first iteration, it won't change, and we don't
    - care how A[i+1 to n] is permuted since it gets
    - sorted anyway.
    for j = i+1 to n do
        if A[i] < A[j] then
            swap A[i] and A[j]
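To make that j < i piece concrete, here is a minimal C sketch of my own (0-based, untested against the paper): keeping only the first part of the decomposition still sorts the array, because it is an insertion sort that uses the a[i] slot as the "hand" holding the value being inserted into the sorted prefix.

/* Only the "j < i" comparisons from the decomposition above: insert a[i]
   into the already-sorted prefix a[0..i-1] by swapping whenever a[i] < a[j]. */
void insert_only(int *a, int n)
{
    for (int i = 1; i < n; i++)
        for (int j = 0; j < i; j++)
            if (a[i] < a[j]) {
                int tmp = a[i];
                a[i] = a[j];
                a[j] = tmp;
            }
}

The j >= i comparisons are the extra steps the full algorithm performs on top of this.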
What you are seeing isn't speed but the iterations. If you look at the source, I am drawing at a certain location in the loop: sometimes when the swap happens, sometimes afterwards. More than speed, you are seeing the steps.
Each draw step is one draw call. For example, in some algorithms where there isn't a clear swap operation, I draw the whole array in one step, which may or may not reflect all the operations that happened before that draw call.
This looks interesting, but I can't get it to do anything. I see a purple gradient square, I see some widgets on the upper right. I see a box of code on the left. None of it seems to visualize a sort or even respond to input. What do I do?
It's a little confusing, but playing around with it I was able to understand it.
First of all, there's a Start button at the bottom of the right menu. Hit that and you will see that big color gradient sort itself.
However, that massive square is useless for seeing what's going on. So instead change the size to its lowest (8) and the delay to something like 400.
Now what you've got is 8 randomly ordered columns. When you hit Start, the algorithm will start sorting each of them in parallel. You can just look at a single column and see the colors getting swapped. The darkest color moves to the top very quickly. This is a feature of how this particular algorithm works.
What's the point of the big square? Well, for one thing it's an interesting way to compare algorithms. Go back to the settings for a large number, a very low delay, and check off two other sorting algorithms, say Comb Sort and Heap sort. Hit Start and you'll see three very different approaches to sorting the columns.
Viewing this as an "algorithm" may be why people find it to be surprising*. I don't find it terribly surprising, because I've played with a lot of sorting networks. Viewed as a sorting network, it's really clear how and why the algorithm works (and where it wastes half of its time).
* though, edflsafoiewq's analysis is nice and tidy
At first it's entirely un-intuitive and obviously slow, but I kind of like the elegance, and the use of "<" makes sense once you realize that the final sweep will be with an i at the end of the array so that j will be ahead of it.
Bubble sort only swaps adjacent pairs (thus "bubble": the values are visualized as "bubbling up" to their correct spot), and the inner loop is often shortened so that it runs over progressively smaller segments (because the smallest or largest value, depending on how you write it, has already been placed, so there's no need to check that section again). This algorithm will swap non-adjacent pairs, making it more like insertion sort.
Nearly all meaning the non-scientific computing ones, and particularly the ones with syntax inspired by C. Fortran, APL, Matlab, Wolfram, Mathematica, R, Julia, Cobol, Smalltalk and Lua are all 1-based, at least by default.
Huh. I thought it was obvious: e.g., the top-left corner of the screen is (0, 0), the bottom-right corner is (640, 480), and pixels are 1x1 — we take the coordinate of their top-left corner to be their coordinate, so the top-left pixel has coordinates (0, 0), and the bottom-right one has (639, 479), and obviously the coordinates of their centres are (0.5, 0.5) and (639.5, 479.5).
But apparently some people treated pixels as geometrical points without size?
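Just to spell out the convention I mean, a tiny sketch of my own in C (purely illustrative):

#include <math.h>

/* Pixel (px, py) covers the unit square [px, px+1) x [py, py+1). */
void pixel_centre(int px, int py, double *cx, double *cy)
{
    *cx = px + 0.5;   /* e.g. pixel (639, 479) has its centre at (639.5, 479.5) */
    *cy = py + 0.5;
}

/* The pixel whose square contains a continuous point (x, y). */
void containing_pixel(double x, double y, int *px, int *py)
{
    *px = (int)floor(x);
    *py = (int)floor(y);
}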
It's normal in signal processing to treat samples (whether audio or image) as points. A lot of things stop making sense if you replace them with line segments and rectangles.
Welcoming open, thoughtful discussion and having people much smarter and with more experience than myself available to answer questions in which I'm not an expert is one of the main reasons I frequent HN.
No, bubble sort uses a single index to compare items that are next to each other, and also has the condition reversed (> vs <), which is part of the surprise that this sort still has the correct increasing order.
If you compare only adjacent pairs, you eliminate an enormous number of redundant comparisons.
This is bubble sort without the redundancy elimination. It's just that the elimination is so common that many people don't know about it anymore, and think it's a defining part of the algorithm.
I'm not sure what you mean here, but it seems like you're saying that bubble sort has every pair compared with every other pair, and that's not the case. A defining feature of Bubble Sort is that it only ever compares adjacent elements.
The sort presented here is definitely not Bubble Sort.
I did. Really, I did. I just happen to disagree with you.
Now, it might be that you have more information than I have, and that would be interesting, but based on what you've said, I still think the algorithm presented here is genuinely not Bubble Sort.
You said:
> the elimination is so common that many people don't know about it anymore, and think it's a defining part of the algorithm.
Perhaps I should have emphasised the "is", and said:
>> A defining feature of Bubble Sort is that it only ever compares adjacent elements.
That might have served to emphasise that I had read your comment and was explicitly disagreeing with you. You claim otherwise ... perhaps I should have asked for an explicit reference from you to support your position, and provided for you the Wikipedia code to support mine.
In particular, on Wikipedia it says:
procedure bubbleSort(A : list of sortable items)
    n := length(A)
    repeat
        swapped := false
        for i := 1 to n-1 inclusive do
            /* if this pair is out of order */
            if A[i-1] > A[i] then
                /* swap them and remember something changed */
                swap(A[i-1], A[i])
                swapped := true
            end if
        end for
    until not swapped
end procedure
So taking this as the definition, it explicitly only ever compares adjacent elements of the array. And that's what I said. It might be that it has derived from a version that does more comparisons, and I'd be interested to see evidence for that, but this seems to be the definition of what we call Bubble Sort.
More, the algorithm in the linked paper appears at first glance to have the comparison the wrong way round. That's really different from the Bubble Sort that I know. It also explicitly performs n^2 comparisons, and Bubble Sort performs n-1 on each pass, and when it runs the full length every time, it performs (n-1)^2 comparisons at most. It really feels very different.
"Bubble sort is a sorting algorithm that works by repeatedly stepping through lists that need to be sorted, comparing each pair of adjacent items and swapping them if they are in the wrong order." -- https://www.techopedia.com/definition/3757/bubble-sort
I'm really baffled by your claim. I've never seen anything other than the definition or implementation saying that it's adjacent elements being compared. Everywhere I've ever seen it's been a defining feature of the algorithm definition ... yet you claim otherwise.
Can you provide any explicit reference where the Bubble Sort is not explicitly comparing adjacent elements?
This is boring. I'm sorry you're unable to read my first comment, and keep requesting references you've already received.
Notably, you're even providing scans from the TAoCP page I gave you, which have the relevant text carefully trimmed away, and are showing individual paragraphs whose text does not attempt to give a definition, even though a definition is present on those pages.
Who are you trying to convince, Frank? Nobody's reading this thread this late. Did you think I just didn't know what else was on the pages I gave you?
When an anti-vaxxer wants to make a point, they have to turn to hilariously inappropriate sources to pretend they're defended.
Techopedia, tutorials point, BBC news, a 2008 tutorial written by a student, and a random set of tutorials on a page full of broken links and misspellings. And very, very carefully edited Knuth.
Amusingly, one of them even explicitly spells out the same thing I did.
I see you're turning to the best in the business. And ignoring the coiner, again.
I read your comment ... you are calling me a liar, and I don't appreciate that.
I'm providing scans of exactly the part where Knuth says the Bubble Sort only compares adjacent locations, just as I said.
You said:
> Bubble sort is the comparison of every pair.
This is contradicted by the scans of TAoCP that I've provided.
You said:
> If you compare only adjacent pairs, you eliminate an enormous number of redundant comparisons.
That's true, but you are eliminating redundant comparisons from something that is not a Bubble Sort.
> This is bubble sort without the redundancy elimination.
This is contradicted by the scanned excerpts from TAoCP.
> It's just that the elimination is so common that many people don't know about it anymore, and think it's a defining part of the algorithm.
Comparisons between locations that are not adjacent were never part of Bubble Sort. You continue to claim otherwise; I have checked your references and, as far as I can see, they don't support your assertion. Please provide scans to support your claim.
Bubble sort scans the full array repeatedly until finished, but only along the pairwise edge, and usually finishes way early
So in an array of 20 elements you have a maximum of 380 comparisons
This searches Euler's Little Triangle the whole way every time, so in an array of 20 elements, you have a maximum of 380 comparisons again
Seems similar, right?
Except doing pairwise you're likely to exit without doing all 19 scans - at the logarithm if it's well distributed - so it probably does either 4 or 5 scans, saving ~80% of the work
The issue isn't the ceiling; it's the common floor
Doing it the seemingly-normal way radically changes the common floor
So not only does this do different comparisons, it "radically" changes the amount of "common floor" work.
And by the way, there's something wrong with your analysis of the number of scans needed for the 20 element bubble sort. I don't understand your argument, but I did some experiments with a randomly distributed array. I never saw it sort in fewer than 12 scans, after a few dozen trials.
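If anyone wants to check the scan counts themselves, here is a rough sketch of that kind of experiment (my own code, not the parent's): it shuffles 20 distinct values, runs the early-exit bubble sort from the Wikipedia pseudocode above, and prints how many passes each trial took.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Early-exit bubble sort that counts full passes over the array. */
static int bubble_sort_passes(int *a, int n)
{
    int passes = 0, swapped = 1;
    while (swapped) {
        swapped = 0;
        for (int i = 1; i < n; i++)
            if (a[i-1] > a[i]) {
                int t = a[i-1]; a[i-1] = a[i]; a[i] = t;
                swapped = 1;
            }
        passes++;
    }
    return passes;
}

int main(void)
{
    enum { N = 20, TRIALS = 50 };
    int a[N];
    srand((unsigned)time(NULL));
    for (int t = 0; t < TRIALS; t++) {
        for (int i = 0; i < N; i++) a[i] = i;   /* distinct values */
        for (int i = N - 1; i > 0; i--) {       /* Fisher-Yates shuffle */
            int j = rand() % (i + 1);
            int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
        }
        printf("trial %2d: %d passes\n", t + 1, bubble_sort_passes(a, N));
    }
    return 0;
}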
Looks just like a slightly smarter insertion sorter, which is a slightly smarter exchange sorter, which is a slightly smarter bubble sorter. I don't see what's surprising about the approach.
If you read elsewhere in the discussion you will see many people explaining why this is not a Bubble Sort. One of the defining characteristics of Bubble Sort, as defined in TAoCP and many, many other places, is that it's only ever the elements in adjacent locations that are compared.
Yes, but the way it traverses the array is the same as Bubble Sort.
With definitions that specific, I could take any other sorting algorithm, change the way it compares elements, and call the result a "new" sorting algorithm.
The second loop runs from 1 to n, whereas in Bubble Sort it runs from 1 to n-1.
The comparison here is between A[i] and A[j], whereas in Bubble Sort it's always only between A[i] and A[i+1].
This one always performs n^2 comparisons, whereas Bubble Sort never performs n^2 comparisons.
This one compares elements to themselves, whereas Bubble Sort never compares an element with itself.
If you sort the array [1,2,3], the first thing this routine does is swap the 1 and 2, giving [2,1,3], whereas Bubble Sort runs along the array, makes no swaps, and exits.
This is not Bubble Sort.
It's true that if you take an existing sorting algorithm and change things around you can get other sorting algorithms. The question is then one of just how different it is.
If you choose to believe this is "just Bubble Sort" then that's up to you, but I disagree.
I've started using "the royal we" in my company's internal wiki so that hopefully readers won't think too hard about who the author was. In my case, I'm worried about new employees thinking, "Oh this is Jelly's page, I can't edit it."
Exactly, it’s less of a royal we and more of an inclusive / humbling we: I might be the author but I’m writing on behalf of and under the aegis of the group / organisation, so it’s our content not mine alone.
It's intentional and normal. Academic papers are traditionally written in the plural (regardless of the number of co-authors), as it feels more inclusive towards the reader.
This is common in some fields. My understanding of it is that 'we' means 'the author(s) and the reader'. You'll sometimes see this voice changing when the author(s) give their own opinion. e.g.
> From this we can see that XYZ. It is the author's opinion that ABC.
They likely had at least one research assistant, proofreaders, etc. Maybe the author is attempting to distribute credit and lead the reader to infer that more than one person was involved.
No. Many scientific papers are written in third person. "We" actually means "me or us the author(s) and you the reader who is following along and checking these details"
Or, yes. I worked as a research assistant gathering data for a professor at University who wrote 100s of papers in the area of graphics and computer vision.