One interesting note on this approach is that the author suggests using a "fast" sorting algorithm like mergesort or quicksort for best performance. But in practice, you may get better performance from a "worse" sorting algorithm: insertion sort.
The reason is that objects in collision detection systems tend to move in relatively small steps between frames, meaning you can keep the lists from the previous frame, which are already mostly sorted. For sorting these mostly sorted lists, insertion sort tends towards O(n) while Quicksort tends towards O(n^2)!
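Roughly, a minimal sketch of that per-frame re-sort (the `Endpoint` layout and names here are just illustrative, not from the article):

```rust
// Minimal sketch: re-sorting sweep-and-prune endpoints with insertion sort.
// `Endpoint` and its fields are illustrative names, not from the article.
#[derive(Clone, Copy, Debug)]
struct Endpoint {
    value: f32, // position of an interval boundary on the sweep axis
    id: u32,    // which object this boundary belongs to
}

/// Plain insertion sort. On a list that is already mostly sorted
/// (small per-frame movement), the inner loop rarely runs, so the
/// whole pass is close to O(n).
fn insertion_sort(endpoints: &mut [Endpoint]) {
    for i in 1..endpoints.len() {
        let mut j = i;
        while j > 0 && endpoints[j - 1].value > endpoints[j].value {
            endpoints.swap(j - 1, j);
            j -= 1;
        }
    }
}

fn main() {
    // Previous frame's order, perturbed slightly by this frame's movement.
    let mut endpoints = vec![
        Endpoint { value: 0.1, id: 0 },
        Endpoint { value: 0.9, id: 0 },
        Endpoint { value: 1.05, id: 1 }, // drifted just past its neighbour
        Endpoint { value: 1.0, id: 2 },
        Endpoint { value: 2.0, id: 1 },
        Endpoint { value: 2.5, id: 2 },
    ];
    insertion_sort(&mut endpoints);
    println!("{:?}", endpoints);
}
```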
If you use a modern sort, it'll just go "Oh, that's sorted, we're done". Pattern-defeating quicksort will do that, and so will Glidesort, Driftsort, IPNsort and so on. Most of these algorithms will also correctly shortcut trivial "almost sorted" cases like 123457689 (only longer; if you really have only nine elements you want a custom sort for that, and there's a correct algorithm for it probably written down somewhere already).
Rust's old sorts, both of them, get this right. There's a fair chance the current version of your C++ stdlib even gets this right in its unstable sort. In 1965 it's understandable that you reach for an insertion sort for this case; in 2024 it's embarrassing if your language doesn't provide a built-in sort where this just works.
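A quick, unscientific way to see this in Rust (timings obviously depend on your machine and data; the shape of the input below is just an approximation of per-frame drift):

```rust
use std::time::Instant;

fn main() {
    // Build an almost-sorted list: sorted, then a handful of small swaps,
    // roughly like positions drifting a little between frames.
    let n: u32 = 1_000_000;
    let mut v: Vec<u32> = (0..n).collect();
    for i in (0..v.len() - 1).step_by(1000) {
        v.swap(i, i + 1);
    }

    // Rust's built-in unstable sort (ipnsort on recent toolchains, pdqsort
    // before that) is built to detect and exploit pre-existing order, so
    // there's no need to hand-roll an insertion sort for this case.
    let t = Instant::now();
    v.sort_unstable();
    println!("sorted {} almost-sorted elements in {:?}", v.len(), t.elapsed());
}
```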
You don't just want to sort the data; you actually want to collect a list of overlapping pairs (or, more precisely, the list of pairs that have changed since last time).
Using a built-in sort will give you the same complexity, but iterating over 10,000 elements again to gather this list is overhead that you can avoid with a DIY approach.
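For illustration, a sketch of that DIY idea on a single sweep axis: run the insertion sort yourself and harvest pair changes from the swaps instead of rescanning afterwards. All the names here are made up, and a real sweep-and-prune would combine the axes:

```rust
use std::collections::HashSet;

#[derive(Clone, Copy)]
struct Endpoint {
    value: f32,
    id: u32,
    is_min: bool, // true = interval start, false = interval end
}

/// Insertion-sort the endpoint list. Whenever an interval start crosses an
/// interval end for two different objects, that pair's overlap status on this
/// axis flips; toggling it in `overlaps` keeps the set current and hands you
/// the changed pairs as a side effect of the sort.
fn update_axis(endpoints: &mut [Endpoint], overlaps: &mut HashSet<(u32, u32)>) {
    for i in 1..endpoints.len() {
        let mut j = i;
        while j > 0 && endpoints[j - 1].value > endpoints[j].value {
            let (a, b) = (endpoints[j - 1], endpoints[j]);
            // A start crossing an end (either direction) flips overlap on
            // this axis for that object pair.
            if a.id != b.id && a.is_min != b.is_min {
                let key = (a.id.min(b.id), a.id.max(b.id));
                if !overlaps.remove(&key) {
                    overlaps.insert(key);
                }
            }
            endpoints.swap(j - 1, j);
            j -= 1;
        }
    }
}
```

A real implementation would keep one such set per axis and only report a pair when it overlaps on all of them, but the point stands: the pair bookkeeping rides along with the sort for free.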
There is an alternative to sorting every step: build your indexing structure a little looser, so that you still catch candidate collisions when objects have moved by less than epsilon.
For example, that can be done by increasing the radius of the spheres by epsilon.
As long as the spheres have not moved by more than epsilon, you don't need to recompute the index.
To avoid latency peaks when you do need to recompute the index, you can start building a lagging index by sorting 10% of it each frame (i.e. amortizing the computation cost). After 10 frames you have an index that is valid as long as each position is within epsilon of its position 10 frames ago.
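A sketch of what that bookkeeping might look like (the types and the epsilon test are illustrative only; the amortized 10%-per-frame rebuild described above would slot in wherever the index gets rebuilt):

```rust
#[derive(Clone, Copy)]
struct Sphere {
    center: [f32; 3],
    radius: f32,
}

struct BroadPhase {
    epsilon: f32,
    indexed_at: Vec<[f32; 3]>, // each object's center when the index was built
    // ... the actual index (sorted endpoints, grid, BVH, ...) would live here
}

impl BroadPhase {
    /// True if `obj` has drifted far enough from its indexed position that
    /// the epsilon-inflated bounds no longer cover it.
    fn needs_rebuild(&self, obj: usize, current: [f32; 3]) -> bool {
        let old = self.indexed_at[obj];
        let d2 = (current[0] - old[0]).powi(2)
            + (current[1] - old[1]).powi(2)
            + (current[2] - old[2]).powi(2);
        d2 > self.epsilon * self.epsilon
    }

    /// When building the index, test against radius + epsilon so the results
    /// stay conservative (a superset of true candidates) while objects drift.
    fn inflated(&self, s: Sphere) -> Sphere {
        Sphere { center: s.center, radius: s.radius + self.epsilon }
    }
}
```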
> For sorting these mostly sorted lists, insertion sort tends towards O(n) while Quicksort tends towards O(n^2)!
Only if you pick your pivot really, really badly. You can randomise your pivot to get expected O(n log n), or, in the case of already almost-sorted lists, you can simply pick the element in the middle of the list as the pivot.
But you are right that even with optimal pivots, QuickSort is still O(n log n) even in the best case. There are simple variants of merge sort that give you O(n log k) behaviour, where k is the number of runs (both ascending and descending) in the data. The 'sort' you get by default in e.g. Haskell's standard library uses such an algorithm. I think Python's does, too.
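For the curious, here's a minimal natural merge sort along those lines. This is my own sketch, not how GHC or CPython actually implement their sorts:

```rust
// Natural merge sort sketch: split the input into its existing ascending and
// descending runs (reversing the descending ones), then merge runs pairwise.
// With k runs this does O(log k) merge passes over n elements, i.e. O(n log k);
// already-sorted input is a single run and costs O(n).
fn natural_merge_sort<T: Ord + Copy>(data: &[T]) -> Vec<T> {
    if data.is_empty() {
        return Vec::new();
    }

    // 1. Collect maximal runs.
    let mut runs: Vec<Vec<T>> = Vec::new();
    let mut i = 0;
    while i < data.len() {
        let start = i;
        i += 1;
        if i < data.len() && data[i] < data[i - 1] {
            // Strictly descending run: extend it, then reverse it.
            while i < data.len() && data[i] < data[i - 1] {
                i += 1;
            }
            let mut run = data[start..i].to_vec();
            run.reverse();
            runs.push(run);
        } else {
            // Non-decreasing run.
            while i < data.len() && data[i] >= data[i - 1] {
                i += 1;
            }
            runs.push(data[start..i].to_vec());
        }
    }

    // 2. Merge runs pairwise until only one remains.
    while runs.len() > 1 {
        let mut next = Vec::with_capacity((runs.len() + 1) / 2);
        for pair in runs.chunks(2) {
            next.push(match pair {
                [a, b] => merge(a, b),
                [a] => a.clone(),
                _ => unreachable!(),
            });
        }
        runs = next;
    }
    runs.pop().unwrap()
}

/// Standard two-way merge of two sorted slices.
fn merge<T: Ord + Copy>(a: &[T], b: &[T]) -> Vec<T> {
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::with_capacity(a.len() + b.len());
    while i < a.len() && j < b.len() {
        if a[i] <= b[j] {
            out.push(a[i]);
            i += 1;
        } else {
            out.push(b[j]);
            j += 1;
        }
    }
    out.extend_from_slice(&a[i..]);
    out.extend_from_slice(&b[j..]);
    out
}

fn main() {
    // Two ascending runs and one descending run: three runs, two merge passes.
    let data = [1, 3, 5, 2, 4, 6, 9, 8, 7];
    assert_eq!(natural_merge_sort(&data), vec![1, 2, 3, 4, 5, 6, 7, 8, 9]);
}
```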