Hacker News new | past | comments | ask | show | jobs | submit login

> Maybe it’s because CS curricula tends to teach searching/sorting algorithms before data structures, specifically BSTs.

I think it's more because sorting is all about "efficiency", and sorting things without allocating a ton of intermediate data structures is viewed as more efficient. The blog does mention the deforestation optimization, which (if implemented sufficiently heroically) would remove the intermediate tree in the QuickSort example. But now we're straying pretty far from a simple sorting algorithm in a first high-level algorithms course to a more complex and less abstract beast that is only maximally efficient if you assume a heroic compiler.

As a side note, heap sort (https://en.wikipedia.org/wiki/Heapsort) does perform sorting by constructing an intermediate heap data structure. It's just that the data structure is implicit as it is built in-place in the array to be sorted.

> If you think about it a BST defines order and a sorting algorithm achieves it so naturally the output of a sort can be seen as a BST (flattening it to a list is somewhat irrelevant).

Or, similarly, the construction of a BST is a sort: https://en.wikipedia.org/wiki/Tree_sort




> I think it's more because sorting is all about "efficiency", and sorting things without allocating a ton of intermediate data structures is viewed as more efficient.

I agree that probably explains a lot of it.

However the cost to efficiency is not that large and may be worth the gains in conceptual clarity. First, the traditional implementation of QuickSort has an implicit (or virtual?) intermediate data structure, the function call tree. Second, I have limited understanding of lazy evaluation, but in a lazy language I believe ‘flatten’ would begin consuming the tree as it is still being built by ‘build’.


> First, the traditional implementation of QuickSort has an implicit (or virtual?) intermediate data structure, the function call tree.

That's a good point, but function calls have a constant cost in terms of stack frame allocation and deallocation.

> Second, I have limited understanding of lazy evaluation, but in a lazy language I believe ‘flatten’ would begin consuming the tree as it is still being built by ‘build’.

Yes, that is correct. The entire tree is not allocated at once, only the nodes that are actually needed, as well as dynamic data representing suspended computations of the rest of the data structure. But (again, ignoring compiler optimizations that would make them go away) these are allocations on the garbage-collected heap and thus incur GC costs and complexity.

Overall I agree that there would be value in illustrating algorithms by also showing the version that constructs the computation tree. But I don't think it should be the one presentation of the algorithm, only a device to further understanding of the "real" implementation.


Yes, definitely. For example, the explanation of MergeSort in CLRS’s “Introduction to Algorithms” has a figure which shows tree a of lists being transformed into another shorter tree of lists where the lower nodes have been merged.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: