I'm curious as to why it isn't implemented in hardware. Is it really so rare to need to sort floats, or so common to need a different ordering when you do?
Of course sorting floats happens a lot. In practice one rarely encounters NaNs or ±Inf, so fast comparison of ordinary values is the sensible default. I don't know why the 'slow' total order isn't implemented in hardware, though.
But fortunately, in comparison sorts that run in O(n lg n), you can get away with an O(n) partitioning of the array into [-, +, NaN] and then apply a fast integer comparison to the negative (-) and positive (+) partitions separately.
In fact the above idea ties in neatly with QuickSort, which is already based on partitioning & sorting recursively.
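The integer-comparison half of this trick can be sketched as follows. The key-mapping function below (name is my own, but the bit-flipping technique itself is standard) maps each f64's bit pattern to a u64 such that unsigned integer comparison of keys agrees with the expected order on the floats: negatives (whose raw bit patterns sort in reverse) below non-negatives, with -0.0 < +0.0. As a bonus, NaNs end up at the two extremes, so with this particular key you could even skip the explicit partitioning step:

```rust
/// Map an f64 to a u64 key so that comparing keys as unsigned integers
/// agrees with a total order on the floats.
/// (Hypothetical helper; this is the standard sign-bit-flipping trick.)
fn total_order_key(x: f64) -> u64 {
    let bits = x.to_bits();
    if bits >> 63 == 0 {
        bits | (1u64 << 63) // non-negative: set sign bit to sort above negatives
    } else {
        !bits // negative: flip all bits to reverse their order
    }
}

fn main() {
    let mut v = [3.5_f64, -1.0, 0.0, -0.0, 2.0, -7.25];
    v.sort_by_key(|&x| total_order_key(x));
    println!("{:?}", v); // [-7.25, -1.0, -0.0, 0.0, 2.0, 3.5]
}
```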
Yes, the literal < in the language induces a partial order by convention. What I'm getting at in my comment is that you can define a sensible total ordering.
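For what it's worth, Rust already ships such a total ordering as a library method: `f64::total_cmp` implements the IEEE 754 totalOrder predicate, alongside the partial-order `<` operator. A small sketch:

```rust
fn main() {
    let mut v = [f64::NAN, 1.0, -0.0, 0.0, f64::NEG_INFINITY];
    // total_cmp implements the IEEE 754 totalOrder predicate:
    // -Inf < -0.0 < +0.0 < 1.0 < NaN (f64::NAN has a clear sign bit, so it sorts last)
    v.sort_by(|a, b| a.total_cmp(b));
    println!("{:?}", v); // [-inf, -0.0, 0.0, 1.0, NaN]
}
```

Note that this lives beside the fast `<`, rather than replacing it, which matches the "use a different type or operation when you need different behavior" philosophy.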
The challenge becomes what code to emit when you see those operators: native target comparisons, or the software implementation of your total ordering? The latter is safe but slow; the former is fast and, IMO, idiomatic.
So, since no one needs this often enough to justify emitting the soft-float comparison code, we should emit the fast code. If folks need different behavior, they should use different types. This is similar to the behavior with integer overflow, where you can opt into checking by using checked types or checked operations. Though in Rust we have the convenience that overflow-detecting code is emitted for debug builds.
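To illustrate the overflow analogy with Rust's actual opt-in operations: default arithmetic panics on overflow in debug builds and wraps in release, while the `checked_*` and `wrapping_*` methods make the choice explicit at the call site.

```rust
fn main() {
    let x: u8 = 250;
    // Opt-in overflow detection: checked_add returns None instead of overflowing.
    assert_eq!(x.checked_add(5), Some(255));
    assert_eq!(x.checked_add(10), None); // 260 doesn't fit in a u8
    // Opt-in modular arithmetic: wrapping_add always wraps, in every build mode.
    assert_eq!(x.wrapping_add(10), 4); // 260 mod 256
    println!("ok");
}
```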
It is possible to redefine NaN as something different from what IEEE 754 specifies, but it will be surprising to some users, and it will come at a performance cost, because you can no longer let the hardware handle all float comparisons directly.