Comparisons in C++20 (brevzin.github.io)
183 points by ingve on July 28, 2019 | 130 comments



One of the most frustrating bugs I encountered in a previous job was an incorrect implementation of a “strict weak ordering” that would cause containers and algorithms to just plain misbehave. The evil part was that there was a whole range of misbehavior, including “generally working” for most data, while crashing in other cases!

The reality is that most programmers are not experts, but they work on C++ projects anyway. People will try “simple” things like hacking up a less-than operator, and when it seems to “work”, they move on (leaving a code time bomb behind).

The C++ compiler needs to provide robust compile-time and run-time checks to tell programmers that things are wrong, like: “your operator does not meet the requirements of the sort algorithm”.

Contrast the latest 20-page description of std::whatever_the_heck to, say, Python sorting, which offers "key=..." because the overwhelmingly common case is to simply state a single field on which to base object ordering!!!


Python is almost as bad as C++ when it comes to making custom classes sortable. In Python 2 there was __cmp__, which was roughly the same as C++'s new spaceship operator, but Python 3 inexplicably removed it in favor of only having individual comparison operators like __eq__, __lt__, __gt__, etc. Oh, and functools.total_ordering, which lets you "only" define __lt__ and __eq__ and get the rest automatically implemented, but that's still two things to define when __cmp__ was only one.


IIRC the reason is that people would regularly screw up __cmp__ (despite the cmp built-in being available); it’s harder to fuck up simple boolean results than the any-integer-coerced-to-3-state of cmp.

Python 3 also removed the “everything is comparable” misfeature of P2, providing way lower incentive to implement ordering: if your type is not orderable, a user knows to go with key functions.


I don't trust someone who can't implement `__cmp__` to implement `__lt__` either.


Experience says you're way wrong, simply because combining a sequence of sub-lt calls is way simpler and shorter than combining a sequence of cmp:

    def __lt__(self, other):
        return self.a < other.a and self.b < other.b and self.c < other.c
meanwhile

    def __cmp__(self, other):
        v = cmp(self.a, other.a)
        if v != 0:
            return v
        v = cmp(self.b, other.b)
        if v != 0:
            return v
        v = cmp(self.c, other.c)
        if v != 0:
            return v
        return 0
Rust (for instance) makes the latter less offensive (and error-prone) by providing built-in combinators: https://doc.rust-lang.org/std/cmp/enum.Ordering.html

    impl Ord for Thing {
        fn cmp(&self, other: &Self) -> Ordering {
        self.a.cmp(&other.a)
                .then(self.b.cmp(&other.b))
                .then(self.c.cmp(&other.c))
        }
    }


But that first comparison function is usually not what you want; you want lexicographic ordering. This very example is in fact discussed in the article, as an example of how people often mess this up...


> But that first comparison function is usually not what you want

All those functions implement the same comparison, the first is simply not including the similarly trivial implementation for eq:

    def __eq__(self, other):
        return self.a == other.a and self.b == other.b and self.c == other.c
> you want lexicographic ordering

Every word here makes sense but the objection makes none. What are you trying to say exactly?


The issue is that __cmp__ wasn't "only one". For one thing, you generally want equality independent of ordering, so defining an explicit __eq__ that just handles equality is a decent idea. Otherwise either you define both cmp and eq, or you have your cmp including stuff like isinstance checks, in some cases, and having to decide if your custom class is more, or less, than a string (as masklinn notes below). Note: there's literally only bad answers here. Does cmp raise an exception? Yikes. Is MyCustomObject less than "hello world", or more, or does it depend on the internal state of MyCustomObject? Yikes, why do I have to choose? MyCustomObject isn't like "hello world".

So you want eq, separate from cmp. Then you still need a way to define ordering, so either write them all for weird cases, or just pick one and have total_ordering do the legwork.


So why not just define `__eq__` and `__cmp__`? It just seems to make so much more sense... and only define `__eq__` if you need it to differ from `__cmp__ == 0`.

Removing `__cmp__` was a step backwards IMHO.


Because requiring you to define eq is more correct. Equality and ordering are different concepts. If they wanted to keep cmp, the only correct choice would be to require eq if you define cmp.

This again goes back to what masklinn said: a total ordering across all objects in python is a misfeature. The idea that `object() < 13 < "hello world" < 3+2j < [(frozenset(), frozenset())] < {type: int}` should evaluate to anything but a TypeError is horrifying!

And once you make that decision, if an object implements cmp, it must implement eq (because the safe thing should be the default, and allowing people to implement cmp in isolation lets them shoot themselves in the foot). And at that point, cmp is more work to write than le or gt, so instead just implement eq, and then if you want it, le or gt + total_ordering, or if you need weird things implement each method individually.

In practice, I've also found it much easier to reason about eq or le than cmp. People like to be tricky with cmp, and even when they don't, returning -1 or 1 is less explicit than just returning True or False.


> The idea that `object() < 13 < "hello world" < 3+2j < [(frozenset(), frozenset())] < {type: int}` should evaluate to anything but a TypeError is horrifying!

At least it evaluates to False.


Which means, that if you flip some of those operators, it will be True... Won't make sense though.


So Python has reverted to the pre-C++20 state but without the ability to redefine other operators? Can you even make a proper custom float type in Python then?


You can define all the operators explicitly if you wish. However, the common case is made easier: define only __eq__ and one of the others (__gt__ or __le__ usually) and @functools.total_ordering will derive all of the others for you.


Okay, it's exactly the pre-C++20 state. That's actually funny.


I don't think that's true, python had no concept of strong vs. weak equality and ordering. cmp returned either -1, 0, or 1 (or at least, it was treated that way). There was no way to describe "unequal, but not orderable".

That's the flaw which cpp avoids, although it remains to be seen if implementing complex spaceships is worth it.

Or I guess, python is now in the pre-cpp20 state, but was never in the post-cpp20 state, so it didn't revert. Python's cmp was broken.


What's not true?

I didn't say that Python was in post-cpp20. Maybe you wanted to comment on my previous comment.


To rewind a bit, you stated "So Python has reverted to the pre-C++20 state..."

Which is the statement I originally took issue with.


> C++ compiler needs to provide robust compile-time and run-time checks to tell programmers that things are wrong

Hard to do without breaking backward compatibility, which is among the main reasons why we still write C++.

It’s done in C#. CS0216 forces you to implement operators in sets, e.g. you can’t implement `==` without also implementing `!=`; it won’t build otherwise: https://docs.microsoft.com/en-us/dotnet/csharp/misc/cs0216

And these warnings are printed for compatibility with containers: https://docs.microsoft.com/en-us/dotnet/csharp/misc/cs0660 https://docs.microsoft.com/en-us/dotnet/csharp/misc/cs0661


> CS0216 forces you to implement operators in sets, e.g. you can’t implement `==` without also implementing `!=`; it won’t build otherwise

So… that you can provide incoherent implementations of == and != instead of the system just generating one based on the other?


They don’t have to return bool.

Even when they return bool, in some cases both greater and less must return false. Example: https://en.wikipedia.org/wiki/NaN#Comparison_with_NaN


I'm not saying you shouldn't be able to implement both such that they're not coherent, I'm saying requiring implementing both when the system could provide one based on the other (as e.g. Haskell, Rust or Python do) is an unforced error and a pointless footgun.


> unforced error and a pointless footgun

I think it’s the other way. Providing one based on the other is a pointless footgun.

The experience doesn’t end when you’ve made machine code from your source. Developers debug stuff. Developers support their products, sometimes analyzing crash dumps. The more magic is used in the compiler, the harder it is to debug and support software.

Sometimes that’s justified, e.g. in C# there’s a substantial amount of compiler magic for generators and async functions. But these 2 features save a substantial amount of code complexity.

Implementing != from == and > from < only provides minimal profit. The manual implementation is a single line method.

Another reason: C# is a different language than Rust or Python. It has awesome support for OOP and runtime reflection. What should happen if you use reflection to get op_Inequality for a class where only == is defined? What should happen if you define == in a base class and != in a derived one?


>> C++ compiler needs to provide robust compile-time and run-time checks to tell programmers that things are wrong

> Hard to do without breaking backward compatibility, which is among the main reasons why we still write C++.

Although the language specification may need to preserve backwards compatibility, compilers can offer strictness warnings that go beyond that (e.g. -Wstrict). And the standards have deprecated and removed things as well. It’s not a straightjacket.


For the last 20 years, the C++ standard library has used `operator <` for std::sort, and for building red-black trees like std::map and std::set; `operator >` is not used.

Making this a warning would probably break too much existing code.


Well, you can use whatever extremities of strictness and year of standard for new code (mine is currently c++17 + pretty much every possible warning). And you can keep lower levels of warning for older, or third party code.

In the case of the standard library, the whole library is being hoisted to use <=> in c++ 20. Old code will still work with the latest library of course.


> And you can keep lower levels of warning for older, or third party code.

Maybe, but I think it’s hard to do in practice. Many C++ libraries are header-only because of templates, i.e. old and new stuff ends up mixed together.

Differentiating warnings is possible with #pragma but they aren’t portable across compilers.


What we do is make a small shim to a more modern interface and then compile the shim, with the includes, in c++11 mode with whatever warnings disabled as necessary.

I know this doesn't work for everyone and we can't even do it with everything. But it helps a lot.


> The C++ compiler needs to provide robust compile-time and run-time checks to tell programmers that things are wrong, like: “your operator does not meet the requirements of the sort algorithm”.

MSVC does that in debug mode - it will assert if you use an invalid comparator for e.g. map / set. No reason it couldn't be further improved, or a clang-tidy check written for that.
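For illustration, here is a minimal sketch (not from the comment above) of the classic mistake such debug checks are meant to catch: using `<=` instead of `<`, which violates the strict weak ordering requirement. MSVC's debug runtime can assert on comparators like this; other toolchains may or may not.

    #include <algorithm>
    #include <vector>

    struct BadLess {
        // Bug: <= is not irreflexive, so this is not a strict weak ordering.
        bool operator()(int a, int b) const { return a <= b; }
    };

    int main() {
        std::vector<int> v{3, 1, 2, 2, 5};
        std::sort(v.begin(), v.end(), BadLess{}); // may assert in a checked/debug STL
    }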


MSVC also does iterator validation and bounds checking in debug mode, alongside its macro-based contracts.

Some things are actually more productive in MS land for security conscious coding. A nice side effect from all those Windows 9X exploits.


It has saved my behind a handful of times. Also shoutout to unordered_map for the debug build runtime check that equal keys have equal hashes.


(just by pure curiosity, are you Vesa Norilo ? if so, great work on Kronos !)


That's me! Thank you, much appreciated.


I do not think it is trivial to prove at compile time that a comparison operator is correct: I think you need a fancy type system for that. Asserting correctness at runtime is of course possible, but it has non-trivial cost. Most std libraries provide a debug mode that will catch these sorts of issues.


To enforce correctness, you'd need to demand proofs. I think the most common approach for this is dependent types. I think you'll have to wait another 20 years before they become mainstream.


I'm working on a game. I have a templated 2D vector implementation. Tile positions use Vector2<int>, entity positions and such use Vector2<float>.

I wanted a map from tile position to something else. I found that I had to implement operator<, because std::map uses operator<. That's an issue, because:

1. I didn't want to implement every single possible comparison operator just to put vectors in a map, but also didn't want to arbitrarily make `foo < bar` legal but `foo > bar` illegal.

2. What does it even mean for one vector to be "bigger than" another vector? For vector math purposes, I'd maybe expect operator< to compare magnitudes, but that would be absolutely terrible for std::map. Not everything has an unambiguous concept of "smaller than".

In the end I just made a std::map<std::pair<int, int>> and added an implicit conversion from Vector2<T> to std::pair<T, T>.


Oh, that is simple: in associative containers, you can pass an external comparator as an additional template parameter. Use that. Don't implicitly convert, it will bite you.
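A minimal sketch of that approach, assuming a Vector2<int> with public x and y members (names here are illustrative, not from the parent comment): the comparator goes in the third template parameter of std::map, so Vector2 itself never needs an operator<.

    #include <map>
    #include <string>

    template <typename T>
    struct Vector2 { T x; T y; };

    struct TileLess {
        bool operator()(const Vector2<int>& a, const Vector2<int>& b) const {
            if (a.x != b.x) return a.x < b.x;
            return a.y < b.y;   // lexicographic: x first, then y breaks ties
        }
    };

    std::map<Vector2<int>, std::string, TileLess> tile_contents;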


Oh thanks, I somehow didn't see that. That does sound better yeah.


And also see if you're better off with an std::unordered_map or a custom sorted-array implementation of a map.


In addition to the sibling’s comment on explicitly passing in a comparator, note that std::map is an ordered dictionary and requires an ordering on your elements. It sounds like you may have preferred hash_map.


I don't know, the way std::pair works is that it compares the two elements' `first`, or their `second` if `first` happens to be equal. That seems pretty good; lookup should still be O(log n).

I'll keep hash_map in mind though, and maybe do some performance testing to see which is actually faster.
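For what it's worth, a quick illustration (mine, not the parent's) of std::pair's lexicographic ordering: `first` is compared first, and `second` only breaks ties.

    #include <cassert>
    #include <utility>

    int main() {
        std::pair<int, int> a{1, 5}, b{2, 0}, c{1, 7};
        assert(a < b);   // 1 < 2, so the second elements are never looked at
        assert(a < c);   // firsts are equal, 5 < 7 decides
    }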


std::map is incredibly slow (and allocates willy-nilly), see https://martin.ankerl.com/2019/04/01/hashmap-benchmarks-01-o... for a comparison of modern hash maps. Or at least boost::flat_map if you need ordering but don't do many insertions / removals and care instead for fast iteration on the map.

Note: when I say "incredibly slow", it is relative to what is possible to achieve in C++ - it will still roll over many other languages's map implementations.


Chromium advises, for their typical use cases, using std::map in favor of std::unordered_map. They also provide their own version of flat_map.

[1] - https://chromium.googlesource.com/chromium/src/+/master/base...


In some cases, the performance of std::unordered_map is fixable with a custom allocator: https://github.com/Const-me/CollectionMicrobench


Interesting to compare to Haskell. Here you typically implement “compare” which is equivalent to the spaceship operator and get < &c for free, but you can implement "<=" and get compare for free because the defaulting mechanism is general and not built into the compiler. Also “Ord” has to be compatible with “Eq” (which you can also get for free) which is different from C++’s approach. Moreover, the compiler can “derive” implementations for regular data objects.

C# is slightly different: all of the algorithms take an IComparer, which implements the spaceship operator in another class (you can implement the related IComparable in your actual class and it all works via the magic of reflection. This isn't too bad since the reflection only occurs in the constructor.). Comparison operators are basically never used in the standard library and people rarely implement them.


> Moreover, the compiler can “derive” implementations for regular data objects.

When defaulting the spaceship operator, the same happens for C++ (by comparing members). In this special case, the equality operator is also effectively defaulted, so as described in the article

    struct A {
      …
      auto operator<=>(A const& rhs) const = default;
    };
gives you automatic member-wise comparison "derived" by the compiler.


Nice, I missed that.


> Importantly, there is no language transformation which rewrites one kind of operator (i.e. Equality or Ordering) in terms of a different kind of operator. The columns are strictly separate.

Why do we need a primary == operator if we have strong_ordering::equal, weak_ordering::equivalent, and partial_ordering::equivalent? Couldn't the behavior of == and != be inferred from the <=> definition?

I guess I'm asking why a == b can't be rewritten as (a <=> b) == 0.


> Why do we need a primary == operator if we have strong_ordering::equal, weak_ordering::equivalent

Weak orderings are defined as orders that can't distinguish between all members, which is why it is called "equivalent" instead of "equal".


> Weak orderings are defined as orders that can't distinguish between all members

I get that. This wasn't my question.

I want to know why I need to define operator== when reasonable behavior can be inferred from whether or not operator<=> returns strong_ordering::equal (or weak_ordering::equivalent -- the standards committee can decide if this is reasonable -- I don't really care). If I want special behavior, then sure, defining operator== might make sense, and then it should obviously take precedence.

But if the whole point of this new three way compare is to reduce the combinatorial explosion of things that you need to define, I don't understand the need to split the universe into {==, !=} and {<=>, <, <=, >, >=}, and never let them interact with each other.

The part I take issue with is the statement "The columns are strictly separate."

Maybe I've missed something, but I don't see why it needs to be like this. operator<=> seems to be a strict superset of operator==, because it returns information about when equality holds, (or when equivalence holds). Shouldn't that be sufficient? Also, what's going to break if you design a type where a.operator==(b) doesn't return true when a.operator<=>(b) returns strong_ordering::equal, or vice-versa?

Again, maybe I've missed something, but this seems like a mistake. They're removing all of the footguns except this last one. Why?


> Again, maybe I've missed something, but this seems like a mistake. They're removing all of the footguns except this last one. Why?

Originally they weren't separated, but the risk of bad performance due to that was too high. The complete rationale is described here: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p118...


Thank you! This answers my original question. I guess all things considered, it's still not too bad. Two functions is better than 6+, and it looks like they'll be a lot harder to get wrong.


> and it looks like they'll be a lot harder to get wrong.

especially considering you can ` = default;` them, which should be enough a lot of times.


Some types might implement == but not <=>. For these types, == needs to be a primary operator.

But for types that implement <=>, I'm not sure why == isn't automagically generated.


There is an explanation in the article - basically, this is done for performance reasons.

Especially for large types (e.g. collections), (in)equality can short-circuit very quickly, whereas comparison may need to see a large part of the collection before being able to return.

The most pathological example is two large arrays of different size that are element-by-element equal for the first million elements. Equality can return false after seeing the size difference, but <=> needs to reach the million-and-first element before being able to return a value.

The committee was especially concerned because the two operators would be functionally equivalent - so, it would be difficult to rely on testing to make sure the equality operator is also implemented.
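A rough sketch of the shape this takes in code, assuming a simple wrapper around std::vector<int> (names are illustrative): equality gets a cheap size-based early exit, while the three-way compare may have to scan a long common prefix.

    #include <algorithm>
    #include <compare>
    #include <vector>

    struct IntList {
        std::vector<int> data;

        bool operator==(const IntList& rhs) const {
            // The size check is the early exit that a derived (a <=> b) == 0 would not get.
            return data.size() == rhs.data.size()
                && std::equal(data.begin(), data.end(), rhs.data.begin());
        }

        std::strong_ordering operator<=>(const IntList& rhs) const {
            // Lexicographic: may need to walk a long equal prefix before deciding.
            return std::lexicographical_compare_three_way(
                data.begin(), data.end(), rhs.data.begin(), rhs.data.end());
        }
    };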


The blogpost says that == is needed while the paper doesn't.

I'd say having == is more performant for strings.


Yeah, I just skimmed through the proposal, and it sounds like the post is wrong, and it works the way I think it should work.

You should get == behavior if you only define a reasonable <=>, and if you also define ==, then that definition will take precedence.

Edit: argh, no, I'm wrong. I was looking at an earlier draft. So you do have to define == as well as <=>, for mostly technical reasons. Fine, I guess that answers my question.


It's basically for performance reasons, as I understand it, not technical limitations or something like that.


It's funny how Concepts are not mentioned at all. Given that they are the C++ equivalent of type classes and a common example is the "EqualityConstraint", I can't believe there isn't some interplay. Concepts are also in C++20.


I've thought for years that C++ was a sprawling mess of a language with a good and useful subset buried in there somewhere. I'm giving up with C++20. It's time to acknowledge that the language is too big and its virtues too small.


This non-sequitur of a reply is getting pretty tired on C++ articles.

Perhaps you could explain how this feature is a sprawling mess exactly?

In my opinion, not only is the feature easy to understand for beginners, but it also:

1) Simplifies code - less code to write means less code to have bugs

2) Prevents people making mistakes - incompatibly defined comparison operators won't happen with <=>

3) Gives a speed boost - in some algorithms it's faster to do a<=>b once rather than a < b followed by b < a.

You could find a deep dive article just as long about the minute details of comparison for practically any language.


I agree. I'm usually the first to shudder at C++'s insane complexity (and the fact that they keep piling more on top with every new standard) but this change feels rather natural, intuitive and a significant quality of life improvement.

Sure, it adds the `<=>` operator but I think once you've had it explained once, it's not too hard to reason about. That still leaves the problem of technical debt of course: for the decade to come C++ devs will have to deal with both the legacy and the modern version of comparison operators and they'll have to learn to understand and deal with both.


Except, when I write template code I need to handle classes that haven't defined <=>, and it isn't automatically created. This is, to me, the biggest problem with C++. Some stuff would be great if it had been there from the start, or if you are working on a code base which can require all code (including all libraries) to make full use of all modern C++ features. But new features, in my experience, never stop you from having to handle the older ones.


I didn't say <=> was a sprawling mess. I said C++ is a sprawling mess, and it surely is: C++20 will be at least as big a change as C++11, and the feature laundry list seems determined to grow.

That's great for people who work on a codebase, but makes bridging the gaps between C++ codebases harder with every passing year. And make no mistake: I 100% guarantee there will be bugs arising from the mixed use of <=> and traditional operators.


Well, if you really want to be convinced how hairy C++ has become, just have a look at the implementation of the Boost libraries.


With every revision of the C++ standard Boost becomes both less hairy and less necessary.

The direction of the language is towards greater cleanliness and consistency.


Yes, but it shows that the problem still exists.


Maybe it's because you're not a developer who would gain by working with C++. If you don't need performance, speed, or low memory footprint, I can easily say that you don't need this language, so I don't think you can understand why this language is how it is. Just use something else.

But you can also be a little curious as of why some industries are using C++, because if you look hard, there are not that many language like C++. Rust is an exception, sure, but it's very recent and it's not like Rust can pretend to replace everything that is done with C++.


>Rust can pretend to replace everything that is done with C++.

Do you mean in terms of "you can't replace decades of base code in a single day", or do you mean that there are – according to you – areas where C++ is definitely better than Rust?


I'm a big Rust fan, but right now, Rust is still missing a few important features that C++ can offer.

Most of those are in progress and will materialize eventually.

* const generics: landing relatively soon but still not there

* const/constexpr: support is getting better, but is still very limited right now

* lack of higher kinded types prevents things that are quite easy to do with templates. The GAT (generic associated types) is being worked on but nowhere near completion

* obviously the library ecosystem is tiny and immature compared to what C++ has to offer

Feature-wise I believe that Rust will be on somewhat equal footing with C++ in a few years, at which point I won't see any reason to use C++ over Rust, except if you rely on certain libraries - or in a domain where Rust's trait system is just inferior to classes + inheritance.


What a presumption.

I do embedded security. For context, that means I use C++ when raw performance isn't that important-- because if it were, it would be in verilog and baked into the silicon.


It seems to be a matter of proper balance. You can elevate patterns, idioms etc. to language features and you can add entirely new features that enable entirely new solutions. Java and Python are doing this sensibly. But burying devs in an avalanche of new language features that change the way you are supposed to do fundamental things is going to make it much harder to keep up. Especially when this happens every 3 years.


> Java and Python are doing this sensibly.

IMO this is because Java and python are driven by open “community processes” where anyone can influence the language. C++ standards are influenced by full members of some kind of iso national bodies, from what I remember.


Actually, recently the C++ standardisation process has become significantly more open. Anyone can submit papers and attend meetings (attendance has skyrocketed in the last few years); only full members and national bodies can participate in formal votes, but everybody participates in informal votes.

The biggest issue is that the language is first defined by a standard and there are multiple implementations, which makes it much harder to iterate, and, as it is hard to get any moderately complex proposal through, often you can see the lack of an overall unified design.

I do believe that C++ lacks a strong benevolent dictator figure. Stroustrup abdicated his position probably too early.


Anyone can be part of the ISO process, you just need to write a paper for your change and be willing to champion it, or find a supporter to champion it for you.

https://isocpp.org/std/the-life-of-an-iso-proposal


Agree, the direction this is going is outdated. Complexity is expensive, and erstwhile C++ programmers are now in CTO positions or are responsible for budgets and know this.

The c++ standards need to focus on adding to developer productivity by incorporating micropython to make it more accessible.


Sure, the language is not complex enough, let's bolt another language on top of it.

The spaceship operator is exactly an attempt to make the language more accessible, but of course it is being dismissed here, as anything C++ is.


Maybe it’s useful and I appreciate the effort going into this. But consider you have a legacy c++ system in a large bank. The argument “we can extend this with a subset of python” can win over “when do we rewrite this in java or nodejs” which are the real conversations that have been taking place over the last decade.


The terminology is a little confusing. By "strong" total ordering do they really mean "strict" total ordering? If not, what's the difference?


No. The terminology is different because it refers to a different concept. In C++20 "strong" vs "weak" refers to substitutability. When two elements x and y compare to strong_ordering::equal, then it is assumed that f(x) == f(y). This is not assumed if they are just weak_ordering::equivalent. Read section "A new ordering primitive: <=>" of the article for a longer explanation.
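A small sketch of the distinction (my example, not the article's): a case-insensitive string key. "abc" and "ABC" compare equivalent under it, but they are not substitutable (printing them gives different output), so weak_ordering is the honest category here; strong_ordering would over-promise.

    #include <cctype>
    #include <compare>
    #include <string>

    struct CiString {
        std::string s;

        std::weak_ordering operator<=>(const CiString& rhs) const {
            auto lower = [](std::string t) {
                for (char& c : t) c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
                return t;
            };
            std::string a = lower(s), b = lower(rhs.s);
            if (a < b) return std::weak_ordering::less;
            if (b < a) return std::weak_ordering::greater;
            return std::weak_ordering::equivalent;  // e.g. "abc" and "ABC" end up here
        }
        bool operator==(const CiString& rhs) const { return (*this <=> rhs) == 0; }
    };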


I did read that section... that's why I said it's confusing...

It seems to me they're using different terminology for concepts that aren't different in math? In math, "equality" (=) already means the same thing as substitutability. A non-substitutable equality is already called "equivalence" (≡).

So in math we already have these:

[1a] Non-strict partial order: a binary ordering that allows incomparability (≼, ≡, ≽, ?)

[1b] Strict partial order: like [1a], but irreflexive (≺, ≡, ≻, ?)

[2a] Non-strict weak order: like [1a], but with all elements comparable (≼, ≡, ≽)

[2b] Strict weak order: like [1b], but with all elements comparable (≺, ≡, ≻)

[3a] Non-strict total order: like [2a], where the equivalence is equality (≤, =, ≥)

[3b] Strict total order: like [2b], where the equivalence is equality (<, =, >)

If I didn't make a mistake above, then:

- C++'s "strong total ordering" seems to be what in math we call "strict total ordering"

- Their "weak total ordering" seems to be what in math we call "strict weak ordering"

- Partial order is the same thing for both

Am I wrong here? If yes, how? If not, why did they randomly invent their own terminology? I haven't seen their definitions used elsewhere.


Huh, this is... not the terminology I am used to, as a mathematician?

My experience is that what C++ is calling a "weak order", and what you are calling a "weak order", is what in math is called a "pre-order" or a "quasi-order".

Meanwhile, the distinction you are making between strict and nonstrict is just ignored (except by constructivists), since they're equivalent ways of talking about the same thing. Indeed I'm not sure why you included both because, well, they're just two ways of talking about the same thing (again, unless you're a constructivist).

Really (assuming classical logic) there's just 4 possibilities here:

1. Total order (what they're calling a "strong ordering")

2. Total preorder (what they're calling a "weak ordering")

3. Partial order (which they are also calling a "partial ordering")

4. Partial preorder (which they don't account for)


You're agreeing with me?

"Preorder" is just a (better-known?) synonym for "non-strict weak order" [1] [2], so we don't disagree there.

"Strict" vs. "non-strict" I merely included because C++ comparisons return strict orders. I thought it was worth including, but feel free to ignore it.

So just as you pointed out in your own list, and just as I've been saying, "strong total order" isn't the terminology (every total order is strong), and neither is "weak total order" (it means "total preorder" which is... just a total order). Like you said, they should say "total" order, "preorder" (or "weak" order as I said), "partial" order. The "strong" and "weak total" stuff is just something they seem to have invented in contradiction with the established mathematical terminology for... no reason/gain?

[1] https://en.wikipedia.org/wiki/Weak_ordering#Total_preorders

[2] http://fitelson.org/roberts_measurement_theory.pdf#page=56


Well, I disagree that the article is confusing on this point. They say quite explicitly what they mean. I agree that the terminology is annoying, because they should just match the existing math terminology, and it's also annoying that they didn't account for partial preorders.

I'm not sure whether I agree that the terminology is confusing; on the whole I think it isn't. The reason it's not confusing is that it is sufficiently different from usual math terminology so as not to interfere -- i.e., the terminology doesn't actually disagree at any point, it's not incompatible. Like, concepts get reinvented all the time and you just kind of have to get used to things having multiple terms, and be ready to translate unusual terminology into standard terminology, even as of course you should do what you can to reduce this happening. As long as you don't end up in a situation where one word means two different things, there's not really confusion, just different terminology.

And, as I said above, I definitely disagree that the article is confusing on this point, because they're very explicit about what they mean.


I said the terminology is confusing, not the article...

> the terminology doesn't actually disagree at any point, it's not incompatible

It does disagree though? I thought I just explained this... I'll do it again. "Total orders" are by their very definition in math always 'strong' -- their entire point is to use equality as the equivalence relation. That's what distinguishes them from weak orders, which can have other equivalence classes. So a "weak total order" makes no sense -- if a weak order is total, it's by definition the same as a 'strong' total order. That's in direct contradiction with their terminology.


I mean, that's like saying "A ring by definition is always associative, so saying 'a ring without associativity' is contradictory". Neither ordinary language nor technical terminology work that way; not every phrase is compositional in such a straightforward way, including in mathematics. I mean, a "fake X" is not an X.

In mathematics the word "weak" plays a similar role; e.g. in differential equations a weak solution need not be a solution; it's easy to find more examples. Similarly when one talks about a "non-Y X" or a "X without Y" where ordinarily Y is part of the definition of X. The result is not an X, but it's still (usually) clear what's meant, and these are still the terms that get used. Anyway, the point is, new terms someone is defining have to be considered on the whole.

An actual incompatibility, like I said, would be giving an existing phrase a new meaning. E.g. if they had referred to total preorders as partial orders, that would be an incompatibility.


> An actual incompatibility, like I said, would be giving an existing phrase a new meaning. E.g. if they had referred to total preorders as partial orders, that would be an incompatibility.

This is what they're doing!! They're calling pre-orders "weak total orders". That's a direct incompatibility... "weak total order" already means "total order", just like "total weak order" already means "total pre-order" which already means "total order". "Weak" and "total" are both adjectives, and they're interchangeable. If you asked anyone (who unlike you had already heard of the term "weak order" before) that's exactly how they would interpret it. Nothing in it would be left open for interpretation since everything is defined.

Whereas in your solution example, unlike here, the phrase "weak solution" didn't already mean something else, and there was no existing term for the concept either. And unlike in "ring without associativity", they use the word "total" in "weak total order" to mean nothing. They could've taken it out and "weak order" would've meant exactly what they meant.

If you want to make a comparison, what they're doing would be like using "a nonnegative positive solution" to mean "a nonnegative solution" (rather than "a positive nonnegative solution"). Or using "a fractional real number" to mean "a fractional complex number" rather than a "real fractional number" (i.e. real fraction). Which would be completely nuts.


> This is what they're doing!! They're calling pre-orders "weak total orders".

No. Giving one thing two names is not an incompatibility. Giving two things one name is an incompatibility.

> "weak total order" already means "total order"

Does it? I've never heard it called that. I think anyone on hearing the term "weak total order" would reasonably infer it refers to some notion weaker than a total order.

> "Weak" and "total" are both adjectives, and they're interchangeable.

No. This is completely wrong. Terminology is very frequently not compositional in such a simple way.

I mean, this works perfectly fine with words that convey additional conditions. It does not work with words that convey a removal of conditions, which is what the word "weak" does!

Like, in the phrase "weak order" (meaning total preorder), the word "weak" is not imposing the conditions of reflexivity, transitivity, etc. It is starting from a baseline of "order" meaning "total order", and then weakening this to a preorder. The word "weak" only weakens!

Anyone, on seeing the phrase "weak total order", assuming they know the general usage of the word "weak", can reasonably infer that it refers to something weaker than a total order. (Likely a preorder.)

Yes, this makes "weak total order" something of a ridiculous phrase, given existing terminology. But again, remember that "weak order" is starting from a baseline where "order" means "total order"; in a sense, "weak order" is really something of an abbreviation for "weak total order".

(...of course, all of this highlights the odd way that terminology for orders themselves work. After all, "partial" is also a weakener. I haven't studied the history, but I'd bet you that "order" originally meant "total order" (it's still often used that way), then "partial order" came later as a weakener, then the meaning of "order" shifted as the study of partial orders became more common, then "total order" was coined as a retronym.)

> If you want to make a comparison, what they're doing would be like using "a nonnegative positive solution" to mean "a nonnegative solution" (rather than "a positive nonnegative solution"). Or using "an fractional real number" to mean "a fractional complex number" rather than a "real fractional number" (i.e. real fraction). Which would be completely nuts.

All of these are examples where you're using two terms that both impose conditions, rather than removing them. The word "weak" simply does not work that way.


> All of these are examples where you're using two terms that both impose conditions, rather than removing them.

All of these? Really? "Fractional" and "real" are 'imposing' conditions on 'number' rather than removing them? Like you said, I'd bet you that 'number' first meant 'natural number' rather than 'complex number'.

And before you tell me how deletion is also commutative just like addition—I could debate you the merits of that in natural language too, but that's beside my point in the last paragraph.

> Does it? I've never heard it called that.

Yes? That's precisely why I just gave you 2 links to people calling it exactly that above!! https://en.wikipedia.org/wiki/Weak_ordering#Total_preorders http://fitelson.org/roberts_measurement_theory.pdf#page=56

And I already asked you how you think a mathematician who, unlike you, already knows that 'weak' has a definition in this context would interpret 'weak total order'? And you just ignored me there too. Just like you ignored my links way above where I showed you this is known terminology you're merely not aware of. I don't know why you're not cooperating but I'm tired of continuing.


Do not try too hard to compare with maths here. A programming language is different in many things from how maths is usually formalized. In C++20 there is no "strong total ordering", "weak total ordering" nor their partial counterparts. There are just "strong ordering", "weak ordering" and "partial ordering".

In a strong ordering, one object can only be smaller than, equal to, or larger than the other. If they are equal, it means that they are substitutable.

In a weak ordering, one object can only be smaller than, equivalent to, or larger than the other. No substitutability is implied.

In a partial ordering, one object can be smaller than, equivalent to, or larger than the other, or the two can simply be incomparable. Again, no substitutability is implied.

There is no point in distinguishing strict vs non-strict: depending on whether you call < or <= you will get the strict or the non-strict variant, and the same for > and >=.
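For concreteness, here is what the built-in types already report (a small sketch, easily checked with a compiler):

    #include <compare>
    #include <type_traits>

    // Integers are totally ordered and equal means substitutable: strong_ordering.
    static_assert(std::is_same_v<decltype(1 <=> 2), std::strong_ordering>);

    // Floating point has NaN, so some pairs are simply incomparable: partial_ordering.
    static_assert(std::is_same_v<decltype(1.0 <=> 2.0), std::partial_ordering>);
    // At run time, (0.0 / 0.0) <=> 1.0 yields std::partial_ordering::unordered.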


A brand new language without backward compatibility would be nice. C+++


Rust and D are pretty close to just that IMO.

BTW the new language standards do make backwards incompatible changes. C++17, for example, deletes several features which were deprecated.

And [1] is an egregious example.

[1] https://stackoverflow.com/a/620402/489590


The whole point is to have backwards compatibility. It's just fascinating how they add or fix things without breaking the code.

Doing without backwards compatibility is easy: take Rust or D.


> It's just fascinating how they add or fix things without breaking the code.

In the same way it is fascinating to watch a forest fire from afar.


This hostility is tiring.

The ability to avoid rewriting is a practical thing.


Just in case you didn't know, the "adding +" part of your idea is what inspired the naming of C#. If you look at the #, it is supposed to represent two columns of two + each.


Nice! I never thought about this. I believed part of the reason they chose "C#" is because of musical terminology: "C#" means C-sharp, which is on a higher pitch than a plain "C".


You are correct I think, but the play on double plus was also considered.


How do the literal 0 comparisons fit into the language? Are they special purpose syntax or is it possible to write your own operator that only typechecks if the RHS is a literal 0?


Are you confusing the implementation of these operator overloads with their signature? The spaceship returns {neg, 0, pos} and that's why the implementations are like that.


I’m referring to this bit of the article:

The values of these comparison categories can be compared against the literal 0 (not any int, not an int whose value is 0... just the literal) using any of the six comparison operators...


Oh, this is particularly confusing, I agree.

IIUC it's referring to how you can safely evaluate the return value from a spaceship: by comparing against literal zero. (not how you would overload a comparison with literal zero).
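A short sketch of how that looks in practice (names here are mine): the category value coming out of <=> is checked against the literal 0 with the usual operators; any other integer, e.g. `r < 1`, is rejected by design.

    #include <compare>
    #include <string>

    int rank(const std::string& a, const std::string& b) {
        auto r = a <=> b;       // std::strong_ordering for std::string
        if (r < 0) return -1;   // a sorts first
        if (r > 0) return 1;    // b sorts first
        return 0;               // equal
    }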


C and C++ already have such a thing, IIUC. The literal 0 is considered the same as nullptr, if compared to another pointer.

It is a bit worrying that such an overloaded meaning is added again to the language.


I really hope a modern language can take over many features of C++ without its added complexity.

Potential candidates: Rust, Julia, Swift.

All three are modern languages with modern concepts. I have high hopes especially for Julia.


Swift is only a candidate in the context of Apple OSes, and yes Apple does position it as such on the Swift documentation.

Outside Apple platforms, maybe Rust, but it is still found lacking in tooling and libraries versus what C++ offers, especially in graphics programming, GPGPU, embedded platforms, and IDEs.

Maybe now that Microsoft is looking into it, we might get Visual Rust of some sort.

Julia is more a replacement for Python than anything else.

Then if having a GC around is not a problem for the task being solved, there are plenty of alternatives out there, Java, C#, F#, OCaml, Haskell, Lisp, Scheme, ....


Please excuse my lack of knowledge, but is Julia gen-purp enough to replace C++? Can I write a hypothetical kernel with it, or a game?


Game? Sure, if you’re willing to spend a non trivial amount of time integrating with a graphics API, but that’s true of most non C++ languages. Kernel? Probably not, AFAIK there’s no subset of Julia that can run without an OS (like for rust and C++).


To me it seems obvious that you can write any application using any Turing-complete language. So the question is more like "how convenient would it be to write a kernel or a game using Julia, and with what performance impact?"

For games, it will highly depend on what kind of game, of course. Assuming a cutting-edge, 3D-graphics-intensive video game, it will probably not be your language of choice, especially regarding libraries available out of the box. For implementing Tetris, I guess you can make it in any language with sufficient performance. :)


In case someone rushed to cppreference.com to compare, I think that https://en.cppreference.com/w/cpp/utility/compare/strong_ord... is slightly wrong, in that it should not list "equivalent" as an alternative. I reported that (putative) mistake.


An early c++20 fan gives a tribute to his favorite new operator [1] (indeed so early it was only drafts back then).

[1] https://youtu.be/ZA6ehndc6co


This is interesting... As non-C++ developer, I've never thought I needed this. Is this an answer to a question anyone is asking?


C++ just has this under-the-hood access to be able to define any comparison operator to do arbitrary things.

C++20 is now more aware that these things are used for comparison, and it helps to make definitions shorter.

Still handles non-totally-ordered cases fine.


Perl has had this for decades. C’s strcmp() has similar semantics.

I thought the motivation was that sort can be more efficient when each comparison can return more information (the standard library already has a sort function that relies only on using less-than operations).

But people eventually realized that certain operations have to be implemented together to make sense. For instance, the sort function that relies only on less-than operations defines two values as equal if neither value is less than the other. Many people consider this weird, and wonder why sort doesn’t require both less-than and equality operations. The spaceship operator essentially implements a group of operations, and guarantees they’ll be consistent: less-than, equal, and greater-than (plus not-equal, less-than-or-equal, and greater-than-or-equal). Personally, I’m not thrilled about a special purpose solution to a single example of a problem (use this one operator to implement this group of related operations), but I do see some value in making it easy to implement this group of related operations.
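Spelled out as code, the "equal means neither is less" convention that less-than-only algorithms rely on looks something like this (a generic sketch, where `comp` is any strict-weak-ordering predicate such as std::less):

    template <typename T, typename Compare>
    bool equivalent_under(const T& a, const T& b, Compare comp) {
        return !comp(a, b) && !comp(b, a);   // neither precedes the other
    }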


Yes. Aside from reducing repetitive code (replacing 7 overloaded comparison operators) it's probably more efficient. std::map, for example, currently uses a less-than comparator to order items, requiring multiple comparisons to check if items are equal.


As someone who has written comparison functions in C++ (probably incorrectly), yes, this would have been a godsend.

Similarly, recent-ish innovations like std::move and std::unique_ptr may look complicated at first, but they genuinely made life a lot easier.


Yes, your favorite high level language might provide something like this implicitly.


The part you probably needed is 3-way comparisons. The weak vs. strong vs. partial distinction is less likely to be needed IMO.


The default operators are very cool, but I assume that many codebases would ban them.

Imagine, for example, an engineer decided to reorder members in a struct to make it pack better. Now the semantics of default <=> have changed!

Also, as a minor nit, the optional number sample has a possible bug. I would assume that two null optionals would compare equivalently. Or is that part of how <=> should work? Should NaN <=> NaN == partial_ordering::unordered?
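To make the reordering concern above concrete, a small sketch (my own, with illustrative names): a defaulted <=> compares members in declaration order, so moving a member changes the ordering semantics.

    #include <compare>

    struct Version {
        int minor;   // imagine this field was moved above `major` while reorganizing the struct
        int major;
        auto operator<=>(const Version&) const = default;
    };
    // The defaulted <=> now compares `minor` before `major`, so
    // Version{.minor = 9, .major = 1} > Version{.minor = 0, .major = 2},
    // i.e. version 1.9 sorts after version 2.0.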


I don't have the IEEE 754 specification in front of me to verify, but the usual descriptions of floating point comparison consider it to be a 4-value relation (<, =, >, unordered), with 16 different operators defined by which of those values make the comparison return true. So an ordered equals returns true only when the result is =, but an unordered equals returns true if the result is = or unordered. The unordered result is produced if either operand is NaN, which makes the comparison non-reflexive.

Note that floating point is already weird, because most languages make == reflect the ordered equals, not the unordered equals, and so the primitive equality for floats violates the reflexive principle (i.e., x == x returns false).


Yes, the standard and expected behavior is that logical_op(NaN, anything) == false [except inequality] because NaN is not ordered with, or equal to, anything, including another NaN.

Otherwise you may have the strange behavior that two erroneous values are equal, e.g., 0*inf == tan(inf).
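For reference, this is exactly what the C++20 operator reports for built-in floating point (a small sketch):

    #include <cassert>
    #include <compare>
    #include <limits>

    int main() {
        double nan = std::numeric_limits<double>::quiet_NaN();
        assert((nan <=> nan) == std::partial_ordering::unordered);
        assert(!(nan == nan));   // plain == stays false, as IEEE 754 requires
    }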


I reckon that having a single operator would make it easier to pass through a custom comparer to whatever function you need. That is probably a better option than changing the default <=> of a type.

For what it's worth, the mathematical definition of partial orders suggests having NaN equivalent to NaN. You're not really forced to, but giving up reflexivity doesn't seem worth it (unless you want to view each NaN as its own object).

Interestingly, mathematics only defines <= for a specific kind of object. Although on its own this doesn't give you much information about what kind of relation it is.


While I don't know about this specific case, I do know in some languages, like JavaScript, NaN poisons any operation it is part of. So any operation on it is NaN, and comparisons to it are false.


This is default IEEE 754 behavior.

https://en.wikipedia.org/wiki/NaN


God this is so confusing. Why the $%&* does it take a 17-page 27-minute blog post to explain how to implement comparison operators?

Can someone please write the 3-minute blog post that is easy to understand and follow?

If the answer is "it's too complicated to explain in 3 minutes" then I think that is a telling sign.


Here you go: less than 3 minutes to type out!

"The TL;DR is that there's a new operator, <=>, that returns less than 0, 0, or greater than zero, just like, say, strcmp does, but for all sorts of C++ objects. You can define operator<=> and operator== and your compiler will take care of the various other comparators. Doing this speeds up some library functions, doesn't force you to write as much code, and makes comparisons more intuitive.

"There are also a bunch of corner case optimizations but if you don't care about them you don't need to know"

This was written by someone who cares about the corner cases and underlying theory. It's like FP: IEEE 754 is full of weirdo cases that some really smart people spent a lot of time worrying about. All most users care about is that "they have a fractional representation and that == doesn't reliably work except against 0.0." Just because there's hundreds of pages in that spec doesn't make it dumb either.


I don't think this is right. In order to implement <=> for a class, you have to understand strong_ordering, weak_ordering, and partial_ordering in order to pick one, right?


Generally, no. Most of the time you're going to write these functions as defaulted with an auto return type:

    auto operator<=>(const T&) const = default;
    bool operator==(const T&) const = default;
The compiler will implement the function by comparing all the class member variables and will pick the type of the ordering as appropriate. You only need to look into further details about ordering types if you want to do something fancy.
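A minimal usage sketch of what that buys you (type and member names are illustrative):

    #include <compare>
    #include <set>
    #include <string>

    struct Employee {
        std::string name;
        int id;
        auto operator<=>(const Employee&) const = default;  // deduces std::strong_ordering here
        bool operator==(const Employee&) const = default;
    };

    // All six comparisons now work, so ordered containers and std::sort just work too.
    std::set<Employee> roster;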


A new string type! How exciting!


the article doesn't describe a new string type anywhere, only a hypothetical one for the purposes of demonstrating the new operator


I didn't see anything about a new string type. This does greatly simplify writing comparison operators and makes the compiler smarter about the fact that != is actually a composition of !(==). The spaceship operator is going to be a huge change.



