... Except for the fact that printf/scanf use variadics, and the only reason they stopped being a constant source of crashes is that compilers started recognizing them and validating format strings/complaining when you pass a non-literal string as a format.
<format> is instead 100% typesafe. If you pass the wrong stuff it won't compile, and as {fmt} shows you can even validate formats at compile time using just `constexpr` and no compiler support.
As always, people making more fuss around it than necessary. Code calling printf() with a constant format string literal is the class of code that you only have to run a single time to know it works. Many C++ programmers have always preferred printf() to e.g. std::cout because of ergonomics. And they were right.
It's hard to take people seriously that try to talk it down for being a pragmatic solution that's been around for probably 30-40 years.
I've definitely written bugs where the format specifier is wrong. Maybe you used %lu because a type happened to be unsigned long on your system, then you compile it for another system and it's subtly wrong because the type you're printing is typedef'd to unsigned long long there. Printing a uint32_t? I hope you're using the PRIu32 macro, otherwise that's also a bug.
This. I have corrected uncountable lines of code where people just used "%d" for everything, potentially opening a massive can of worms. `inttypes.h` is what you should be using 99% of the time, but that makes for very ugly format strings, so basically nobody uses that. Otherwise you should cast all of your integer params to (long long int) and use %lld, which sucks.
Yes, this is annoying. Integer promotions can be annoying in general.
I'm often working with fixed size types, and still find myself using %d and %ld instead of PRIu32 etc most of the time, because it's simply easier to type. If unsure about portability issues it can help to cast the argument instead. But realistically it isn't much of an issue, printfs seem to be about 0.5% of my code, and >> 90% of them are purely diagnostic. I don't even recall the last time I had an issue there. I remember the compiler warning me a few times about mismatching integer sizes though.
I agree. Layers of legacy typedefs in the Win32 API always catch me off guard. Any large source base with lots of typedefs, it can be tricky to printf() without issue.
A common use case for cstdio/iostreams/std::format is logging. It's not at all uncommon to have many, many log statements that are rarely executed because they're mainly for debugging and therefore disabled most of the time. There you go, lots of rarely used and even more rarely 'tested' formatting constructs.
I don't want things to start blowing up just because I enabled some logging, so I'm going to do what I can to let the compiler find as many problems as possible at compile time.
So, how about it? I mean, I have code where that works exactly as expected, so I can "know it works" according to you, but I also have code where that blows up immediately, because it's complete nonsense. Which according to you shouldn't happen, but there it is.
I mean yes, I should have been more restrictive in my statement, but I'm sure you notice how we're veering more into the hypothetical / into programming language geek land. I had to look up %hhn because I've never used it.
(Have used %n years ago but noticed it's a fancy and unergonomic way to code anyway. In the few locations where printed characters have to be counted, just consider the return value of the format call!)
And btw. how is this a problem related to missing type checks with varargs? The only problem I see is that we don't know that those pointers are not null / the char-pointer doesn't point to a zero-terminated string. In other words, just the basic C level of type (un)safety.
Most issues with printf could be reported by static analysis, and modern compilers report them as warnings, which in my book must be immediately converted to errors. All other weird usages should be either banned or reviewed carefully, but they are very rare.
Also, std::iostream is riddled with bugs as well. Trying to print hex/dec in a consistent way is plain "impossible". Every time you print an int, you should in fact systematically specify the padding, alignment and mode (hex/dec), otherwise you can't know for sure what you are outputting.
iostream _sucks_, I had to implement an iostream for zlib + DEFLATE in order to play ball with an iostream-based library, and I had to sweat blood and tears in order to make it work right when a simple loop and a bit of templated code would have worked wonders compared to that sheer insanity of `gptr`, `pubsync`, ... The moment you notice that they have methods called "pubXXX" that call a protected "XXX" on the `basic_streambuf` class is the moment your soul leaves your body.
IOStreams is superbad, and thankfully <format> removes half of its usages, which were based on spamming `stringstream`s everywhere (stringstream is also very, very bad). They also inspired Java's InputStream nonsense, which has ruined the existence of countless developers over the last 30 years.
Sometimes the print statement is in untested sanity-checking error-case branches that don't have test coverage ("json parsing failed" or whatever). It's pretty annoying when those things throw, and not too uncommon.
Another case in C++ is if the value is templated. You don't always get test coverage for all the types, and a compile error is nice if the thing can't be handled.
"Type coverage" is pretty useful. Not a huge deal here I agree, but nice nonetheless.
Take as an example printf("%d\n", foo->x);. Assuming it compiles, but with no further context, what could break here at run-time? foo could be NULL. And the type of foo->x could be not an integer.
Let's assume you run the code once and observe that it works. What can you conclude? 1) foo was not NULL at least one time. Unfortunately, we don't know about all the other times. 2) foo->x is indeed an integer and the printf() format is always going to be fine -- it matches the arguments correctly. It's a bit like a delayed type check.
A lot of code is like that. Furthermore, a lot of that code -- if the structure is good -- will already have been tested after the program has been started up. Or it can be easily tested during development by running it just once.
I'm not saying it's impossible to write a bad printf line and never test it, only to have it fail years later in production. It's absolutely possible and it has happened to me. Lessons learned.
I'll even go as far as saying that it's easy to have errors slip on refactors if there aren't good tests in place. But people are writing untyped Python or Javascript programs, sometimes significant ones. Writing in those is like every line was a printf()!
But many people will go through great trouble to achieve an abstract goal of type safety, accepting pessimizations on other axes even when it is ultimately a bad tradeoff. People also like to bring up issues like this on HN like it's the end of the world, when it's not nearly as big of an issue most of the time.
Another similar example like that are void pointers as callback context. It is possible to get it wrong, it absolutely happens. But from a pragmatic and ergonomic standpoint I still prefer them to e.g. abstract classes in a lot of cases due to being a good tradeoff when taking all axes into account.
> I'm not saying it's impossible to write a bad printf line and never test it, only to have it fail years later in production. It's absolutely possible and it has happened to me. Lessons learned.
A modern compile time type checked formatter would have prevented this mistake, you are deliberately choosing to use poor tools and calling this "pragmatism" because it sounds better than admitting you're bad at this and you don't even want to improve.
In fact C++ even shipped a pair of functions here. There's a compile time type checked formatter std::format, which almost everybody should use almost always (and which is what std::println calls), and there's also a runtime type checked formatter std::vformat, for those few cases where you absolutely can't know the format string until the last moment. That is a hell of a thing, if you need one of those I have to admit nobody else has one today with equal ergonomics.
Thanks for the ad hominem, but let's put that into perspective.
My current project is a GUI prototype based on plain Win32/Direct3D/Direct2D/DirectWrite. It currently clocks in at just under 6 KLOC. These are all the format calls in there (used git grep):
fatal_f("Failed to CreateBuffer(): %lx", err);
fatal_f("Failed to Map() buffer");
fatal_f("Failed to compile shader!");
fatal_f("Failed to CreateBuffer(): %lx", err);
fatal_f("Failed to create blend state");
fatal_f("OOM");
fatal_f("Failed to register Window class");
fatal_f("Failed to CreateWindow()");
fatal_f("%s failed: error code %lx", what, hr);
msg_f("Shader compile error messages: %s", errors->GetBufferPointer());
msg_f("Failed to compile shader but there are no error messages. "
msg_f("HELLO: %d times clicked", count);
msg_f("Click %s", item->name.buf);
msg_f("Init text controller %p", this);
msg_f("DELETE");
msg_f("Refcount is now %d", m_refcount);
msg_f("Refcount is now %d", m_refcount);
vfprintf(stderr, fmt, ap);
fprintf(stderr, "\n");
fprintf(stderr, "FATAL ERROR: ");
vfprintf(stderr, fmt, ap);
fprintf(stderr, "\n");
snprintf(utext, sizeof utext, "Hello %d", ui->update_count);
snprintf(filepath, sizeof filepath, "%s%s",
int r = vsnprintf(m_buffer, sizeof m_buffer, fmt, ap);
int r = vsnprintf(text_store, sizeof text_store, fmt, ap);
snprintf(svg_filepath, sizeof svg_filepath, "%s", filepath);
That's theory and practice for you. The real world is a bit more nuanced.
Meanwhile I have 100 other, more significant problems to worry about than printf type safety. For example, how to get rid of the RAII based refcounting that I introduced but it wasn't exactly an improvement to my architecture.
But thanks for the suggestion to use std::format in that set of cases and std::vformat in these other situations. I'll put those on my stack of C++ features to work through when I have time for things like that. (Let's hope that when I get there, those aren't already superseded by something safer).
I've used `fmt` on *embedded* devices and it was never a performance issue, not even once (it's even arguably _faster_ than printf).
(OT: technically speaking, in C++ you shouldn't call `vfprintf` or other C library functions without prefixing them with `std::`, but that's a crusade I'm bound to lose - albeit `import std` will help a lot)
I noticed std::format and std::print aren't even available with my pretty up-to-date compilers (testing Debian bookworm gcc/clang right now). There is only https://github.com/fmtlib/fmt but it doesn't seem prepackaged for me. Have you actually used std::format_to_n? Did you go through the trouble of downloading it or are you using C++ package managers?
I'm often getting the impression that these "you're a fool using these well-known but insecure libraries. Better use this new shiny thing because it's safe" discussions are a bit removed from reality.
But I'm asking in earnest. Please also check out my benchmark in the sibling thread where I compared stringstream with stdio/snprintf build performance. Would in fact love to compare std::format_to_n, but can't be arsed to put in more time to get it running right now.
> my pretty up-to-date compilers
> testing Debian bookworm
Debian and up to date compilers - pick one. <format> support comes with GCC 13.x, which was released more than 3 months ago. MSVC has had it for years now, LLVM is still working on it AFAIK (but it works with GCC). `std::print` is a new addition in C++23, which hasn't been released yet.
> Did you go through the trouble of downloading it or are you using C++ package managers?
I don't know of many non-trivial programs in C or C++ that don't rely on third party libraries. The C standard library, in particular, is fairly limited and doesn't come with "batteries included".
In general I've been using {fmt} for the better part of the last 5 years, and it's trivial to embed in a project (it uses CMake, so it's as simple as adding a single line in a CMakeLists.txt). It has been shipped by most distributions for years now (see https://packages.debian.org/buster/libfmt-dev), for instance, it was already supported in Debian buster, so you can just install it using your package manager and that's it.
{fmt} is also mostly implemented in its header, with a very small shared/static library that goes alongside it. It's one repository I always use in my C++ projects, together with Microsoft's GSL (for `not_null` and `finally`, mostly).
> "you're a fool using these well-known but insecure libraries. Better use this new shiny thing because it's safe" discussions are a bit removed from reality.
No, I think that insecure code is insecure, period, no matter how much it is used or well known. Such focus on practicality over correctness was the reason why my C university professor was so set on continuing using old C string functions which were already well known back then to be a major cause of CVEs. That was, in my opinion, completely wrong.
This is especially true in this case, {fmt}/<format> are nicer to use than `sprintf`, are safer, support custom types and are also _faster_ because they are actually dispatched and verified at compile time. Heck the standard itself basically just merged a subset of {fmt}'s functionality, so much so that I've recently sed-replaced 'fmt' with 'std' in some projects and it built the same with GCC's implementation. `std::print`, too, is just `fmt::print`, no more no less (with a few niceties removed, afaik).
> where I compared stringstream with stdio/snprintf build performance
String Streams (and IOStream in general) are a poorly designed concept, which have been the butt of the joke for years for their terrible performance. This is well known, and I'm honestly aghast any time I see anyone using them in place of {fmt}, which has been the de-facto string format library for C++ for the best part of the last decade (at least since 2018) and is better than `std::stringstream` in every conceivable way.
If you look at benchmarks on fmt's Github page, you will see it consistently outperforms most other solutions out there, while being completely type safe and extensible. It's IMHO the best of both worlds.
Not practical for me to add a second of compile time for each file that wants to print something.
> Debian and up to date compilers - pick one.
gcc 12.3 was released only a few months ago and is included. gcc 13.1, some 80 days old, doesn't seem to have made it. Not everybody is closely tracking upstream. Immediately jumping on each new train is not my thing (hence why Debian is fine), nor is it how software development is handled in the industry generally.
Even on godbolt / gcc 13.1 which I linked in the other post, <format> isn't available. Only {fmt} is available as an optional library.
> {fmt}/<format> are nicer to use than
I think otherwise, but maybe you enjoy brewing coffee on top of your desktop computer while waiting for the build to finish.
> _faster_ because they are actually dispatched at compile time
I don't actually want this unless I'm bottlenecked by the format string parsing. If I have one or two integer formats in my formatting string, the whole thing will already be bottlenecked by that. So "dispatching at compile time" is typically akin to minimizing the size of a truck, when we should have designed a sports car. The thing about format strings and varargs is they're in fact an efficient encoding of what you want to do. Not worth emitting code for 2-5 function calls if a single one is enough.
If there is a speed problem, you need some wider optimization that the compiler can't help you with.
Apart from that, that compile time dispatching doesn't actually happen with fmtlib in the godbolt, not even at -O2. The format string is copied verbatim into the source. Which I like.
> If you look at benchmarks on fmt's Github page, you will see it consistently outperforms most other solutions out there, while being completely type safe and extensible. It's IMHO the best of both worlds.
Since I've already done a lot of work, show me some realistic and useful benchmark to support your claims?
>> If you look at benchmarks on fmt's Github page, you will see it consistently outperforms most other solutions out there, while being completely type safe and extensible. It's IMHO the best of both worlds.
> Since I've already done a lot of work, show me some realistic and useful benchmark to support your claims?
Duh, I apologize for not even reading your statement completely. So I went on this page and it is exactly how I imagined. libc printf 0.91s, libfmt 0.74s. 20% speedup is not nothing, but won't help when there is an actual bottleneck. (In this case the general approach has to be changed).
Also compiled size is measurably larger, even with only a few print statements in the binary. Compile time is f***** 8 times slower!
These are all numbers coming from their own page -- expect to have slightly different numbers for your own use cases.
For snprintf(), how do you ensure that your format string and variadic arguments will not cause crash at runtime? The C++ version is compile-time type safe.
I haven't written C++ recently, but I recall that you can use ostringstream in place of ostream. If you don't like std::string, you can probably write your own ostream that will operate on a fixed size char buffer. (Can any C++ experts comment on that idea?)
About "sigh of despair" and "sigh of relief": Are you expressing concern about the template function signature?
> I haven't written C++ recently, but I recall that you can use ostringstream in place of ostream.
I don't know about those specifically right now, but in general these things have huge compile time costs and are also generally less ergonomic IMO. [EDIT: cobbled together a working version and added it to my test below, see Version 0].
> About "sigh of despair" and "sigh of relief": Are you expressing concern about the template function signature?
Yes. It's a mouthful, and I'm worried not only about static checks but about other things too -- like readability of errors, include & link dependencies, compile performance, amount of compiled code (which is minimal in case of snprintf/varargs)... I would need to check out std::format_to_n() as suggested by the sibling commenter.
And hey -- snprintf has been available for easily 30+ years ... while the <print> and <format> headers that people make such a fuss about don't even seem available on gcc nor clang on my fully updated debian bookworm system. The reason is that those implementations aren't complete, even though <format> is C++20. The recommended way to get those headers is to grab https://github.com/fmtlib/fmt as an external library... Talk about the level of hype and lack of pragmatism that's going on around here. People are accusing each other of not using a standard library that isn't even implemented in compilers... And in all likelihood they haven't used the external library themselves, and given that this library is external it's not heavily tested and probably still contains bugs, maybe CRASHES and SECURITY EXPLOITS.
But let me test C++ features that actually exist:
#if VERSION == 0
#include <cstddef>   // size_t
#include <cstdio>    // fwrite
#include <iostream>
#include <streambuf>
struct membuf: std::streambuf
{
membuf(char *p, size_t size)
{
setp(p, p + size);
}
size_t written() { return pptr() - pbase(); }
};
int main()
{
char buffer[256];
membuf sbuf(buffer, sizeof buffer);
std::ostream out(&sbuf);
out << "Hello " << 42 << "\n";
fwrite(buffer, 1, sbuf.written(), stdout);
return 0;
}
#elif VERSION == 1
#include <sstream>
#include <iostream>
void test(std::stringstream& os)
{
os << "Hello " << 42 << "\n";
}
int main()
{
std::stringstream os;
test(os);
std::cout << os.str();
return 0;
}
#elif VERSION == 2
#include <stdio.h>
int test(char *buffer, int size)
{
int r = snprintf(buffer, size, "Hello %d\n", 42);
return r;
}
int main()
{
char buffer[256];
int len = test(buffer, sizeof buffer);
fwrite(buffer, 1, len, stdout);
return 0;
}
#endif
CT=compile time, LT=link time, TT=total time (CT+LT), PT=preproc time (gcc -E), PL=preprocessor output lines
Bench script:
# put -DVERSION=0, -DVERSION=1 or -DVERSION=2 as cmdline arg
time clang++ -c "$@" -Wall -o test.o test.cpp
time clang++ -Wall -o test test.o
time clang++ "$@" -Wall -E -o test.preprocessed.txt test.cpp
wc -l test.preprocessed.txt
My clang version here is 14.0.6. I measured with g++ 12.2.0 as well and the results were similar (with only 50% of the link time for the snprintf-only version).
For such a trivial file, the difference is ABYSMAL. If we extrapolate to real programs, we can assume build times 5-10x longer from a general change in programming style. Wait 10 seconds or wait 1 minute. For a small gain in safety, how much are you willing to lose? And how much does this lost time and these resources actually translate into working less on the robustness of the program, leaving more security problems (as well as other problems) in there?
And talking about lost run time performance, that is real too if you're not very careful.
> For snprintf(), how do you ensure that your format string and variadic arguments will not cause crash at runtime? The C++ version is compile-time type safe.
Honestly I just don't ensure it perfectly -- beyond running them once as described. I write a lot of code that isn't fully proofed out from the beginning. Exploratory code. A few printfs are really not a concern in there, there are much bigger issues to work out.
I also absolutely do have some printfs that were quickly banged out but that are hidden in branches that have never actually run and might never happen -- they were meant for some condition that I'm not even sure is possible (this happens frequently when checking return values from complicated APIs for example).
The real "problem" isn't that there is a possibly wrong printf in that branch, but that the branch was never tested, and is likely to contain other, much worse bugs.
But the fact that the branch was never run also means I don't care as much about it, pragmatically speaking. Likely there is an abort() or similar at the end of the branch anyway. It's always important to put things into perspective like that -- which is something that seems often missing from C++ and similar cultures.
The more proofed out some code gets, the more scrutiny it should undergo obviously.
Apart from that, compilers do check printfs, and I usually get a warning/error when I made a mistake. But I might not get one if I write my own formatting wrappers and am too lazy to explicitly enable the checking.
Again, you're mixing functions up. `std::print` is the equivalent to "std::fprintf", the one you want to write on random buffers is `std::format_to_n`, which IS a strictly better version of `snprintf`.
I'm using C/C++, which do provide a good level of type safety.
And no, types are absolutely not my problem. In fact, rigid type systems are a frequent source of practical problems, and they often shift the game to one where you're solving type puzzles -- instead of working on the actual functionality.
Come on, dude. C provides almost no type safety whatsoever. There’s no point in saying “C/C++” here, because you won’t adopt the C++ solutions for making your code actually typesafe.
Type safety is not a standardized term, and it is not binary. Being black and white about things is almost always bad. One needs to weigh and balance a large number of different concerns.
A lot of "modern C++" is terrible, terrible code precisely because of failing to find a balance.
For basic things, sure. It is much, much worse than this when you deal with different encodings for an application that needs to format and print things.
There are widely used languages that don't have a standard at all; literally every single thing you can do in them is above and beyond any sort of standard.
I mean, the function for straight printing is puts; I don't know why people keep using the much more complicated printf in cases where no formatting is involved.
Edit: OK, I guess puts includes a newline, so you'd need to use fputs if you don't want that (although this example includes one). Still, both of those are much less complicated than printf!
Consistency. Having intermixed puts and printfs throughout the code looks pretty bad. Also, every compiler replaces printf of a literal ending with \n with a puts anyway.
It is a very natural feature. Especially when you are writing mathematical code, e.g. implementing different types of numbers: automatic differentiation, interval arithmetic, big ints, etc.
Overloading gives user defined types the expressiveness of built-in types. Like all features, if it is used badly (e.g. when + is overloaded to an operation which can hardly be interpreted as addition) it makes things worse. But you can write bad code in any language, using any methodology.
It is a very natural feature, but it makes discovering what you can and can't do with a library really hard. Learning what is and isn't legal with math libraries that use a lot of them can be really tricky. For example, numpy code is really easy to read, which is fantastic, but figuring out how you're intended to do things from the documentation alone is quite difficult.
In my experience numpy has also been one of the worst numerics libraries to deal with. The main reason is that Python seems designed to be hostile to numerics. Loose typing, assumptive conversions, specific numeric types that are hard to access, tedious array notations, etc. are all bad preconditions for a language which sadly seems to have become the prototyping standard in that area.
The moment you have a language actually designed for numerics, all these things vanish. One of Julia's core design aspects is multiple dispatch, including operator overloading, and it works extremely well there.
I also don't see the point for discoverability at all. The documentation will list the overloads and the non-overloaded calls are exactly as discoverable as the others.
As someone who has written math libraries over and over again for the last 25 years (I wish I was joking, but it turns out it is something I'm good at [1]), I find that operator overloading works only for the simple cases, and that for performance and clarity, function names work best.
Function names let you clarify whether it is an outer product or an inner product (e.g. there are often different types of adds, multiplies, divides), and I cannot stand when someone maps cross product onto ^ (because you can both exponentiate and cross-product some vectors, like quaternions, so why use the exponent operator for cross?) or maps dot product onto something else that doesn't make any sense. Also, operator overloading often doesn't make memory management clear; rather, it relies on making new objects constantly, whereas with explicit functions, you can pass in an additional parameter that will take the result. Lastly, explicit functions allow you to pass additional information, like how to handle various conditions: non-invertible, divide by zero, etc.
I find word-based functions more verbose but significantly less error prone, and also more performant (because of the full control over new object creation). Operator overloading is only good for very simple code, and even then people always push it too far, to the point that I cannot understand it.
> rather it relies on making new objects constantly, whereas with explicit functions, you can pass in an additional parameter that will take the result.
It's not the same if you need to allocate memory for the result. If you could pass the result in by reference, then you could (re)use a buffer which has already been allocated. The difference is massive in things like matrix calculations or image processing where you have an inner loop or a real-time stream repeating similar calculations.
Or you are working with a language like JavaScript where math primitives are GC objects and thus quite costly. In those languages if you do not reduce object creation via reuse in this way, it can be very slow.
Perhaps you're arguing that you ought to be able to name new operators (like Haskell) so that you can create a new operator for inner product instead of having to use '^' (typically used for exp or xor).
Alternatively, the main reason to use operators here is infix notation, so perhaps Haskell-like backticks.
I think languages like Julia make a strong case the other way: you can literally write algorithms that match the pseudocode in a paper. You have to be OK with unicode in your source file, but for numeric stuff, I think it's a nice feature.
Yeah it's a tiny bit clumsier, and prefix notation takes some getting used to. But on the plus side we avoid all the too-clever travesties programmers have inflicted on us with bad operator overloading decisions! On the whole I think it's easily worth the trade.
Again, no thanks. I want mathematical notation and I simply won't use any language without operator overloading. Free functions for common mathematical operations are an abomination.
Then you should probably use a language that lets you write DSLs for any given domain, rather than abusing operator overloading which just happens to work for a few subdomains of mathematics (e.g., you can't use mathematical conventions for dot product multiplication in C++). Anyway, I've never seen any bugs because someone misunderstood what a `mul()` function does, but I've definitely seen bugs because they didn't know that an operator was overloaded (spooky action at a distance vibes).
Actually, I'm quite happy what C++ has to offer :)
Yes, the * operator can be ambiguous in the context of classic vector math (although that is just a matter of documentation), but not so much with SIMD vectors, audio vectors, etc.
Again:
a) vec4 = (vec1 - vec2) * 0.5 + vec3 * 0.3;
or
b) vec4 = plus(mul(minus(vec1, vec2), 0.5), mul(vec3, 0.3));
Which one is more readable? That's pretty much the perfect use case for operator overloading.
Regarding the * operator, I think glm got it right: * is element-wise multiplication, making it consistent with the +,-,/ operators; dot-product and cross-product are done with dedicated free functions (glm::dot and glm::cross).
One never writes such an expression in serious code. Even with move semantics and lazy evaluation proxies it is hard to avoid unnecessary copies. Explicit temporaries make code more readable and performant:
auto t = minus(vec1, vec2);
mul_by(t, 0.5/0.3);
add(t, vec3);
mul_by(t, 0.3);
v4 = std::move(t);
I think there may be a misunderstanding here regarding the use case. If the vectors are large and allocated on the heap/on an accelerator, then yes, writing out explicit temporaries may be faster. Of course, this does not preclude operator overloading at all: You could write the same code as auto t = vec1 - vec2; t *= 0.5/0.3; t += vec3; t *= 0.3;
However, if the operands are small (e.g. 2/3/4 element vectors are very common), then "unnecessary copies" or move semantics don't come into play at all. These are value types and the compiler would boil them down to the same assembly as the code you post above. Many modern C++ codebases in scientific computing, rendering, or the game industry make use of vector classes with operator overloading, with no performance drawbacks whatsoever; however, code is much more readable, as it matches actual mathematical notation.
> Many modern C++ codebases in scientific computing, rendering, or the game industry make use of vector classes with operator overloading, with no performance drawbacks whatsoever
I guess these people are all not writing "serious code" :-p
TIL Box2D must not be serious code because it doesn't use copious amounts of explicit temporaries[0].
And just for the record, I'm very glad Erin Catto decided to use operator overloading in his code. It made it much easier for me to read and understand what the code was doing as opposed to it being overly verbose and noisy.
> One never writes such expression in a serious code.
Oh please, because you know exactly which kind of code I write? I'm pretty sure that with glm::vec3 the compiler can optimize this just fine. Also, "vec" could really be anything, it is just a placeholder.
That being said, if you need to break up your statements, you can do so with operators:
auto t = vec1 - vec2;
t *= 0.5/0.3;
t += vec3;
t *= 0.3;
Personally, I find this much more readable. But hey, apparently there are people who really prefer free functions. I accept that.
Of course, the compiler or an advanced IDE can know what your code means. If all your identifiers were random permutations of l and I: lIllI1lI, your IDE would not mind either, but the code would be horrific, don't you agree? The point of the OP is that overloaded operators (and functions) make it harder to reason about the code for a human that reads it. At least for some people. At the end, everything is "just" syntactic sugar, but it makes a significant difference.
Exactly. If you don't care that the code is unreadable and you can rely on every human viewing the code through an IDE with symbol resolution (and not say, online code review platforms) and remembering to use said symbol resolution to check every operator, then operator overloading is great!
If editors were to implement it, you could navigate to the corresponding overload implementation or even provide some hint text. Just like they do for other functions.
Yeah, we would need editors and code review tools to not only follow overloads to their functions but also highlight that the operator is overloaded in the first place. Of course, this is quite a lot more work than just not overloading things in the first place (particularly since the benefit of operator overloading is negligible).
Dealing with money is important, even if it's only a small part of mathematics. I'll focus on that.
Python's 'decimal' module uses overloaded operators so you can do things like:
from decimal import Decimal as D
tax_rate = D('0.0765')
subtotal = 0
for item in purchase:
subtotal += item.price * item.count # assume price is a Decimal
taxes = (subtotal * tax_rate).quantize(D('0.00'))
total = subtotal + taxes
Plus, there's support for different rounding modes and precision. In Python's case, something like "a / b" will look to a thread-specific context which specifies the appropriate settings:
>>> import decimal
>>> from decimal import localcontext, Decimal as D
>>> D(1) / D(8)
Decimal('0.125')
>>> with localcontext(prec=2):
... D(1) / D(8)
...
Decimal('0.12')
>>> with localcontext(prec=2, rounding=decimal.ROUND_CEILING):
... D(1) / D(8)
...
Decimal('0.13')
Laws can specify which settings to use, for examples, https://www.law.cornell.edu/cfr/text/40/1065.20 includes "Use the following rounding convention, which is consistent with ASTM E29 and NIST SP 811",
(1) If the first (left-most) digit to be removed is less than five, remove all the appropriate digits without changing the digits that remain. For example, 3.141593 rounded to the second decimal place is 3.14.
(2) If the first digit to be removed is greater than five, remove all the appropriate digits and increase the lowest-value remaining digit by one. For example, 3.141593 rounded to the fourth decimal place is 3.1416.
... (I've left out some lines)
(3) Divide the result in paragraph (a)(2) of this section by 5.5, and round
down to three decimal places to compute the fuel cost adjustment factor;
(4) Add the result in paragraph (a)(3) of this section to $1.91;
(5) Divide the result in paragraph (a)(4) of this section by 480;
(6) Round the result in paragraph (a)(5) of this section down to five decimal
places to compute the mileage rate.
There's probably laws which require multiple and different rounding modes in the calculation.
This means simply doing all of the calculations in scaled bigints or as fractions won't really work.
Now of course, you could indeed handle all of this with prefix functions and with explicit context in the function call, but it's going to be more verbose, and obscure the calculation you want to do. I mean, it's not seriously worse. Compare:
But it is worse. I also originally made a typo in the function-based API for line 5, where I used "decimal_add" instead of "decimal_div" - the symbols "/" and "+" stand out more, and are less likely to be copy&pasted/auto-completed incorrectly.
If overloaded parameters - "spooky action at a distance vibes" - also aren't allowed, then this becomes rather more complicated.
Why have operators at all? If that notation is good enough, then you might as well use it for the built-in types too. We're halfway to designing a Lisp!
Sorry but please don't take Eigen (https://eigen.tuxfamily.org) away from me. Can't speak for others, but the scientific code I work on would become unreadable like that.
You read the code. And unlike operator overloading, you know at a glance exactly which implementation to look at. There is no spooky action at a distance.
and to know which `plus` is being dispatched, you need to know the types of both arguments, exactly the same as if `plus` is named `__add__` in python or `operator+` in C++.
They're functions. Whatever they do, my code will go execute some code from whatever library implements them, which is what a function does. I just want to be able to rely on [] being an array subscript when I read some unfamiliar code. Is that too much to ask?
Can IDE's detect this and offer "go to implementation" on an overloaded operator these days? Because besides from the surprise-element in the fact that there even is code to debug hiding somewhere, not being able to quickly navigate to it is much worse in my opinion. And with infix operators where you can't even be sure which operand the implementation belongs to, figuring it out can be a bit of a detective task.
Yes (I’m a numerical analysis researcher and wrote a handful of ubiquitous mathematical packages)! The implementation of even primitive types can vary considerably! It’s way too much to hide. Nevertheless, I understand I’m biased. :)
In engineering practice, we often start using math without first consulting numerical analysts. It takes a long time to identify and fix the inevitable issues, which eventually becomes a lesson we have to teach students and practicing engineers because the field has accumulated so much historical baggage from doing it the wrong way.
As an example, early device models for circuit simulation were not designed to be numerically differentiable, leading to serious numerical artifacts and performance issues. Now we have courses dedicated to designing such models, and numerical analysis is used and emphasized throughout.
Is there anything today that you look at and think "yeah, they're gonna need to fix that at some point"?
Vectors and perhaps matrices are about the only valid use case I have ever come across, so I agree with GP that it's not worth it. And that's speaking as someone who once implemented a path class with a subtraction operator that would return the relative path between two absolute ones. I thought I was very clever. I feel sorry for the developers who had to figure out what that was about after I left..
Ah, matrices! Does * mean dot product, cross product, or the usually less useful matrix multiplication? Ooh, or maybe you should use . for dot! How clever that would be!
> And that's speaking as someone who once implemented a path class with a subtraction operator that would return the relative path between two absolute ones. I thought I was very clever.
Haha! It's ok. The temptation to be clever with operators is too strong, few can resist before getting burned (or more usually, burning others!) at least once.
> Ah, matrices! Does * mean dot product, cross product, or the usually less useful matrix multiplication? Ooh, or maybe you should you use . for dot! How clever that would be!
Why the snark? The fact that you're free to make a bad choice does not imply that having a free choice must be bad. Obviously neither dot nor cross product should be *. It should be the Hadamard product or matrix multiplication. You can choose one convention for your code and be perfectly happy for it.
As a follow-up question: How do you feel about languages like Fortran and Matlab then? Is it actually a good thing that mathematics convenience features are relegated to a few math-oriented languages and kept away from all the others? (Or are the linear algebra types in these languages offensive as well?)
The benefits from operator overloading are "I can show this to a mathematician and it looks like what they're used to". The downsides lurk in the corners of whether it's actually doing what you think.
In C++34 we'll finally have a way of overloading the empty string operator, so that we can, at last, write AB for the product of matrices A and B. As God intended.
Overloading is orthogonal to the issue you're striking at: infix operators versus postfix function calls. Functions can be overloaded just like operators.
What if you could type the asterisk to multiply vectors, but then your editor of choice would replace it with the symbol that actually means vector multiplication?
Perhaps that idea falls apart once you realize you would need hundreds of symbols for just addition…
But what if those symbols were (automatically) imported and named at the beginning?
Perhaps it would be annoyingly inconsistent how in various files different symbols are used for the same operation…
The idea is, operator overloading is a convenience feature. Why not have that convenience as an option in an editor, without influencing the language? If you want scalar multiplication to look the same as vector multiplication, set it in your editor. If you want to insert scalar multiplication with the same key you insert vector multiplication, set it in the editor (to figure out which you mean, based on context, when you press that key).
Just to be clear, I'm not being a smartass, just considering this as an option and wondering if the HN crowd has some thoughts on this.
That said, in my experience over the decades, operator overloading has been one of the primary causes of bugs that are very hard to pin down, so I have come to hate it. It hides far too much.
The cost/benefit ratio of operator overloading is generally unfavorable in practice, in my experience. Which is not to say it shouldn't be used when it actually clarifies things! But those situations tend to be fairly niche.
Interestingly, where I work right now, using operator overloading is specifically prohibited. So I'm confident that my dislike of the practice is not just a personal quirk.
That's literally the only place where operator overloading makes any sense.
In other places overloads just monkey-patch C++'s deficiencies as a language.
And they are confusing and error-prone.
Nobody is pretending we will get rid of any c++ syntax ever. So the discussion is about a hypothetical language syntax that fits C++ slot.
In that world C++ would have N x M matrices as native value types in the language (as fortran does) and those operators would be defined in the language spec for matrix types just as they are defined for standard number types at the moment.
Not making assumptions about what is correct or proper use is why C++ is so successful: it leaves that up to the project/community using it.
Go ahead, make a language that dictates a lot and makes strict assumptions; it will be deprecated or forced to open up before the end of the decade.
Note: this is why I think Python and Lisp are so popular; metaprogramming is very powerful and expressive.
The fact C++ has so many ways of defining things is not the reason it's so popular. The reason is the enormous industry investment in the language tooling and ecosystem. IMO the language itself is the worst part of the ecosystem, but the other parts create a totality that is the best development language ecosystem in industry for my niche (graphics and geometry), including libraries, compilers, debugging & profiling etc.
Any language with the level of industrial support C++ has had would have grown to prominence. C++ came about at a judicious time in history when "object orientation" was becoming the latest buzzword. And now we have ended up with gazillions of lines of C++ code.
It's a tragedy of our trade that two mongrels - C++ and Javascript - came to be among its most prominent languages.
But the reason C++ fits in so many industries, from embedded systems to high-level GUI libraries, is its flexibility. We see the end of the OOP trend, but C++ does not lock its users into one paradigm or another, so it will continue to be an industry standard even as the industry moves towards other paradigms of programming.
Adding to that: JavaScript really only has one industry it's used in. I think that says a bit about its versatility.
That's a good observation about C++ not making assumptions; it strikes me as true. C++ apparently doesn't even make assumptions about what the C++ filename extension is: .h, .hh, .hpp, .hxx, .C, .cc, .cpp, .cxx, .ixx, .cppm
> That't literally the only place where operator overloading makes any sense.
That may be true for C++ (I'll take your word for it), but not for all programming languages in general. For example, in C# it's fairly common to overload == and != to implement value equality for reference types (classes).
Of course, you should really only do this for immutable classes that are mostly just records of plain old data. And C# 9 introduced record classes, which is a more convenient way of defining such classes. But record classes still overload these operators themselves, so you don't have to do it manually.
Honestly that sort of thing always confused me when I worked in Java, C#, etc. I could never tell at a glance whether the operator was doing an identity comparison or a value comparison, and I definitely contributed a few bugs from this misunderstanding. In Go, which lacks operator overloading, we write `ptr1.Equals(ptr2)` for value comparisons and `ptr1 == ptr2` for pointer comparisons--in either case, there is no ambiguity and IME fewer bugs.
Java's the same regarding == and .equals( ) and when it's Java code written by devs who also work in other languages, it definitely still results in bugs, sometimes that go undiscovered for remarkably long times (particularly if == happens to return the right result in most cases). Meaning/needing to compare references for string (and similar) types is exceedingly uncommon, yet uses the more "natural" syntax for testing equality.
FWIW I can't remember working with a codebase where unexpected behavior due to operator overloading was a serious problem.
Operator overloading is a useful feature that saves a bunch of time and makes code way more readable.
You can quibble whether operator<<() is a good idea on streams and perhaps C++ takes the concept too far with operator,() but the basic idea makes a lot of sense.
string("hello ") + string("world");
complexNumber2 * complexNumber2;
for (int i : std::views::iota(0, 6)
| std::views::filter(even)
| std::views::transform(square))
someSmartPtr->methodOnWrapperClass();
The majority of time in professional codebases is not spent on typing but reading and understanding code.
"saves a bunch of time and makes code way more readable"
Not when everybody defines their own operators.
Note - we are discussing operator overloading, not operators as features in syntax. Operators at the syntax level make life a lot easier. But then everybody uses the exactly same operator semantics, not some weird per-project abstraction.
The lines of code you wrote as an example are not saving anyones time, except when writing it if you are a slow typist and lack a proper IDE support for C++. If typing speed is an issue, get a better IDE, don't write terser code.
Code is read more often than written. Writing code that can be understood at a glance (by using common, well understood operators) optimizes for readability.
I think your argument is basically "people should not aggressively violate the implicit bonds of interfaces", which is true. But that goes for all interfaces, not just and not in particular those around operators.
We just have cases where it's common with operators because those are one of the few cases where we have lots of things that meet the interface and interact directly as opposed to hierarchically. The same kind of issue comes up with co/contravariant types and containers sometimes, but that's less often visible to end developers.
I tend to agree with this. I like operator overloading for mathematical constructs (like complex numbers) or even just for conversions of literal types. Imagine, for example, you have a gram type and a second type: if you said 1g / 1s you'd get 1gps, which seems reasonable.
I don't like it in the example given
for (int i : std::views::iota(0, 6)
| std::views::filter(even)
| std::views::transform(square))
What benefit does this have over the Javay/Rusty version that looks like this
for (int i : std::views::iota(0, 6)
.filter(even)
.transform(square))
?
No deducing what `|` means, you are applying the filter then transform function against the view.
People don't use the same operator semantics. Is + commutative? Does it raise exceptions or sometimes return sentinels? What type conversions might == do?
And how exactly do you propose library authors should work with user-defined types? Operator overloading is what allows algorithms to be efficiently generalized across types.
The code isn't readable (you can't even reliably tell at a glance what the operator does) and it takes negligibly longer to write "add()" rather than "+" in your program (yes, 'add()' is more keystrokes and thus takes longer to type, but most of your program isn't addition instructions).
I think what people should advocate is full DSL capabilities with some unambiguous gate syntax so people know precisely that `foo * bar` is not using the host language syntax. Overloading operators is ambiguous and vastly incomplete (everyone is holding up matrix math as the shining example for the utility of operator overloading and you can't even express dot product notation in C++!)--it's a hack at best.
> The code isn't readable (you can't even reliably tell at a glance what the operator does) and it takes negligibly longer to write "add()" rather than "+" in your program (yes, 'add()' is more keystrokes and thus takes longer to type, but most of your program isn't addition instructions).
Except now you replaced + with a name that tells you just as much/little as + does. So you made your program verbose for the sake of verbosity.
No, you’ve made your program “verbose” (by a handful of characters) for the sake of clarity—there is no longer ambiguity about what code runs (of course, this assumes you aren’t similarly overloading named functions, which should also be disallowed).
That was me, but I didn't provide example code that would require namespaces. I don't understand how your earlier comment makes sense in the context of this thread.
Unfortunately, like spicy peppers, everyone's definition of "too much" is different. Some people are eating ghost chili peppers just fine while others are struggling with ketchup.
All the things you wrote could be about as easily written & much more easily read without operator overloading. Operator overloading only allows programmers to feel "smart" for doing a "clever" thing, to the detriment of future readers.
string("hello ").append("world");
complexNumber2.mult(complexNumber2);
// wtf is even going on with this one in your example? have these people never heard of method chaining?
for(int i : std::views::iota(0,6).filter(even).transform(square))
(*smartPtr)->methodOnWrapperClass();
That's all about the same verbosity, it's much more clear to the reader even if they're unfamiliar with your codebase, and dropping operator overloading eliminates the "clever" option to do stupid crap like divide file path objects together.
Would you advocate getting rid of operators altogether?
3.times(2).plus(7)
Some things just lend themselves to being expressed in terms of simple operators.
(*smartPtr)->methodOnWrapperClass();
That is still using the overloaded SmartPtr<>::operator*() method.
I understand the viewpoint that operator overload is syntactic sugar for things that can easily be done another way, I just disagree that costs outweigh the benefits.
> Would you advocate getting rid of operators altogether?
Of course not. It makes sense for built-in types, as everyone reading the code can be assumed to know them.
> That is still using the overloaded SmartPtr<>::operator*() method.
Good catch ;)
> I just disagree that costs outweigh the benefits.
Yah, I think that's the disagreement. My feeling is there's a teeny, tiny handful of appropriate places for it (almost entirely math) and it opens up a pandora's box of terrible decisions that programmers clearly find irresistible.
As a good thing or a bad thing? I see a.equals(b) occasionally from the "first argument is magic" crowd, but 3.times is novel here. I'm really unsure what the order of operations is for that expression.
“Fried shrimp should be removed from the all-you-can-eat Chinese buffet because I can't help myself from eating at least 20 of them in a single sitting and now I have stomach cramps”
The very first Hello World program anyone learning C++ will write uses the godawful iostream bitshifting operators! Not even the language's authors could help themselves eating 20 fried shrimp on the first day the buffet was open!
The iostream bitshift overload was one of the first features of C++ that I learned to despise. I'm very happy that there's an alternative in the new version.
How far are you prepared to take this stance, exactly? C has operators that are generic over both integral and floating point types. Was that a mistake? Did OCaml do it better?
For my part, I've been persuaded that generic operators like that are a net win for math-heavy code, especially vector and matrix math. Sure, C++ goes too far, but there are middle grounds that don't.
Having operators defined for value types within the language spec is a different thing from defining operator overloading for arbitrary struct and class types.
For numeric value types mathematical operators are the only sane option.
For arbitrary classes - not so much.
A sane language in the slot of C++ in the language ecosystem would not have operator overloading. It would have matrix types defined in the language spec with mathematical operators operating on them.
One part of the philosophy of the language maintainers is that they're somewhat humble about their designs in the standard library, and very much against breaking changes.
Some folks prefer absl's flat_hash_map over std::unordered_map for a hash table, and it's not great that you need to choose or risk having both in a codebase, but it _is_ nice that you can have your preferred hash table and use operator[] whichever you decide.
Python also has operator overloading, and people seem to like that numpy can exist using it. And container types. Weirdly doesn't cause much consternation compared to C++ (maybe because the criticisms of the latter come from C programmers?)
I've occasionally missed overloading in JS/TS though.
> It would have matrix types defined in the language spec with mathematical operators operating on them.
This is unfortunately impossible (IMO). The problem is matrices have multiple operations that don't translate nicely like complex numbers do. If you want to be consistent, you have to pick and choose what A * B means, under which contexts, and when that is illegal (or what should happen on an error).
For complex numbers, there's only one definition of A * B that matters and no failure cases.
I fear there's no clean way to do matrix operations that won't make some community really irritated for choosing "wrong". (Physics, engineering, science, etc.)
Operator overloading is critical for building ergonomic frameworks.
The modern web is built on overloading the . operator (e.g. ORMs like Rails and Django). We will never see a Tier-1 ORM in Golang simply because it lacks it.
> Operator overloading is critical for building ergonomic frameworks.
> The modern web is built on overloading the . operator (e.g. ORMs like Rails and Django). We will never see a Tier-1 ORM in Golang simply because it lacks it.
As I said, there won't be a Tier-1 ORM in Go. Ent or Gorm are tier-2 at best. They can get the job done, but it ain't pretty.
Any advantages of Go (and there are many) are outweighed by the fact that you have to write and read 2x more code to be equally productive as Rails or Django.
That sounds like a good thing, having dealt with Hibernate in production. As a backend developer, I'm pretty happy with C++17 (and beyond), Go and Rust. All of them can be used in fairly explicit ways, which means debugging a problem is easy, and performance issues are right there on the page if any. I want less magic, not more.
Magic is magic until it becomes understood, then it is science.
While I don't want junior programmers wielding the dark magic of operator overloading, I trust that the engineers behind Django are using it reasonably.
I'll byte: complex numbers and matrix support is bad in languages without operator overloading. Why should only the primitive types of the language be privileged to proper math notation?
Not having operator overloading is anti-human. To think so highly of yourself that there is no other thing that can properly be the subject of the field operators (or other basic operators) is the height of hubris. The compiler typically must handle the operators on certain types due to the compilation target's semantics, but in reality, there's nothing special about these 'built-ins'.
Operators like +, -, /, *, etc have meanings independent of integers and floats and to not allow these meanings to be expressed is sad.
I've heard many programmers express this sentiment and what they actually are attempting to argue is that having overloads of these operators that do not respect the corresponding group, ring, or field laws is confusing. This I agree with. Operators should mainly follow the proper abstract semantics.
BS. I thought that Java already demonstrated to the world how dumb it is to disallow operator overloading altogether.
Allowing ANY operator to be overloaded was dumb, like C++ did, where you could do batshit crazy stuff like overloading unary & (addressof) or the comma operator (!), or stuff like the assignment operator (that actually opens a parenthesis about how copy/move semantics in C++ are a total hack that completely goes OT compared to this).
Sensible operator overloading makes a lot of sense, especially when combined with traits that clearly define what operations are supported and disallow arbitrary code to define new operators on existing types. Rust does precisely that, and IMHO it works great and provides a much nicer experience than Java's verbose mess of method chaining.
I'm on your side, but only after many years of being on the other side. I used to think they were "graceful" and "minimalist", and refused to acknowledge they can be the source of many surprises.
The Google C++ style guide has a very nice overview. There are only two pros listed, and large number of cons. And this document is old by Internet (dog) years -- at least 10 years.
Consider the humble + operator. In most compiled languages -- even those that don't support operator overloading -- it is in fact overloaded. int + int. long + long. float + float. double + double. pointer + int. Would every language be better without it?
Built in operators don't always map 1-1 to CPU instructions so don't appeal to that authority. There are still plenty of CPUs -- old and new -- without multiplication, division, or floating point support.
You could argue that there is just one type (tensor) with some invalid operations between its values (e.g., when dimensions mismatch). Just like integer division by zero.
I disagree; it's heavily abused but very useful for types where it's obvious what the operation is (inherently mathematical types like vectors and matrices). I wrote a macro library for C that does vector/matrix math in prefix notation with _Generic overloads, and it's still too clumsy to get used to.
> where it’s obvious what the operation is (inherently mathematical types like vectors and matrices)
Considering there are like 3 different types of matrix multiplication operations, I don't think it's obvious at all. Feels like you should either use a language with complete support for implementing custom DSLs (that can express the whole domain naturally) or eschew ambiguous operator overloading altogether (gaining consistency and quality at the expense of a few keystrokes).
I think we all know what someone means when they say “matrix multiplication”. Asserting that * could mean, say, the Hadamard product or the tensor product is a reach. In practice I have never seen it mean anything else for matrices.
DSLs just push the complexity away from the language into someone else’s problem in a way that has much higher sum complexity. You’re making authors of numerical libraries second-class citizens by doing so. For some languages that’s probably not a bad choice (Go is one example where I don’t feel the language is targeted at such use cases).
Also, the lack of a standard interface for things like addition, multiplication, etc. means that mathematical code becomes less composable between different libraries. Unless everyone standardizes on the same DSL, but I find this an unlikely proposition, given that DSLs are far more opinionated than mere operator overloads.
I've never understood why people complain a lot about `std::cout << "string"`, if the problem is that this operator is used for bit shifting, simply stop thinking that way (genius I know), do you think of addition when you see `string + "concatenate"`? Operator overloading is awesome, and like everything in programming, if used correctly; constructing paths with / is sweet, and I find << with streams visually appealing and expressive, it's feeding data to the stdout/file/etc, same for `std::cin >> var`, data goes from the stdin to the variable.
> do you think of addition when you see `string + "concatenate"`
Yes. And it tortures me every time.
I religiously avoid string concatenation in Python for this very reason. It's not that "+" necessarily means addition; it's that it always means a commutative operation (to somebody who has learned some algebra). String concatenation is notoriously non-commutative, thus it is extremely disturbing to write it using a visibly commutative operator. Any other operator except "+" would be better. For example, a space, or a product, or a hyphen. Whatever. But please, not a commutative operator. It breaks my brain parser.
It's also one of the biggest sources of bugs when dealing with loose types around strings and numbers.
When it comes to languages that let you mix strings and numbers, Lua has it right. + always adds, and accepts numbers and strings that can cleanly convert to numbers. .. always concatenates, and accepts numbers and strings.
Aside from the syntax (I find it ugly, you find it visually appealing - it's subjective), iostreams are inefficient, awkward to customise, not thread safe (allows interleaving), and mix format and data (that one is also subjective).
I love me some operator overloading. I love / for filesystem separators, I love | for piping things. I don't like << and >> so much but that's just because of too many years of writing them everywhere.
In C++, with its templates, there are only a few alternatives:
1. Operator overloading
2. Operator desugaring (e.g. __subscript__(), which substitutes the intrinsic function for basic types, but can also be defined for user defined types)
3. Writing templates with weird adaptors for primitive types.
Given that its design goal was to embed C, there were already operators that worked with various and mixed types. Adding (+.), etc., would have been unacceptable to the users. So, I think in general, for this language, it was good but, unfortunately, iostream made people think you should overload the behavioral expectation, too.
Its design goal was shaped by the author's experience of having to downgrade himself from Simula to BCPL; he wasn't keen on repeating that experience with C when he went out to write distributed computing applications at Bell Labs.
Bjarne has given a couple of interviews on the matter.
The alternative GP is advocating for is "none of the above." Meaning "operators are only defined for primitive types" which is perfectly fine when working with C.
If you wanted to use an abstraction over non primitive types for such things you would use a normal function.
Built in types are special. The compiler needs to define operator precedence and certain semantics (commutativity, associativity, overflow, etc) that cannot be expressed in the type system or enforced by the interface.
The fact that it makes syntax "nicer" for user defined types is at best subjective and at worst an anti-pattern because it leads to bad compiled code and confused programmers. Function calls are unambiguous and follow the same rules as other function calls, while operator overloading does not.
Whatever you save in avoiding having to write "sum = add(l, r)" is not worth allowing programmers to bitshift file handles[1], divide file path objects[2], or subscript objects to launch a process[3].
C++ "bitshifts" (yes, but it's just a symbol in this context) make no real difference to me. It was type safe when it wasn't easy to be back in the day, everyone knows exactly how they work - it's never shot me in the foot (whereas the real meat of the design has, independently of how the composition is expressed syntactically).
The others are more sus but you can make a bad API out of anything.
Rewriting (say) a load of calculations as a tree of sum pow exp etc. is just a huge burden - a codebase I work on has a formula that takes up about 6 lines, for example, output by Mathematica: a total pain to translate to function calls.
> The others are more sus but you can make a bad API out of anything.
Yahhhh but there's something about operator overloading that is like catnip to clever programmers. Ah ha, file paths have slashes, and the divide operator is a slash! Clearly I should use the divide operator to append paths segments together! I'm so clever! Ah ha, << looks kinda like an arrow I guess, so we can use it to, uh, pass data between objects, I guess. I'm so clever...?
It's irresistible. We have abused it and we must give it up for the good of all.
Operator overloading has been a feature of many languages dating back to the very concept of using operator notation in programming. I know of no language that has the * operator dedicated solely to a single type. Typically you have at least signed and unsigned overloads, as well as various bit sizes (including larger than the machine word size) and floating-point representations. Extending that to vector operations, arbitrary precision, and others only seems to make sense; it's going with the flow...
Most programming languages use infix notation for mathematical operations but Polish notation for function calls. This creates an inconsistency. In languages like Lisp that use Polish notation throughout, the inconsistency does not exist.
One could argue that if a programming language has this inconsistency, then one should at least try to be consistent with one's notation, i.e. for mathematical operations use infix notation (operator overloading).
Agree. It looks fun when I am writing the code and re-inventing abstract algebra and category theory types for classifying cat pictures. However then at some point I have to read someone else's code, even my own code weeks later and then I start cursing operator overloading.
Operator overloading is one of the cornerstones of generic programming in C++. And perhaps it is a failure of imagination on my part, but it’s difficult to think of a more elegant approach.
If you just need a nice print: fmtlib is a really nice C++23-style implementation without needing C++23 compiler support. Highly recommend it. It’s simple. It’s fast.
I think Barry under-estimates how long it will be before C++ programmers actually get the equivalent of #[derive(Debug)] and as a result the impact on ergonomics. But of course I might be wrong.
This works on my RHEL9-compatible system for a .c file (using gcc). The type specifier for main is implicitly `int`. You get some warnings about implicit types and implicit declarations, but you get a binary that when executed writes "Hello, world".
Is there now in C++, after all these years, something like Python's f-strings, or at least something coming close? If not, I remain in my disappointed state about C++.
Slightly off topic, but I recently learned that implementing the opposite of what you've asked for, bitshift stdout in python, is only a few lines of code:
If people don’t have time to keep up with a language's updates (which, in the case of C++, is currently _once_ every three years), then they don’t have time to complain about the lack of features, either. This one had the time to complain and just didn’t want to bother typing "c++ string formatting", which would have been fewer keystrokes than the comment complaining.
On DuckDuckGo, the very first result for "c++ string formatting" is the exact thing this person was complaining about.
That bit about wanting to change my mind... it strikes a chord! Somehow the let-downs must have been too much for me at a certain point in time. That being said, I'm curious to find out right now. Edit: no such thing as string interpolation found in C++, at least not in my first 4 search hits. I'll crawl back.
- template-function parameters (NTTPs with a function parameter syntax rather than a template parameter syntax, tentatively spelled as `void foo(template int constant){}` )
- Scalable reflection, in combination with expansion statements (likely in C++26, spelled `template for (auto& e : array) {}` ) which would allow you to write an arbitrary parser for the template string into real C++ code. Reflection targets C++29.
Syntax2 already supports string interpolation as a special compiler feature.
That type of interpolation is something most non-scripting languages don't have anyway; it took Python several decades to get it, and it has only had it for the last 5 years or so.
I should have said the "latest standard", not "spec", if we're being technical. But EVERY bit of official material is very clear about asserting that C++23 is still a preview/in-progress, not a standard. Saying otherwise is, strictly speaking, incorrect.
And quite frankly, what matters to devs is what tooling supports the specification without special configuration, and the answer is "basically none". Not a single compiler fully supports it.
fmt has been available for years and it works with ridiculously old compilers. It’s great to have it standardized but it’s not a new capability that C++ didn’t already have.
My guess is you never had to parachute into a project using operator overloading in strange, inconsistent, and undocumented ways with no original maintainers to show you the ropes.
I actually like operator overloading, but overloading the shift operators for I/O was still a mistake IMO. It's a mistake even if you ignore that it's a theoretical misuse (I/O and binary shifting have nothing to do with each other semantically). The operator precedence of the binary shift operators is just wrong for I/O.
First, includes either need to be wrapped in angle brackets (for files included from the include path passed to the compiler) or quotes (for paths relative to the current file).
Second, the whole standard library would be huge to pull in, so it is split into many headers, even for symbols in the top level of the std namespace.
Something I've learned recently is that the convention of when to use angle brackets and when to use quotes is not prescribed by the standard but instead is implementation-defined.
#include is a preprocessor directive that substitutes the text of a file in place. import instead makes the exported declarations of a named module unit available to this translation unit. The module interface is compiled once into a binary form, so the things that today get recompiled in every including translation unit (templates, inline and constexpr functions) no longer have to be redundantly recompiled, and the compiler gains a wider view of the program that reduces the need for IPO/LTO (except across statically linked libraries). This obsoletes unity builds and precompiled headers.
Some C++ person recently wrote to the GNU Make mailing list about some grotesque experiments for supporting C++ modules inside GNU Make whereby GNU Make would communicate with some C++ compiler process over sockets and generate dependencies.
Any decent language with modules needs no make utility in the first place. You tell the compiler to build/update the program. The compiler compiles (if necessary) the interface definitions the program references, and in that vein recursively updates the whole tree, compiling anything that has changed; then links the program.
I didn't need any make utility when working with Modula-2 in 1990!
I’ve always wondered why we use make in the first place. Is it really so hard to write a Python script that keeps track of file timestamps in a JSON? gcc can even be invoked with certain flags to print header dependencies. Make is crusty, archaic, and overdesigned.
C++ has namespacing, which makes sense: this language has an enormous number of available 3rd-party libraries, and without namespacing you can't help stepping on each other's toes.
There are two ways you might want to have this work anyway despite namespacing. One option would be that you just import the namespace and get all the stuff in it, this is popular in Java for example, however in C++ this is a bit fraught because while you can import a namespace, you actually get everything from that namespace and all containing namespaces.
Because the C++ standard library defines a huge variety of symbols, if you do this you get almost all of those symbols. Most words you might think of are in fact standard library symbols in C++, std::end, std::move, std::array, and so on. So if you imported all these symbols into your namespace it's easy to accidentally trip yourself, thus it's usual not to do so at all.
Another option would be to have some magic that introduces certain commonly used features, Rust's preludes do this, both generally (having namespaces for the explicit purpose of exporting names you'd often want together) and specifically ("the" standard library prelude for your Edition is automatically injected into all your Rust software by default). C++ could in principle do something like this but it does not.
The language is designed so that this is possible, although the current compiler does not do it. At one point, the compiler did all the file reading in parallel, but that was eventually turned off because it did not significantly improve compile speed.
std::print comes from the <print> standard header and lives in namespace std. It’s not a global print because, while you might want it in the global namespace, other people do not. For example, my code isn’t CLI code and doesn’t need to print to the CLI, but perhaps I want to print to a printer or something else and have my own print function.
Leave un-namespaced identifiers to those that are declared in the current file and namespace everything else. If you really want, you’re free to add “using namespace std” or otherwise alias the namespace, but keeping standard library functions out of the global namespace as a default is a good thing! (In any language, not just C++)
> If you really want, you’re free to add “using namespace std”
You're free to, but I discourage the habit. It's more verbose to add the namespace:: prefix to symbols, but it sure does make it easier on the devs that have to work with the code later.
Oh, I’m completely with you on this and always prefix my namespaces. Occasionally I will alias long namespaces to short ones, but I never pull identifiers into the global scope and I really dislike when I see code online that does “using namespace” (unless it’s tightly scoped, at least). I’ve been prefixing std:: for years and won’t stop now, I like knowing where an identifier is coming from, which is extra important when you have multiple types of similar containers that you use for different performance characteristics (eg abseil, folly, immer versions of containers vs std containers)