... Except for the fact that printf/scanf use variadics, and the only reason they stopped being a constant source of crashes is that compilers started recognizing them and validating format strings/complaining when you pass a non-literal string as a format.
<format> is instead 100% typesafe. If you pass the wrong stuff it won't compile, and as {fmt} shows you can even validate formats at compile time using just `constexpr` and no compiler support.
As always, people making more fuss around it than necessary. Code calling printf() with a constant format string literal is the class of code that you only have to run a single time to know it works. Many C++ programmers have always preferred printf() to e.g. std::cout because of ergonomics. And they were right.
It's hard to take people seriously that try to talk it down for being a pragmatic solution that's been around for probably 30-40 years.
I've definitely written bugs where the format specifier is wrong. Maybe you used %lu because a type happened to be unsigned long on your system, then you compile it for another system and it's subtly wrong because the type you're printing is typedef'd to unsigned long long there. Printing a uint32_t? I hope you're using the PRIu32 macro, otherwise that's also a bug.
This. I have corrected uncountable lines of code where people just used "%d" for everything, potentially opening a massive can of worms. `inttypes.h` is what you should be using 99% of the time, but that makes for very ugly format strings, so basically nobody uses that. Otherwise you should cast all of your integer params to (long long int) and use %lld, which sucks.
Yes, this is annoying. Integer promotions can be annoying in general.
I'm often working with fixed size types, and still find myself using %d and %ld instead of PRIu32 etc most of the time, because it's simply easier to type. If unsure about portability issues it can help to cast the argument instead. But realistically it isn't much of an issue, printfs seem to be about 0.5% of my code, and >> 90% of them are purely diagnostic. I don't even recall the last time I had an issue there. I remember the compiler warning me a few times about mismatching integer sizes though.
I agree. Layers of legacy typedefs in the Win32 API always catch me off guard. Any large source base with lots of typedefs, it can be tricky to printf() without issue.
A common use case for cstdio/iostreams/std::format is logging. It's not at all uncommon to have many, many log statements that are rarely executed because they're mainly for debugging and therefore disabled most of the time. There you go, lots of rarely used and even more rarely 'tested' formatting constructs.
I don't want things to start blowing up just because I enabled some logging, so I'm going to do what I can to let the compiler find as many problems as possible at compile time.
So, how about it? I mean, I have code where that works exactly as expected, so I can "know it works" according to you, but I also have code where that blows up immediately, because it's complete nonsense. Which according to you shouldn't happen, but there it is.
I mean yes, I should have been more restrictive in my statement, but I'm sure you notice how we're veering more into the hypothetical / into programming language geek land. I had to look up %hhn because I've never used it.
(Have used %n years ago but noticed it's a fancy and unergonomic way to code anyway. In the few locations where printed characters have to be counted, just consider the return value of the format call!)
And btw. how is this a problem related to missing type checks with varargs? The only problem I see is that we don't know that those pointers are not null / the char-pointer doesn't point to a zero-terminated string. In other words, just the basic C level of type (un)safety.
Most issues with printf could be reported by static analysis, and modern compilers report them as warnings, which in my book must be immediately converted to errors. All other weird usages should be either banned or reviewed carefully, but they are very rare.
Also, std::iostream is riddled with bugs as well. Trying to print hex/dec in a consistent way is plain "impossible". Every time you print an int, you should in fact systematically specify the padding, alignment and mode (hex/dec), otherwise you can't know for sure what you are outputting.
iostream _sucks_, I had to implement an iostream for zlib + DEFLATE in order to play ball with an iostream-based library, and I had to sweat blood and tears in order to make it work right when a simple loop and a bit of templated code would have worked wonders compared to that sheer insanity of `gptr`, `pubsync`, ... The moment you notice that they have methods called "pubXXX" that call a protected "XXX" on the `basic_streambuf` class is the moment your soul leaves your body.
IOStreams is superbad, and thankfully <format> removes half of its usages, which were based on spamming `stringstream`s everywhere (stringstream is also very, very bad). They also inspired Java's InputStream nonsense, which has ruined the existence of countless developers over the last 30 years.
Sometimes the print statement is in untested sanity-checking error-case branches that don't have test coverage ("json parsing failed" or whatever). It's pretty annoying when those things throw, and not too uncommon.
Another case in C++ is if the value is templated. You don't always get test coverage for all the types, and a compile error is nice if the thing can't be handled.
"Type coverage" is pretty useful. Not a huge deal here I agree, but nice nonetheless.
Take as an example printf("%d\n", foo->x);. Assuming it compiles, but with no further context, what could break here at run-time? foo could be NULL. And the type of foo->x could be not an integer.
Let's assume you run the code once and observe that it works. What can you conclude? 1) foo was not NULL at least one time. Unfortunately, we don't know about all the other times. 2) foo->x is indeed an integer and the printf() format is always going to be fine -- it matches the arguments correctly. It's a bit like a delayed type check.
A lot of code is like that. Furthermore, a lot of that code -- if the structure is good -- will already have been tested after the program has been started up. Or it can be easily tested during development by running it just once.
I'm not saying it's impossible to write a bad printf line and never test it, only to have it fail years later in production. It's absolutely possible and it has happened to me. Lessons learned.
I'll even go as far as saying that it's easy to have errors slip on refactors if there aren't good tests in place. But people are writing untyped Python or Javascript programs, sometimes significant ones. Writing in those is like every line was a printf()!
But many people will go through great trouble to achieve an abstract goal of type safety, accepting pessimizations on other axes even when it is ultimately a bad tradeoff. People also like to bring up issues like this on HN like it's the end of the world, when it's not nearly as big of an issue most of the time.
Another similar example like that are void pointers as callback context. It is possible to get it wrong, it absolutely happens. But from a pragmatic and ergonomic standpoint I still prefer them to e.g. abstract classes in a lot of cases due to being a good tradeoff when taking all axes into account.
> I'm not saying it's impossible to write a bad printf line and never test it, only to have it fail years later in production. It's absolutely possible and it has happened to me. Lessons learned.
A modern compile time type checked formatter would have prevented this mistake, you are deliberately choosing to use poor tools and calling this "pragmatism" because it sounds better than admitting you're bad at this and you don't even want to improve.
In fact C++ even shipped a pair of functions here. There's a compile time type checked formatter std::format, which almost everybody should use almost always (and which is what std::println calls), and there's also a runtime type checked formatter std::vformat, for those few cases where you absolutely can't know the format string until the last moment. That is a hell of a thing, if you need one of those I have to admit nobody else has one today with equal ergonomics.
Thanks for the ad hominem, but let's put that into perspective.
My current project is a GUI prototype based on plain Win32/Direct3D/Direct2D/DirectWrite. It currently clocks in at just under 6 KLOC. These are all the format calls in there (used git grep):
fatal_f("Failed to CreateBuffer(): %lx", err);
fatal_f("Failed to Map() buffer");
fatal_f("Failed to compile shader!");
fatal_f("Failed to CreateBuffer(): %lx", err);
fatal_f("Failed to create blend state");
fatal_f("OOM");
fatal_f("Failed to register Window class");
fatal_f("Failed to CreateWindow()");
fatal_f("%s failed: error code %lx", what, hr);
msg_f("Shader compile error messages: %s", errors->GetBufferPointer());
msg_f("Failed to compile shader but there are no error messages. "
msg_f("HELLO: %d times clicked", count);
msg_f("Click %s", item->name.buf);
msg_f("Init text controller %p", this);
msg_f("DELETE");
msg_f("Refcount is now %d", m_refcount);
msg_f("Refcount is now %d", m_refcount);
vfprintf(stderr, fmt, ap);
fprintf(stderr, "\n");
fprintf(stderr, "FATAL ERROR: ");
vfprintf(stderr, fmt, ap);
fprintf(stderr, "\n");
snprintf(utext, sizeof utext, "Hello %d", ui->update_count);
snprintf(filepath, sizeof filepath, "%s%s",
int r = vsnprintf(m_buffer, sizeof m_buffer, fmt, ap);
int r = vsnprintf(text_store, sizeof text_store, fmt, ap);
snprintf(svg_filepath, sizeof svg_filepath, "%s", filepath);
That's theory and practice for you. The real world is a bit more nuanced.
Meanwhile I have 100 other, more significant problems to worry about than printf type safety. For example, how to get rid of the RAII based refcounting that I introduced but it wasn't exactly an improvement to my architecture.
But thanks for the suggestion to use std::format in that set of cases and std::vformat in these other situations. I'll put those on my stack of C++ features to work through when I have time for things like that. (Let's hope that when I get there, those aren't already superseded by something safer).
I've used `fmt` on *embedded* devices and it was never a performance issue, not even once (it's even arguably _faster_ than printf).
(OT: technically speaking, in C++ you shouldn't call `vfprintf` or other C library functions without prefixing them with `std::`, but that's a crusade I'm bound to lose - albeit `import std` will help a lot)
I noticed std::format and std::print aren't even available with my pretty up-to-date compilers (testing Debian bookworm gcc/clang right now). There is only https://github.com/fmtlib/fmt but it doesn't seem prepackaged for me. Have you actually used std::format_to_n? Did you go through the trouble of downloading it or are you using C++ package managers?
I'm often getting the impression that these "you're a fool using these well-known but insecure libraries. Better use this new shiny thing because it's safe" discussions are a bit removed from reality.
But I'm asking in earnest. Please also check out my benchmark in the sibling thread where I compared stringstream with stdio/snprintf build performance. Would in fact love to compare std::format_to_n, but can't be arsed to put in more time to get it running right now.
> my pretty up-to-date compilers
> testing Debian bookworm
Debian and up to date compilers - pick one. <format> support comes with GCC 13.x, which was released more than 3 months ago. MSVC has had it for years now, LLVM is still working on it AFAIK (but it works with GCC). `std::print` is a new addition in C++23, which hasn't been released yet.
> Did you go through the trouble of downloading it or are you using C++ package managers?
I don't know of many non-trivial programs in C or C++ that don't rely on third party libraries. The C standard library, in particular, is fairly limited and doesn't come with "batteries included".
In general I've been using {fmt} for the better part of the last 5 years, and it's trivial to embed in a project (it uses CMake, so it's as simple as adding a single line in a CMakeLists.txt). It has been shipped by most distributions for years now (see https://packages.debian.org/buster/libfmt-dev), for instance, it was already supported in Debian buster, so you can just install it using your package manager and that's it.
{fmt} is also mostly implemented in its header, with a very small shared/static library that goes alongside it. It's one repository I always use in my C++ projects, together with Microsoft's GSL (for `not_null` and `finally`, mostly).
> "you're a fool using these well-known but insecure libraries. Better use this new shiny thing because it's safe" discussions are a bit removed from reality.
No, I think that insecure code is insecure, period, no matter how much it is used or well known. Such focus on practicality over correctness was the reason why my C university professor was so set on continuing using old C string functions which were already well known back then to be a major cause of CVEs. That was, in my opinion, completely wrong.
This is especially true in this case, {fmt}/<format> are nicer to use than `sprintf`, are safer, support custom types and are also _faster_ because they are actually dispatched and verified at compile time. Heck the standard itself basically just merged a subset of {fmt}'s functionality, so much so that I've recently sed-replaced 'fmt' with 'std' in some projects and it built the same with GCC's implementation. `std::print`, too, is just `fmt::print`, no more no less (with a few niceties removed, afaik).
> where I compared stringstream with stdio/snprintf build performance
String Streams (and IOStream in general) are a poorly designed concept, which have been the butt of the joke for years for their terrible performance. This is well known, and I'm honestly aghast any time I see anyone using them in place of {fmt}, which has been the de-facto string format library for C++ for the best part of the last decade (at least since 2018) and is better than `std::stringstream` in every conceivable way.
If you look at benchmarks on fmt's Github page, you will see it consistently outperforms most other solutions out there, while being completely type safe and extensible. It's IMHO the best of both worlds.
Not practical for me to add a second of compile time for each file that wants to print something.
> Debian and up to date compilers - pick one.
gcc 12.3 was released only a few months ago and is included. gcc 13.1, some 80 days old, doesn't seem to have made it. Not everybody is closely tracking upstream. Immediately jumping on each new train is not my thing (hence why Debian is fine), nor is it how software development is handled in the industry generally.
Even on godbolt / gcc 13.1 which I linked in the other post, <format> isn't available. Only {fmt} is available as an optional library.
> {fmt}/<format> are nicer to use than
I think otherwise, but maybe you enjoy brewing coffee on top of your desktop computer while waiting for the build to finish.
> _faster_ because they are actually dispatched at compile time
I don't actually want this unless I'm bottlenecked by the format string parsing. If I have one or two integer formats in my formatting string, the whole thing will already be bottlenecked by that. So "dispatching at compile time" is typically akin to minimizing the size of a truck, when we should have designed a sports car. The thing about format strings and varargs is they're in fact an efficient encoding of what you want to do. Not worth emitting code for 2-5 function calls if a single one is enough.
If there is a speed problem, you need some wider optimization that the compiler can't help you with.
Apart from that, that compile time dispatching doesn't actually happen with fmtlib in the godbolt, not even at -O2. The format string is copied verbatim into the source. Which I like.
> If you look at benchmarks on fmt's Github page, you will see it consistently outperforms most other solutions out there, while being completely type safe and extensible. It's IMHO the best of both worlds.
Since I've already done a lot of work, show me some realistic and useful benchmark to support your claims?
>> If you look at benchmarks on fmt's Github page, you will see it consistently outperforms most other solutions out there, while being completely type safe and extensible. It's IMHO the best of both worlds.
> Since I've already done a lot of work, show me some realistic and useful benchmark to support your claims?
Duh, I apologize for not even reading your statement completely. So I went on this page and it is exactly how I imagined. libc printf 0.91s, libfmt 0.74s. 20% speedup is not nothing, but won't help when there is an actual bottleneck. (In this case the general approach has to be changed).
Also compiled size is measurably larger, even with only a few print statements in the binary. Compile time is f***** 8 times slower!
These are all numbers coming from their own page -- expect to have slightly different numbers for your own use cases.
For snprintf(), how do you ensure that your format string and variadic arguments will not cause crash at runtime? The C++ version is compile-time type safe.
I haven't written C++ recently, but I recall that you can use ostringstream in place of ostream. If you don't like std::string, you can probably write your own ostream that will operate on a fixed size char buffer. (Can any C++ experts comment on that idea?)
About "sigh of despair" and "sigh of relief": Are you expressing concern about the template function signature?
> I haven't written C++ recently, but I recall that you can use ostringstream in place of ostream.
I don't know about those specifically right now, but in general these things have huge compile time costs and are also generally less ergonomic IMO. [EDIT: cobbled together a working version and added it to my test below, see Version 0].
> About "sigh of despair" and "sigh of relief": Are you expressing concern about the template function signature?
Yes. It's a mouthful, and I'm worried not only about static checks but about other things too -- like readability of errors, include & link dependencies, compile performance, amount of compiled code (which is minimal in case of snprintf/varargs)... I would need to check out std::format_to_n() as suggested by the sibling commenter.
And hey -- snprintf has been available for easily 30+ years ... while the <print> and <format> headers that people make such a fuss about don't even seem available on gcc nor clang on my fully updated debian bookworm system. The reason is that those implementations aren't complete, even though <format> is C++20. The recommended way to get those headers is to grab https://github.com/fmtlib/fmt as an external library... Talk about the level of hype and lack of pragmatism that's going on around here. People are accusing each other of not using a standard library that isn't even implemented in compilers... And in all likelihood they haven't used the external library themselves, and given that this library is external it's not heavily tested and probably still contains bugs, maybe CRASHES and SECURITY EXPLOITS.
But let me test C++ features that actually exist:
#if VERSION == 0
#include <cstddef>   // size_t
#include <cstdio>    // fwrite
#include <iostream>
#include <streambuf>
struct membuf: std::streambuf
{
membuf(char *p, size_t size)
{
setp(p, p + size);
}
size_t written() { return pptr() - pbase(); }
};
int main()
{
char buffer[256];
membuf sbuf(buffer, sizeof buffer);
std::ostream out(&sbuf);
out << "Hello " << 42 << "\n";
fwrite(buffer, 1, sbuf.written(), stdout);
return 0;
}
#elif VERSION == 1
#include <sstream>
#include <iostream>
void test(std::stringstream& os)
{
os << "Hello " << 42 << "\n";
}
int main()
{
std::stringstream os;
test(os);
std::cout << os.str();
return 0;
}
#elif VERSION == 2
#include <stdio.h>
int test(char *buffer, int size)
{
int r = snprintf(buffer, size, "Hello %d\n", 42);
return r;
}
int main()
{
char buffer[256];
int len = test(buffer, sizeof buffer);
fwrite(buffer, 1, len, stdout);
return 0;
}
#endif
CT=compile time, LT=link time, TT=total time (CT+LT), PT=preproc time (gcc -E), PL=preprocessor output lines
Bench script:
# put -DVERSION=0, -DVERSION=1 or -DVERSION=2 as cmdline arg
time clang++ -c "$@" -Wall -o test.o test.cpp
time clang++ -Wall -o test test.o
time clang++ "$@" -Wall -E -o test.preprocessed.txt test.cpp
wc -l test.preprocessed.txt
My clang version here is 14.0.6. I measured with g++ 12.2.0 as well and the results were similar (with only 50% of the link time for the snprintf-only version).
For such a trivial file, the difference is ABYSMAL. If we extrapolate to real programs, we can assume build times 5-10x longer from a general change in programming style. Wait 10 seconds or wait 1 minute. For a small gain in safety, how much are you willing to lose? And how much does this lost time and these resources actually translate into working less on the robustness of the program, leaving more security problems (as well as other problems) in there?
And talking about lost run time performance, that is real too if you're not very careful.
> For snprintf(), how do you ensure that your format string and variadic arguments will not cause crash at runtime? The C++ version is compile-time type safe.
Honestly I just don't ensure it perfectly -- beyond running them once as described. I write a lot of code that isn't fully proofed out from the beginning. Exploratory code. A few printfs are really not a concern in there, there are much bigger issues to work out.
I also absolutely do have some printfs that were quickly banged out but that are hidden in branches that have never actually run and might never happen -- they were meant for some condition that I'm not even sure is possible (this happens frequently when checking return values from complicated APIs for example).
The real "problem" isn't that there is a possibly wrong printf in that branch, but that the branch was never tested, and is likely to contain other, much worse bugs.
But the fact that the branch was never run also means I don't care as much about it, pragmatically speaking. Likely there is an abort() or similar at the end of the branch anyway. It's always important to put things into perspective like that -- which is something that seems often missing from C++ and similar cultures.
The more proofed out some code gets, the more scrutiny it should undergo obviously.
Apart from that, compilers do check printfs, and I usually get a warning/error when I made a mistake. But I might not get one if I write my own formatting wrappers and am too lazy to explicitly enable the checking.
Again, you're mixing functions up. `std::print` is the equivalent to "std::fprintf", the one you want to write on random buffers is `std::format_to_n`, which IS a strictly better version of `snprintf`.
I'm using C/C++, which do provide a good level of type safety.
And no, types are absolutely not my problem. In fact, rigid type systems are a frequent source of practical problems, and they often shift the game to one where you're solving type puzzles -- instead of working on the actual functionality.
Come on, dude. C provides almost no type safety whatsoever. There’s no point in saying “C/C++” here, because you won’t adopt the C++ solutions for making your code actually typesafe.
Type safety is not a standardized term, and it is not binary. Being black and white about things is almost always bad. One needs to weigh and balance a large number of different concerns.
A lot of "modern C++" is terrible, terrible code precisely because of failing to find a balance.
For basic things, sure. It is much, much worse than this when you deal with different encodings for an application that needs to format and print things.
There are widely used languages that don't have a standard at all; literally every single thing you can do in them is above and beyond any sort of standard.
I mean, the function for straight printing is puts; I don't know why people keep using the much more complicated printf in cases where no formatting is involved.
Edit: OK, I guess puts includes a newline, so you'd need to use fputs if you don't want that (although this example includes one). Still, both of those are much less complicated than printf!
Consistency. Having intermixed puts and printfs throughout the code looks pretty bad. Also, every compiler replaces printf of a literal ending with \n with a puts anyway.
It is a very natural feature. Especially when you are writing mathematical code, e.g. implementing different types of numbers: automatic differentiation, interval arithmetic, big ints, etc.
Overloading gives user defined types the expressiveness of built-in types. Like all features, if it is used badly (e.g. when + is overloaded to an operation which can hardly be interpreted as addition) it makes things worse. But you can write bad code in any language, using any methodology.
It is a very natural feature, but it makes discovering what you can and can't do with a library really hard. Learning what is and isn't legal with math libraries that use a lot of them can be really tricky. For example, numpy code is really easy to read, which is fantastic, but figuring out how you're intended to do things from the documentation alone is quite difficult.
In my experience numpy has also been one of the worst numerics libraries to deal with. The main reason is that Python seems designed to be hostile to numerics. Loose typing, assumptive conversions, specific numeric types that are hard to access, tedious array notations, etc. are all bad preconditions for a language which sadly seems to have become the prototyping standard in that area.
The moment you have a language actually designed for numerics, all these things vanish. One of Julia's core design aspects is multiple dispatch, including operator overloading, and it works extremely well there.
I also don't see the point for discoverability at all. The documentation will list the overloads and the non-overloaded calls are exactly as discoverable as the others.
As someone who has written math libraries over and over again for the last 25 years (I wish I was joking, but it turns out it is something I'm good at [1]), I find that operator overloading works only for the simple cases, and that for performance and clarity, function names work best.
Function names let you clarify whether it is an outer product or an inner product (e.g. there are often different types of adds, multiplies, divides), and I cannot stand when someone maps cross product onto ^ (because you can both exponentiate and cross-product some vectors, like quaternions, so why use the exponent operator for cross?) or maps dot product onto something else that doesn't make any sense. Also, operator overloading often doesn't make memory management clear; rather, it relies on making new objects constantly, whereas with explicit functions, you can pass in an additional parameter that will take the result. Lastly, explicit functions allow you to pass additional information, like how to handle various conditions: non-invertible, divide by zero, etc.
I find word-based functions more verbose but significantly less error prone, and also more performant (because of the full control over new object creation). Operator overloading is only good for very simple code, and even then people always push it too far, to the point that I cannot understand it.
> rather it relies on making new objects constantly, whereas with explicit functions, you can pass in an additional parameter that will take the result.
It's not the same if you need to allocate memory for the result. If you could pass the result in by reference, then you could (re)use a buffer which has already been allocated. The difference is massive in things like matrix calculations or image processing where you have an inner loop or a real-time stream repeating similar calculations.
Or you are working with a language like JavaScript where math primitives are GC objects and thus quite costly. In those languages if you do not reduce object creation via reuse in this way, it can be very slow.
Perhaps you're arguing that you ought to be able to name new operators (like Haskell) so that you can create a new operator for inner product instead of having to use '^' (typically used for exp or xor).
Alternatively, the main reason to use operators here is infix notation, so perhaps Haskell-like backticks.
I think languages like Julia make a strong case the other way: you can literally write algorithms that match the pseudocode in a paper. You have to be OK with unicode in your source file, but for numeric stuff, I think it's a nice feature.
Yeah it's a tiny bit clumsier, and prefix notation takes some getting used to. But on the plus side we avoid all the too-clever travesties programmers have inflicted on us with bad operator overloading decisions! On the whole I think it's easily worth the trade.
Again, no thanks. I want mathematical notation and I simply won't use any language without operator overloading. Free functions for common mathematical operations are an abomination.
Then you should probably use a language that lets you write DSLs for any given domain, rather than abusing operator overloading which just happens to work for a few subdomains of mathematics (e.g., you can't use mathematical conventions for dot product multiplication in C++). Anyway, I've never seen any bugs because someone misunderstood what a `mul()` function does, but I've definitely seen bugs because they didn't know that an operator was overloaded (spooky action at a distance vibes).
Actually, I'm quite happy what C++ has to offer :)
Yes, the * operator can be ambiguous in the context of classic vector math (although that is just a matter of documentation), but not so much with SIMD vectors, audio vectors, etc.
Again:
a) vec4 = (vec1 - vec2) * 0.5 + vec3 * 0.3;
or
b) vec4 = plus(mul(minus(vec1, vec2), 0.5), mul(vec3, 0.3));
Which one is more readable? That's pretty much the perfect use case for operator overloading.
Regarding the * operator, I think glm got it right: * is element-wise multiplication, making it consistent with the +,-,/ operators; dot-product and cross-product are done with dedicated free functions (glm::dot and glm::cross).
One never writes such an expression in serious code. Even with move semantics and lazy evaluation proxies it is hard to avoid unnecessary copies. Explicit temporaries make code more readable and performant:
auto t = minus(vec1, vec2);
mul_by(t, 0.5/0.3);
add(t, vec3);
mul_by(t, 0.3);
v4 = std::move(t);
I think there may be a misunderstanding here regarding the use case. If the vectors are large and allocated on the heap/on an accelerator, then yes, writing out explicit temporaries may be faster. Of course, this does not preclude operator overloading at all: You could write the same code as auto t = vec1 - vec2; t *= 0.5/0.3; t += vec3; t *= 0.3;
However, if the operands are small (e.g. 2/3/4 element vectors are very common), then "unnecessary copies" or move semantics don't come into play at all. These are value types and the compiler would boil them down to the same assembly as the code you post above. Many modern C++ codebases in scientific computing, rendering, or the game industry make use of vector classes with operator overloading, with no performance drawbacks whatsoever; however, code is much more readable, as it matches actual mathematical notation.
> Many modern C++ codebases in scientific computing, rendering, or the game industry make use of vector classes with operator overloading, with no performance drawbacks whatsoever
I guess these people are all not writing "serious code" :-p
TIL Box2D must not be serious code because it doesn't use copious amounts of explicit temporaries[0].
And just for the record, I'm very glad Erin Catto decided to use operator overloading in his code. It made it much easier for me to read and understand what the code was doing as opposed to it being overly verbose and noisy.
> One never writes such expression in a serious code.
Oh please, because you know exactly which kind of code I write? I'm pretty sure that with glm::vec3 the compiler can optimize this just fine. Also, "vec" could really be anything, it is just a placeholder.
That being said, if you need to break up your statements, you can do so with operators:
auto t = vec1 - vec2;
t *= 0.5/0.3;
t += vec3;
t *= 0.3;
Personally, I find this much more readable. But hey, apparently there are people who really prefer free functions. I accept that.
Of course, the compiler or an advanced IDE can know what your code means. If all your identifiers were random permutations of l and I: lIllI1lI, your IDE would not mind either, but the code would be horrific, don't you agree? The point of the OP is that overloaded operators (and functions) make it harder to reason about the code for a human that reads it. At least for some people. At the end, everything is "just" syntactic sugar, but it makes a significant difference.
Exactly. If you don't care that the code is unreadable and you can rely on every human viewing the code through an IDE with symbol resolution (and not say, online code review platforms) and remembering to use said symbol resolution to check every operator, then operator overloading is great!
If editors were to implement it, you could navigate to the corresponding overload implementation or even provide some hint text. Just like they do for other functions.
Yeah, we would need editors and code review tools to not only follow overloads to their functions but also highlight that the operator is overloaded in the first place. Of course, this is quite a lot more work than just not overloading things in the first place (particularly since the benefit of operator overloading is negligible).
Dealing with money is important, even if it's only a small part of mathematics. I'll focus on that.
Python's 'decimal' module uses overloaded operators so you can do things like:
from decimal import Decimal as D
tax_rate = D('0.0765')
subtotal = 0
for item in purchase:
subtotal += item.price * item.count # assume price is a Decimal
taxes = (subtotal * tax_rate).quantize(D('0.00'))
total = subtotal + taxes
Plus, there's support for different rounding modes and precision. In Python's case, something like "a / b" will look to a thread-specific context which specifies the appropriate settings:
>>> import decimal
>>> from decimal import localcontext, Decimal as D
>>> D(1) / D(8)
Decimal('0.125')
>>> with localcontext(prec=2):
... D(1) / D(8)
...
Decimal('0.12')
>>> with localcontext(prec=2, rounding=decimal.ROUND_CEILING):
... D(1) / D(8)
...
Decimal('0.13')
Laws can specify which settings to use, for examples, https://www.law.cornell.edu/cfr/text/40/1065.20 includes "Use the following rounding convention, which is consistent with ASTM E29 and NIST SP 811",
(1) If the first (left-most) digit to be removed is less than five, remove all the appropriate digits without changing the digits that remain. For example, 3.141593 rounded to the second decimal place is 3.14.
(2) If the first digit to be removed is greater than five, remove all the appropriate digits and increase the lowest-value remaining digit by one. For example, 3.141593 rounded to the fourth decimal place is 3.1416.
... (I've left out some lines)
(3) Divide the result in paragraph (a)(2) of this section by 5.5, and round
down to three decimal places to compute the fuel cost adjustment factor;
(4) Add the result in paragraph (a)(3) of this section to $1.91;
(5) Divide the result in paragraph (a)(4) of this section by 480;
(6) Round the result in paragraph (a)(5) of this section down to five decimal
places to compute the mileage rate.
There's probably laws which require multiple and different rounding modes in the calculation.
This means simply doing all of the calculations in scaled bigints or as fractions won't really work.
Now of course, you could indeed handle all of this with prefix functions and with explicit context in the function call, but it's going to be more verbose, and obscure the calculation you want to do. I mean, it's not seriously worse. Compare:
But it is worse. I also originally made a typo in the function-based API for line 5, where I used "decimal_add" instead of "decimal_div" - the symbols "/" and "+" stand out more, and are less likely to be copy&pasted/auto-completed incorrectly.
If overloaded parameters - "spooky action at a distance vibes" - also aren't allowed, then this becomes rather more complicated.
Why have operators at all? If that notation is good enough, then you might as well use it for the built-in types too. We're halfway to designing a Lisp!
Sorry but please don't take Eigen (https://eigen.tuxfamily.org) away from me. Can't speak for others, but the scientific code I work on would become unreadable like that.
You read the code. And unlike operator overloading, you know at a glance exactly which implementation to look at. There is no spooky action at a distance.
and to know which `plus` is being dispatched, you need to know the types of both arguments, exactly the same as if `plus` is named `__add__` in python or `operator+` in C++.
They're functions. Whatever they do, my code will go execute some code from whatever library implements them, which is what a function does. I just want to be able to rely on [] being an array subscript when I read some unfamiliar code. Is that too much to ask?
Can IDE's detect this and offer "go to implementation" on an overloaded operator these days? Because besides from the surprise-element in the fact that there even is code to debug hiding somewhere, not being able to quickly navigate to it is much worse in my opinion. And with infix operators where you can't even be sure which operand the implementation belongs to, figuring it out can be a bit of a detective task.
Yes (I’m a numerical analysis researcher and wrote a handful of ubiquitous mathematical packages)! The implementation of even primitive types can vary considerably! It’s way too much to hide. Nevertheless, I understand I’m biased. :)
In engineering practice, we often start using math without first consulting numerical analysts. It takes a long time to identify and fix the inevitable issues, which eventually becomes a lesson we have to teach students and practicing engineers because the field has accumulated so much historical baggage from doing it the wrong way.
As an example, early device models for circuit simulation were not designed to be numerically differentiable, leading to serious numerical artifacts and performance issues. Now we have courses dedicated to designing such models, and numerical analysis is used and emphasized throughout.
Is there anything today that you look at and think "yeah, they're gonna need to fix that at some point"?
Vectors and perhaps matrices are about the only valid use case I have ever come across, so I agree with GP that it's not worth it. And that's speaking as someone who once implemented a path class with a subtraction operator that would return the relative path between two absolute ones. I thought I was very clever. I feel sorry for the developers who had to figure out what that was about after I left..
Ah, matrices! Does * mean dot product, cross product, or the usually less useful matrix multiplication? Ooh, or maybe you should use . for dot! How clever that would be!
> And that's speaking as someone who once implemented a path class with a subtraction operator that would return the relative path between two absolute ones. I thought I was very clever.
Haha! It's ok. The temptation to be clever with operators is too strong, few can resist before getting burned (or more usually, burning others!) at least once.
> Ah, matrices! Does * mean dot product, cross product, or the usually less useful matrix multiplication? Ooh, or maybe you should you use . for dot! How clever that would be!
Why the snark? The fact that you're free to make a bad choice does not imply that having a free choice must be bad. Obviously neither dot nor cross product should be *. It should be the Hadamard product or matrix multiplication. You can choose one convention for your code and be perfectly happy for it.
As a follow-up question: How do you feel about languages like Fortran and Matlab then? Is it actually a good thing that mathematics convenience features are relegated to a few math-oriented languages and kept away from all the others? (Or are the linear algebra types in these languages offensive as well?)
The benefits from operator overloading are "I can show this to a mathematician and it looks like what they're used to". The downsides lurk in the corners of whether it's actually doing what you think.
In C++34 we'll finally have a way of overloading the empty string operator, so that we can, at last, write AB for the product of matrices A and B. As God intended.
Overloading is orthogonal to the issue you're striking at: infix operators versus postfix function calls. Functions can be overloaded just like operators.
What if you could type the asterisk to multiply vectors, but then your editor of choice would replace it with the symbol that actually means vector multiplication?
Perhaps that idea falls apart once you realize you would need hundreds of symbols for just addition…
But what if those symbols were (automatically) imported and named at the beginning?
Perhaps it would be annoyingly inconsistent how in various files different symbols are used for the same operation…
The idea is, operator overloading is a convenience feature. Why not have that convenience as an option in an editor, without influencing the language? If you want scalar multiplication to look the same as vector multiplication, set it in your editor. If you want to insert scalar multiplication with the same key you insert vector multiplication, set it in the editor (to figure out which you mean, based on context, when you press that key).
Just to be clear, I'm not being a smartass, just considering this as an option and wondering if the HN crowd has some thoughts on this.
That said, in my experience over the decades, operator overloading has been one of the primary causes of bugs that are very hard to pin down, so I have come to hate it. It hides far too much.
The cost/benefit ratio of operator overloading is generally unfavorable in practice, in my experience. Which is not to say it shouldn't be used when it actually clarifies things! But those situations tend to be fairly niche.
Interestingly, where I work right now, using operator overloading is specifically prohibited. So I'm confident that my dislike of the practice is not just a personal quirk.
That's literally the only place where operator overloading makes any sense.
In other places overloads just monkey-patch C++'s deficiencies as a language.
And they are confusing and error-prone.
Nobody is pretending we will get rid of any c++ syntax ever. So the discussion is about a hypothetical language syntax that fits C++ slot.
In that world C++ would have N x M matrices as native value types in the language (as fortran does) and those operators would be defined in the language spec for matrix types just as they are defined for standard number types at the moment.
Not making assumptions about what is correct or proper use is why C++ is so successful: it leaves that up to the project/community using it.
Go ahead, make a language that dictates a lot and makes strict assumptions; it will be deprecated or forced to open up before the end of the decade.
Note: this is why I think Python and Lisp are so popular; metaprogramming is very powerful and expressive.
The fact C++ has so many ways of defining things is not the reason it's so popular. The reason is the enormous industry investment in the language tooling and ecosystem. IMO the language itself is the worst part of the ecosystem, but the other parts create a totality that is the best development language ecosystem in industry for my niche (graphics and geometry), including libraries, compilers, debugging & profiling etc.
Any language with the level of industrial support C++ has had would have grown to prominence. C++ came about at a judicious time in history when "object orientation" was becoming the latest buzzword. And now we have ended up with gazillions of lines of C++ code.
It's a tragedy of our trade that two mongrels - C++ and Javascript - came to be among its most prominent languages.
But the reason C++ fits in so many industries, from embedded systems to high-level GUI libraries, is its flexibility. We see the end of the OOP trend, but C++ does not lock its users into one paradigm or another, so it will continue to be an industry standard even as the industry moves towards other paradigms of programming.
Adding to that: JavaScript really only has one industry it's used in. I think that says a bit about its versatility.
That's a good observation about C++ not making assumptions; it strikes me as true. C++ apparently doesn't even make assumptions about what the C++ filename extension is: .h, .hh, .hpp, .hxx, .C, .cc, .cpp, .cxx, .ixx, .cppm
> That't literally the only place where operator overloading makes any sense.
That may be true for C++ (I'll take your word for it), but not for all programming languages in general. For example, in C# it's fairly common to overload == and != to implement value equality for reference types (classes).
Of course, you should really only do this for immutable classes that are mostly just records of plain old data. And C# 9 introduced record classes, which is a more convenient way of defining such classes. But record classes still overload these operators themselves, so you don't have to do it manually.
Honestly that sort of thing always confused me when I worked in Java, C#, etc. I could never tell at a glance whether the operator was doing an identity comparison or a value comparison, and I definitely contributed a few bugs from this misunderstanding. In Go, which lacks operator overloading, we write `ptr1.Equals(ptr2)` for value comparisons and `ptr1 == ptr2` for pointer comparisons--in either case, there is no ambiguity and IME fewer bugs.
Java's the same regarding == and .equals( ) and when it's Java code written by devs who also work in other languages, it definitely still results in bugs, sometimes that go undiscovered for remarkably long times (particularly if == happens to return the right result in most cases). Meaning/needing to compare references for string (and similar) types is exceedingly uncommon, yet uses the more "natural" syntax for testing equality.
FWIW I can't remember working with a codebase where unexpected behavior due to operator overloading was a serious problem.
Operator overloading is a useful feature that saves a bunch of time and makes code way more readable.
You can quibble whether operator<<() is a good idea on streams and perhaps C++ takes the concept too far with operator,() but the basic idea makes a lot of sense.
string("hello ") + string("world");
complexNumber2 * complexNumber2;
for (int i : std::views::iota(0, 6)
| std::views::filter(even)
| std::views::transform(square))
someSmartPtr->methodOnWrapperClass();
The majority of time in professional codebases is not spent on typing but reading and understanding code.
"saves a bunch of time and makes code way more readable"
Not when everybody defines their own operators.
Note - we are discussing operator overloading, not operators as features in syntax. Operators at the syntax level make life a lot easier. But then everybody uses the exactly same operator semantics, not some weird per-project abstraction.
The lines of code you wrote as an example are not saving anyones time, except when writing it if you are a slow typist and lack a proper IDE support for C++. If typing speed is an issue, get a better IDE, don't write terser code.
Code is read more often than written. Writing code that can be understood at a glance (by using common, well understood operators) optimizes for readability.
I think your argument is basically "people should not aggressively violate the implicit bonds of interfaces", which is true. But that goes for all interfaces, not just and not in particular those around operators.
We just have cases where it's common with operators because those are one of the few cases where we have lots of things that meet the interface and interact directly as opposed to hierarchically. The same kind of issue comes up with co/contravariant types and containers sometimes, but that's less often visible to end developers.
I tend to agree with this. I like operator overloading for mathematical constructs (like complex numbers) or even just for conversions of literal types. Imagine, for example, you have a gram type and a second type: if you said 1g / 1s you'd get 1gps, which seems reasonable.
I don't like it in the example given
for (int i : std::views::iota(0, 6)
| std::views::filter(even)
| std::views::transform(square))
What benefit does this have over the Javay/Rusty version that looks like this
for (int i : std::views::iota(0, 6)
.filter(even)
.transform(square))
?
No deducing what `|` means, you are applying the filter then transform function against the view.
People don't use the same operator semantics. Is + commutative? Does it raise exceptions or sometimes return sentinels? What type conversions might == do?
And how exactly do you propose library authors should work with user-defined types? Operator overloading is what allows algorithms to be efficiently generalized across types.
The code isn't readable (you can't even reliably tell at a glance what the operator does) and it takes negligibly longer to write "add()" rather than "+" in your program (yes, 'add()' is more keystrokes and thus takes longer to type, but most of your program isn't addition instructions).
I think what people should advocate is full DSL capabilities with some unambiguous gate syntax so people know precisely that `foo * bar` is not using the host language syntax. Overloading operators is ambiguous and vastly incomplete (everyone is holding up matrix math as the shining example for the utility of operator overloading and you can't even express dot product notation in C++!)--it's a hack at best.
> The code isn't readable (you can't even reliably tell at a glance what the operator does) and it takes negligibly longer to write "add()" rather than "+" in your program (yes, 'add()' is more keystrokes and thus takes longer to type, but most of your program isn't addition instructions).
Except now you replaced + with a name that tells you just as much/little as + does. So you made your program verbose for the sake of verbosity.
No, you’ve made your program “verbose” (by a handful of characters) for the sake of clarity—there is no longer ambiguity about what code runs (of course, this assumes you aren’t similarly overloading named functions, which should also be disallowed).
That was me, but I didn't provide example code that would require namespaces. I don't understand how your earlier comment makes sense in the context of this thread.
Unfortunately, like spicy peppers, everyone's definition of "too much" is different. Some people are eating ghost chili peppers just fine while others are struggling with ketchup.
All the things you wrote could be about as easily written & much more easily read without operator overloading. Operator overloading only allows programmers to feel "smart" for doing a "clever" thing, to the detriment of future readers.
string("hello ").append("world");
complexNumber2.mult(complexNumber2);
// wtf is even going on with this one in your example? have these people never heard of method chaining?
for(int i : std::views::iota(0,6).filter(even).transform(square))
(*smartPtr)->methodOnWrapperClass();
That's all about the same verbosity, it's much more clear to the reader even if they're unfamiliar with your codebase, and dropping operator overloading eliminates the "clever" option to do stupid crap like divide file path objects together.
Would you advocate getting rid of operators altogether?
3.times(2).plus(7)
Some things just lend themselves to being expressed in terms of simple operators.
(*smartPtr)->methodOnWrapperClass();
That is still using the overloaded SmartPtr<>::operator*() method.
I understand the viewpoint that operator overload is syntactic sugar for things that can easily be done another way, I just disagree that costs outweigh the benefits.
> Would you advocate getting rid of operators altogether?
Of course not. It makes sense for built-in types, as everyone reading the code can be assumed to know them.
> That is still using the overloaded SmartPtr<>::operator*() method.
Good catch ;)
> I just disagree that costs outweigh the benefits.
Yah, I think that's the disagreement. My feeling is there's a teeny, tiny handful of appropriate places for it (almost entirely math) and it opens up a pandora's box of terrible decisions that programmers clearly find irresistible.
As a good thing or a bad thing? I see a.equals(b) occasionally from the "first argument is magic" crowd, but 3.times is novel here. I'm really unsure what the order of operations is for that expression.
“Fried shrimp should be removed from the all-you-can-eat Chinese buffet because I can't help myself from eating at least 20 of them in a single sitting and now I have stomach cramps”
The very first Hello World program anyone learning C++ will write uses the godawful iostream bitshifting operators! Not even the language's authors could help themselves eating 20 fried shrimp on the first day the buffet was open!
The iostream bitshift overload was one of the first features of C++ that I learned to despise. I'm very happy that there's an alternative in the new version.
How far are you prepared to take this stance, exactly? C has operators that are generic over both integral and floating point types. Was that a mistake? Did OCaml do it better?
For my part, I've been persuaded that generic operators like that are a net win for math-heavy code, especially vector and matrix math. Sure, C++ goes too far, but there are middle grounds that don't.
Having operators defined for value types within the language spec is a different thing from defining operator overloading for arbitrary struct and class types.
For numeric value types mathematical operators are the only sane option.
For arbitrary classes - not so much.
A sane language in the slot of C++ in the language ecosystem would not have operator overloading. It would have matrix types defined in the language spec with mathematical operators operating on them.
One part of the philosophy of the language maintainers is that they're somewhat humble about their designs in the standard library, and very much against breaking changes.
Some folks prefer absl's flat_hash_map over std::unordered_map for a hash table, and it's not great that you need to choose or risk having both in a codebase, but it _is_ nice that you can have your preferred hash table and use operator[] whichever you decide.
Python also has operator overloading, and people seem to like that numpy can exist using it. And container types. Weirdly doesn't cause much consternation compared to C++ (maybe because the criticisms of the latter come from C programmers?)
I've occasionally missed overloading in JS/TS though.
> It would have matrix types defined in the language spec with mathematical operators operating on them.
This is unfortunately impossible (IMO). The problem is matrices have multiple operations that don't translate nicely like complex numbers do. If you want to be consistent, you have to pick and choose what A * B means, under which contexts, and when that is illegal (or what should happen on an error).
For complex numbers, there's only one definition of A * B that matters and no failure cases.
I fear there's no clean way to do matrix operations that won't make some community really irritated for choosing "wrong". (Physics, engineering, science, etc.)
Operator overloading is critical for building ergonomic frameworks.
The modern web is built on overloading the . operator (e.g. ORMs like Rails and Django). We will never see a Tier-1 ORM in Golang simply because it lacks it.
> Operator overloading is critical for building ergonomic frameworks.
> The modern web is built on overloading the . operator (e.g. ORMs like Rails and Django). We will never see a Tier-1 ORM in Golang simply because it lacks it.
As I said, there won't be a Tier-1 ORM in Go. Ent or Gorm are tier-2 at best. They can get the job done, but it ain't pretty.
Any advantages of Go (and there are many) are outweighed by the fact that you have to write and read 2x more code to be equally productive as Rails or Django.
That sounds like a good thing, having dealt with Hibernate in production. As a backend developer, I'm pretty happy with C++17 (and beyond), Go and Rust. All of them can be used in fairly explicit ways, which means debugging a problem is easy, and performance issues are right there on the page if any. I want less magic, not more.
Magic is magic until it becomes understood, then it is science.
While I don't want junior programmers wielding the dark magic of operator overloading, I trust that the engineers behind Django are using it reasonably.
I'll byte: complex numbers and matrix support is bad in languages without operator overloading. Why should only the primitive types of the language be privileged to proper math notation?
Not having operator overloading is anti-human. To think so highly of yourself that there is no other thing that can properly be the subject of the field operators (or other basic operators) is the height of hubris. The compiler typically must handle the operators on certain types due to the compilation target's semantics, but in reality, there's nothing special about these 'built-ins'.
Operators like +, -, /, *, etc have meanings independent of integers and floats and to not allow these meanings to be expressed is sad.
I've heard many programmers express this sentiment and what they actually are attempting to argue is that having overloads of these operators that do not respect the corresponding group, ring, or field laws is confusing. This I agree with. Operators should mainly follow the proper abstract semantics.
BS. I thought that Java already demonstrated to the world how dumb it is to disallow operator overloading altogether.
Allowing ANY operator to be overloaded was dumb, like C++ did, where you could do batshit crazy stuff like overloading unary & (addressof) or the comma operator (!), or stuff like the assignment operator (that actually opens a parenthesis about how copy/move semantics in C++ are a total hack that completely goes OT compared to this).
Sensible operator overloading makes a lot of sense, especially when combined with traits that clearly define what operations are supported and disallow arbitrary code to define new operators on existing types. Rust does precisely that, and IMHO it works great and provides a much nicer experience than Java's verbose mess of method chaining.
I'm on your side, but only after many years of being on the other side. I used to think they were "graceful" and "minimalist", and refused to acknowledge they can be the source of many surprises.
The Google C++ style guide has a very nice overview. There are only two pros listed, and large number of cons. And this document is old by Internet (dog) years -- at least 10 years.
Consider the humble + operator. In most compiled languages -- even those that don't support operator overloading -- it is in fact overloaded. int + int. long + long. float + float. double + double. pointer + int. Would every language be better without it?
Built in operators don't always map 1-1 to CPU instructions so don't appeal to that authority. There are still plenty of CPUs -- old and new -- without multiplication, division, or floating point support.
You could argue that there is just one type (tensor) with some invalid operations between its values (e.g., when dimensions mismatch). Just like integer division by zero.
I disagree; it's heavily abused but very useful for types where it's obvious what the operation is (inherently mathematical types like vectors and matrices). I wrote a macro library for C that does vector/matrix math in prefix notation with _Generic overloads, and it's still too clumsy to get used to.
> where it’s obvious what the operation is (inherently mathematical types like vectors and matrices)
Considering there are like 3 different types of matrix multiplication operations, I don't think it's obvious at all. Feels like you should either use a language with complete support for implementing custom DSLs (that can express the whole domain naturally) or eschew ambiguous operator overloading altogether (gaining consistency and quality at the expense of a few keystrokes).
I think we all know what someone means when they say “matrix multiplication”. Asserting that * could mean, say, the Hadamard product or the tensor product is a reach. In practice I have never seen it mean anything else for matrices.
DSLs just push the complexity away from the language into someone else’s problem in a way that has much higher sum complexity. You’re making authors of numerical libraries second-class citizens by doing so. For some languages that’s probably not a bad choice (Go is one example where I don’t feel the language is targeted at such use cases).
Also, the lack of a standard interface for things like addition, multiplication, etc. means that mathematical code becomes less composable between different libraries. Unless everyone standardizes on the same DSL, but I find this an unlikely proposition, given that DSLs are far more opinionated than mere operator overloads.
I've never understood why people complain a lot about `std::cout << "string"`, if the problem is that this operator is used for bit shifting, simply stop thinking that way (genius I know), do you think of addition when you see `string + "concatenate"`? Operator overloading is awesome, and like everything in programming, if used correctly; constructing paths with / is sweet, and I find << with streams visually appealing and expressive, it's feeding data to the stdout/file/etc, same for `std::cin >> var`, data goes from the stdin to the variable.
> do you think of addition when you see `string + "concatenate"`
Yes. And it tortures me every time.
I religiously avoid string concatenation in Python for this very reason. It's not that "+" necessarily means addition; it's that it always means a commutative operation (to somebody who has learned some algebra). String concatenation is notoriously non-commutative, thus it is extremely disturbing to write it using a visibly commutative operator. Any other operator except "+" would be better. For example, a space, or a product, or a hyphen. Whatever. But please, not a commutative operator. It breaks my brain parser.
It's also one of the biggest sources of bugs when dealing with loose types around strings and numbers.
When it comes to languages that let you mix strings and numbers, Lua has it right. + always adds, and accepts numbers and strings that can cleanly convert to numbers. .. always concatenates, and accepts numbers and strings.
Aside from the syntax (I find it ugly, you find it visually appealing - it's subjective), iostreams are inefficient, awkward to customise, not thread safe (allows interleaving), and mix format and data (that one is also subjective).
I love me some operator overloading. I love / for filesystem separators, I love | for piping things. I don't like << and >> so much but that's just because of too many years of writing them everywhere.
In C++, with its templates, there are only a few alternatives:
1. Operator overloading
2. Operator desugaring (e.g. __subscript__(), which substitutes the intrinsic function for basic types, but can also be defined for user defined types)
3. Writing templates with weird adaptors for primitive types.
Given that its design goal was to embed C, there were already operators that worked with various and mixed types. Adding (+.), etc., would have been unacceptable to the users. So, I think in general, for this language, it was good but, unfortunately, iostream made people think you should overload the behavioral expectation, too.
Its design goal was shaped by the author's experience of having to downgrade himself from Simula to BCPL; he wasn't keen on repeating that experience with C when he went out to write distributed computing applications at Bell Labs.
Bjarne has given a couple of interviews on the matter.
The alternative GP is advocating for is "none of the above." Meaning "operators are only defined for primitive types" which is perfectly fine when working with C.
If you wanted to use an abstraction over non primitive types for such things you would use a normal function.
Built in types are special. The compiler needs to define operator precedence and certain semantics (commutativity, associativity, overflow, etc) that cannot be expressed in the type system or enforced by the interface.
The fact that it makes syntax "nicer" for user defined types is at best subjective and at worst an anti-pattern because it leads to bad compiled code and confused programmers. Function calls are unambiguous and follow the same rules as other function calls, while operator overloading does not.
Whatever you save in avoiding having to write "sum = add(l, r)" is not worth allowing programmers to bitshift file handles[1], divide file path objects[2], or subscript objects to launch a process[3].
C++ "bitshifts" (yes, but it's just a symbol in this context) make no real difference to me. It was type safe when it wasn't easy to be back in the day, everyone knows exactly how they work - it's never shot me in the foot (whereas the real meat of the design has, independently of how the composition is expressed syntactically).
The others are more sus but you can make a bad API out of anything.
Rewriting (say) a load of calculations as a tree of sum pow exp etc. is just a huge burden - a codebase I work on has a formula that takes up about 6 lines, for example, output by Mathematica: a total pain to translate to function calls.
> The others are more sus but you can make a bad API out of anything.
Yahhhh but there's something about operator overloading that is like catnip to clever programmers. Ah ha, file paths have slashes, and the divide operator is a slash! Clearly I should use the divide operator to append paths segments together! I'm so clever! Ah ha, << looks kinda like an arrow I guess, so we can use it to, uh, pass data between objects, I guess. I'm so clever...?
It's irresistible. We have abused it and we must give it up for the good of all.
Operator overloading has been a feature of many languages dating back to the very concept of using operator notation in programming. I know of no language that has the * operator dedicated solely to a single type. Typically you have at least signed and unsigned overloads, as well as various bit sizes (including larger than the machine word size) and floating-point representations. Extending that to vector operations, arbitrary precision, and others only seems to make sense; it's going with the flow...
Most programming languages use infix notation for mathematical operations but Polish notation for function calls. This creates an inconsistency. In languages like Lisp that use Polish notation throughout, the inconsistency does not exist.
One could argue that if a programming language has this inconsistency, then one should at least try to be consistent with one's notation, i.e. for mathematical operations use infix notation (operator overloading).
Agree. It looks fun when I am writing the code and re-inventing abstract algebra and category theory types for classifying cat pictures. However then at some point I have to read someone else's code, even my own code weeks later and then I start cursing operator overloading.
Operator overloading is one of the cornerstones of generic programming in C++. And perhaps it is a failure of imagination on my part, but it’s difficult to think of a more elegant approach.
If you just need a nice print: fmtlib is a really nice C++23-style implementation without needing C++23 compiler support. Highly recommend it. It’s simple. It’s fast.
I think Barry under-estimates how long it will be before C++ programmers actually get the equivalent of #[derive(Debug)] and as a result the impact on ergonomics. But of course I might be wrong.
This works on my RHEL9-compatible system for a .c file (using gcc). The type specifier for main is implicitly `int`. You get some warnings about implicit types and implicit declarations, but you get a binary that when executed writes "Hello, world".
Is there now in C++, after all these years, something like Python's f-strings, or at least something coming close? If not, I remain in my disappointed state about C++.
Slightly off topic, but I recently learned that implementing the opposite of what you've asked for, bitshift stdout in python, is only a few lines of code:
If people don’t have time to keep up with a language's updates (which, in the case of C++, is currently _once_ every three years), then they don’t have time to complain about the lack of features, either. This one had the time to complain and just didn’t want to bother typing "c++ string formatting", which would have been fewer keystrokes than the comment complaining.
On DuckDuckGo, the very first result for "c++ string formatting" is the exact thing this person was complaining about.
That bit about wanting to change my mind... it strikes a chord! Somehow the let-downs must have been too much for me at a certain point in time. That being said, I'm curious to find out right now. Edit: no such thing as string interpolation found in C++, at least not in my first 4 search hits. I'll crawl back.
- template-function parameters (NTTPs with a function parameter syntax rather than a template parameter syntax, tentatively spelled as `void foo(template int constant){}` )
- Scalable reflection, in combination with expansion statements (likely in C++26, spelled `template for (auto& e : array) {}` ) which would allow you to write an arbitrary parser for the template string into real C++ code. Reflection targets C++29.
Syntax2 already supports string interpolation as a special compiler feature.
That type of interpolation is something most non-scripting languages don't have anyway; it took Python several decades to get it, and it has only had it for the last 5 years or so.
I should have said the "latest standard", not "spec", if we're being technical. But EVERY bit of official material is very clear about asserting that C++23 is still a preview/in-progress, not a standard. Saying otherwise is, strictly speaking, incorrect.
And quite frankly, what matters to devs is what tooling supports the specification without special configuration, and the answer is "basically none". Not a single compiler fully supports it.
fmt has been available for years and it works with ridiculously old compilers. It’s great to have it standardized but it’s not a new capability that C++ didn’t already have.
My guess is you never had to parachute into a project using operator overloading in strange, inconsistent, and undocumented ways with no original maintainers to show you the ropes.
I actually like operator overloading, but overloading the shift operators for I/O was still a mistake IMO. It's a mistake even if you ignore that it's a theoretical misuse (I/O and binary shifting have nothing to do with each other semantically). The operator precedence of the binary shift operators is just wrong for I/O.
First, includes either need to be wrapped in angle brackets (for files included from the include path passed to the compiler) or quotes (for paths relative to the current file).
Second, the whole standard library would be huge to pull in, so it is split into many headers, even for symbols in the top level of the std namespace.
Something I've learned recently is that the convention of when to use angle brackets and when to use quotes is not prescribed by the standard but instead is implementation-defined.
#include is a preprocessor directive that substitutes the text of a file in place. import instead makes the exported declarations of a named module unit available to this translation unit. The module interface is compiled once into a binary form, so the things that today get recompiled in every including translation unit (templates, inline and constexpr functions) no longer have to be redundantly recompiled, and the compiler gains a wider view of the program that reduces the need for IPO/LTO (except across statically linked libraries). This obsoletes unity builds and precompiled headers.
Some C++ person recently wrote to the GNU Make mailing list about some grotesque experiments for supporting C++ modules inside GNU Make whereby GNU Make would communicate with some C++ compiler process over sockets and generate dependencies.
Any decent language with modules needs no make utility in the first place. You tell the compiler to build/update the program. The compiler compiles (if necessary) the interface definitions the program references, and in that vein recursively updates the whole tree, compiling anything that has changed; then links the program.
I didn't need any make utility when working with Modula-2 in 1990!
I’ve always wondered why we use make in the first place. Is it really so hard to write a Python script that keeps track of file timestamps in a JSON? gcc can even be invoked with certain flags to print header dependencies. Make is crusty, archaic, and overdesigned.
C++ has namespacing, which makes sense: this language has an enormous number of available 3rd-party libraries, and without namespacing you can't help stepping on each other's toes.
There are two ways you might want to have this work anyway despite namespacing. One option would be that you just import the namespace and get all the stuff in it, this is popular in Java for example, however in C++ this is a bit fraught because while you can import a namespace, you actually get everything from that namespace and all containing namespaces.
Because the C++ standard library defines a huge variety of symbols, if you do this you get almost all of those symbols. Most words you might think of are in fact standard library symbols in C++, std::end, std::move, std::array, and so on. So if you imported all these symbols into your namespace it's easy to accidentally trip yourself, thus it's usual not to do so at all.
Another option would be to have some magic that introduces certain commonly used features, Rust's preludes do this, both generally (having namespaces for the explicit purpose of exporting names you'd often want together) and specifically ("the" standard library prelude for your Edition is automatically injected into all your Rust software by default). C++ could in principle do something like this but it does not.
The language is designed so that this is possible, although the current compiler does not do it. At one point, the compiler did all the file reading in parallel, but that was eventually turned off because it did not significantly improve compile speed.
std::print comes from the <print> standard header and lives in namespace std. It’s not a global print because, while you might want it in the global namespace, other people do not. For example, my code isn’t CLI code and doesn’t need to print to the CLI, but perhaps I want to print to a printer or something else and have my own print function.
Leave un-namespaced identifiers to those that are declared in the current file and namespace everything else. If you really want, you’re free to add “using namespace std” or otherwise alias the namespace, but keeping standard library functions out of the global namespace as a default is a good thing! (In any language, not just C++)
> If you really want, you’re free to add “using namespace std”
You're free to, but I discourage the habit. It's more verbose to add the namespace:: prefix to symbols, but it sure does make it easier on the devs that have to work with the code later.
Oh, I’m completely with you on this and always prefix my namespaces. Occasionally I will alias long namespaces to short ones, but I never pull identifiers into the global scope and I really dislike when I see code online that does “using namespace” (unless it’s tightly scoped, at least). I’ve been prefixing std:: for years and won’t stop now, I like knowing where an identifier is coming from, which is extra important when you have multiple types of similar containers that you use for different performance characteristics (eg abseil, folly, immer versions of containers vs std containers)