> And C++ just... doesn't have that many real problems.
It does not have many, but it has one and it's big. It's not memory safe.
Check out this talk by Herb Sutter, around minute 43:
This is the first time a government [see note] has actually issued guidance to industry broadly that was the barest paraphrase away from "where possible don't use C and C++"
[ Note: in this context, the "government" is the US Department of Commerce and NIST, which, according to Herb, have issued a detailed guidance in response to the Executive Order on Cybersecurity [2]. While I couldn't find that guidance, I found one by the NSA [3], which saw quite a few comments here on HN in December last year [4] ].
Modern C++ has made big inroads into removing memory safety issues though. Memory safety bugs are relatively uncommon in modern C++ compared to pre C++11 times.
You could argue that C++ is still "unsafe by default" and you'd be right. But most C++ programmers now use smart pointers (unique and shared) rather than raw pointers, vectors instead of raw arrays, etc. And it does make a big difference.
It's not that people can't deal with pointers. The most common issue in C++ is how many ways there are to shoot yourself in the foot with common utils. Like, we use a lot of abseil utils at work that return string_views instead of new strings, and if you're directly passing that somewhere else, you can easily corrupt your memory.
Yeah, all technically the user's fault, but it gets pretty tiring worrying about this stuff when you're writing a high-level application and the few CPU cycles you save don't really matter. No matter how good you are, you will make mistakes, and the chance of one bug corrupting memory in your whole process increases exponentially with the LoC written. Also I have to point out the irony in a language being type-safe but not memory-safe.
Also, I didn't expect Java-looking class syntax to leave member variables uninitialized (garbage) if you don't set a default. That should at least be a compiler warning, I mean how often do you actually want that?
The garbage comes from AsciiStrToUpper returning a new std::string which StripAsciiWhitespace takes as a string_view (implicit conversion). By the time you print foo2, the string is already freed.
This is a classic misuse of rvalues so I’m not sure it’s a huge problem. I don’t think it’s a problem with C++. I probably work at the same place you do btw and I am pretty sure there are rules/guides about how to handle string_view lifetimes
For those downvoting, please explain why you think string_view should take ownership of an rvalue string
All this talk of rvalues, auto, and string_view is PRECISELY the problem with cpp.
There are so many details you need to keep track of, and eventually you make a mistake. Now, yes, I know what all of that does, and how they are supposed to be used, but it does not remove the cognitive load.
Many languages have a substantially lower cognitive load when doing something trivial, such as ToLower().ToUpper(), etc.
Every systems-level language will have these sharp edges handled differently. Have you heard of borrow checkers? Many languages don’t have to concern programmers with rvalues or equivalents at all because they are garbage collected - should everything be garbage collected?
Let’s not make this a language war, there has to be some programming language that does low-level memory operations without batteries included for some types of technologies. When I write code in C++ and I want something like garbage collection or ref counting I can reach for a shared_ptr.
If you don’t want to concern yourself with types, values vs references, or manual memory management don’t choose C++. The default handling is sane if not necessarily intuitive. You shouldn’t create a ref or pseudo-ref (string_view) to data that is not on the heap and no longer allocated to the stack - seems sane. This problem could be easily caught by breaking function calls into separate lines and explicitly specifying the types at each step.
Is Cpp a systems-level language? As I see it, the problems arise from the fact that Cpp is a multi-paradigm language, and as such contains a near endless wealth of features to combine. Also, modern Cpp takes a clear direction away from low level towards a (relatively) more high-level language. The problem really comes from the combination of low-level and high-level paradigms in the same language/codebase, where the expectations of the behavior of each piece of code can vary wildly.
> ... will have these sharp edges handled differently
Perversely, plain C often has a lower cognitive load than Cpp. This is because you must always manually handle pointers, types, object lifetimes, etc. Now, this does result in a high cognitive load, BUT the cases as in Cpp where you might combine auto with a temporary, with a pointer inside, referring to some unseen resource, cannot happen, because you can't do that in C. So the worst case in cognitive load is never as bad as in Cpp. The lack of features means fewer possible combinations to shoot yourself in the foot. (Though C still has many, nor is C an ideal language.)
> Let’s not make this a language war
Discussing the problems of Cpp is not language war. Nor is understanding the merits and problems of the Design of Cpp.
> If you don’t want to concern yourself with types, values vs references, or manual memory management don’t choose C++
But IF YOU DO write Cpp, there is no escaping them. Which is my point.
> Many languages don’t have to concern programmers with rvalues or equivalents at all because they are garbage collected - should everything be garbage collected?
From Xerox's point of view, yes. Unfortunately they lost to the UNIX workstation market.
Believing that one size fits all is a sure way to alienate a lot of people who don't fit that size. Right tool for the right job and giving the freedom for people to solve their problems in the best way is a better way to win people over.
Yeah, you shouldn't have GC in systems-level code any more than I should have lifetime concerns in application-level code. Problem is leaders end up choosing C++ for both and saying y'all should know how to deal with it.
Google also has a lot of stuff in the middle where they're fine with sacrificing a little speed for a lot more safety, but Java etc are too slow. Golang was supposed to satisfy that, but I'm guessing it was too big of a leap from C++.
It's good that you can solve this with 100% of the team following 100% of the time the rules/guides.
It's even better if your language has a way to express "the return value points to data from the input argument, so it's a compile error to pass an rvalue string to this function". The second we got a language able to do that, usable everywhere C++ is (yes, that one), the incapacity of C++ to express this became "a problem with C++". Our expectations have just increased.
Surely it can be caught via static analysis if you assume the common case that the return value is a function of the argument, and not pointing to some static global data. But you will get false positives when somebody does the uncommon case. There is a lack of expressiveness in C++ here.
The "oh shit" moment when we found that our database's indexes got mysteriously corrupted and we had no idea from where. What do we do, fire the entire team to get rid of whoever made the bug?
Last I remember, the lifetime profile stuff was there, but there was still no way to add your own annotations. For some reason, I didn't hear too much about any of this.
I have just taken a look... https://wg21.link/p1179 is actually still not there, right?
I see some interesting stuff in https://discourse.llvm.org/t/rfc-lifetime-annotations-for-c/... from last year. But, in general... once Herb started the whole Lifetime safety profile thing, there seemed to be a lot of interest, but for the last few years things seem to have died down? I think things are better with Visual Studio, but being on Linux... I may need to start developing for Linux from Visual Studio with WSL2 :-(
When you're writing a ton of web backend code dealing with strings all over the place, it's easy to forget to check whether something is accepting and returning a string or string_view. `auto foo2 = StripAsciiWhitespace(AsciiStrToUpper(foo))` doesn't say what's going on.
I didn't downvote, btw. I don't do that in general.
Heh this is normally why I tell my new teammates to be careful about using auto in non-test code outside of well established patterns like iteration. Auto will bite everybody in the butt at some point!
We don't got time for non-auto. Better yet, we should've just written all our stuff in Python or Java. We have no legitimate need (performance etc) for C++, Rust, or even Golang.
Even if you use string_view on the LHS instead of auto, pretty easy to miss the bug here.
Same here. I avoid auto in my code except iterations. Just a mental pattern. I understand that people coming from "typeless" languages might not like it at all, but it is like with guns. You can play with water pistol but you do not play with real guns, you follow some rules if you want it safe. Modern C++ in my opinion supplies enough features and tooling to help with making reasonably sure the rules are followed.
Citation needed? My team generally avoids auto and tends to spell out the types most of the time, which we find makes everything more readable. Code is written once but read many times...
Notice that he says "the main reasons to declare variables using auto are for correctness, performance, maintainability, and robustness—and, yes, convenience, but that’s in last place on the list.". He is arguing literally the opposite of what you do, that using auto avoids bugs ("for correctness").
I'm not going to argue in favour or against "Almost Always Auto". But I see no problem calling it "Modern C++".
imo types don't make things more readable. You know the "fluent programming" thing where you do obj.doThis().doThat().doThis().doAnotherThing() and probably 20 more lines of this? That's entirely because people didn't want to bother writing out LHS types, but whatever version of the language only let you avoid that if you chained things (there was no auto or var). Maybe it's more of a Java practice, though.
Our style guide says "almost never auto," but most people auto everything. They can make the linter replace auto with the actual type if it's really that big a deal. Explicit type wouldn't have avoided this bug.
Arguably an explicit type would have made the bug a little bit more obvious. You basically expect, by default, this kind of string manipulation function to return a std::string. Obviously in this specific case you can return a view, and that makes it more performant; but it's kind of unexpected, and using auto leaves the surprise a bit more hidden.
But sure, you can still make the error with explicit types.
Not to detract from your general point, but are you sure your example corrupts memory? AsciiStrToUpper returns a std::string (it has to), which in your example becomes a temporary object. Temporaries are destructed at the end of the containing full-expression, which in this case is the whole assignment expression. So StripAsciiWhitespace returns a view into a still-living temporary, and foo2's constructor allocates memory and makes a copy. Only then is the temporary deallocated.
Now, if you wrote absl::string_view foo2 = etc., you'd have a dangling view for sure. In practical industrial usage, you'd build with an address sanitizer (it's built into clang: -fsanitize=address), which should catch that issue.
I missed one important little piece in my example, should be `auto foo2` (to make it more of a landmine) or `string_view foo2` and not `string foo2`, as you've noticed. There's an overloaded func taking string_view and returning string_view.
We use an address sanitizer in tests but not in production. IIRC we had one untested log output line like this that caused corruption in prod.
> Also I have to point out the irony in a language being type-safe but not memory-safe.
Misnomer aside, there is no contradiction. Type safety is much easier to reason about at compile time than memory management. So it shouldn't come as a surprise. You need Rust-like borrow-checker-by-default functionality.
It's not surprising that they use static types in C++, but it's not about safety. It's about knowing the size of everything at compile time. Dynamic typing requires moving more towards the "interpreted" side of languages and incurring the performance cost of that.
The contradiction is when the veteran SWEs I work with say we use C++ instead of Python for the added safety of compile-time type checks, treating dynamic typing like a bigger risk than memory trampling (never mind that Python has type-checking too if you really want it).
For very large projects static types in C++ are a lot safer than Python. For simple projects I agree with you, but as you get over 50k loc Python becomes hard to manage, while I work in 10 million lines of C++. Sure it takes more effort, but at that size C++ is a lot easier to manage.
C++ does have issues. However few languages can handle complex problems well. (Rust is very intriguing for the possible ability to do things at the same scale)
10 million C++ lines is 10 million chances for you to misuse memory and cause your entire program to behave unpredictably, from simply crashing to writing garbage directly into your database. Imagine you're on a team of 8 and find one day that your DB's indexes were mysteriously corrupted, what do you do next? It's happened here.
If you use the wrong type in Python, you get an unhandled exception failing to find a property or something, that's about it. Maybe there's a contrived scenario where it'll succeed in a wrong way, but again that's a smaller blast radius
The likelihood of Python crashing is a lot higher, as it is much harder to know you didn't break your error handling path someplace that isn't tested. C++ sometimes has memory issues, but not nearly as often as Python has problems in the error paths.
Python is exception-based, so it's easy to avoid a crash. Say it's a webserver, you catch all exceptions around each endpoint and give HTTP 500 if it's not something you understand to be a 4xx. Exceptions are nice in that you either explicitly handle them or the caller will.
C++ has exceptions too, but I've never used them, so idk. The abseil statuses end up being used like DIY exception handling, and it's ok but a bit easier to mishandle something than in Python. And a segfault cannot(? or shouldn't) be caught and will crash the whole program.
I am not writing a web app, I'm writing an embedded system. Catching an exception is nice, but I still need my code to keep working, which memory leaks do allow, generally for a long time. Yes, we have had some 'memory scribbler' bugs that were a pain to track down, but they are very rare compared to changing Python code and failing to make the right change to an error handling path, so that Python now unwinds to main instead of handling the error correctly. Note that I'm saying Python errors of that nature are more common despite comparing 50k lines of Python to 15m lines of C++.
For short programs Python works great, but it doesn't scale to large programs.
You can comfortably write a large web app in Python or similarly in JS, and people do. For embedded, it's already out of the question for other reasons. And you probably don't want exception-based handling in embedded, yeah.
The problem is C++'s priorities. Priority #1 is performance, and if you look closely, nearly every single instance of this buggy-program hostile attitude comes down to the language and libraries absolutely refusing to do any kind of dynamic safety checking by default. UB can be traced directly from "it's too hard to reason about the implications of buggy programs" to "well, don't do that, stupid programmer".
There's a talk by Scott Meyers (at the D lang conference IIRC) where he methodically takes apart the claim that priority #1 is performance or zero-cost abstractions. Basically, the way the standards body seems to work is that there are ~N different concerns, and people will use whatever subset is appropriate to kill proposals. It gives a fantastic illusion of performance being #1, but C++ makes plenty of choices that are misaligned with performance (e.g. variable aliasing, which it inherited from C but let infect the C++ type system too).
There are tradeoffs. If performance were always #1 with the underlying requirement of portability, you'd be using C. And performance is more of a concern in C++ than in Java.
I'm no expert in this, I'm just a guy who's sick of writing web backends in C++ for no reason.
You'd think so, but C doesn't have the performance crown always either due to language design choices. A big one is aliasing (which languages like Rust and Fortran forbid) which inhibits the ability for very impactful and common low-level optimizations that come up all the time. Performance characteristic differences between C vs C++ are not all that interesting because the language models are so similar (& thus similarly the compilers for them basically have the same model). Indeed, higher level of abstractions can help improve performance as virtual classes will outperform manual attempts at doing similar things (due to devirtualization passes within the compiler). That's actually been a friction point for Rust using LLVM because LLVM is built around the C/C++ language model and it's hard to express certain invariants to it that would result in even more efficient code or you try to use optimization passes that turn out to be broken because they're not really used by the broader C/C++ community (https://github.com/rust-lang/rust/issues/54878).
As far as I can tell, Priority #1 is backwards compatibility. For example, std::regex can be slower than launching PHP and running the regex there¹, but that won't be fixed because it would require an ABI break. I'm not sure how that happened, as neither the poor performance nor the unwillingness to change ABI were unknown. Performance seems to not have been the top priority in the design of std::unordered_map either, which requires reference stability and therefore boxing. This got added in 2011, 18 years after the birth of the stl.
You can implement your own libraries that do this easily. The std libraries have to maintain backwards compatibility and typically solve the base case only - just wrap them to do whatever ref counting or bounds checking you want
This is simply false; modern C++ just replaces old issues with new ones. There are numerous new unsafe footguns that "modern" C++ introduced with lambda captures, coroutine lifetimes, and libraries like std::ranges, where lifetimes and ownership are incredibly difficult to reason about. Not to mention that the web is full of blog posts about how to properly use std::string_view, since a lot of crashes trace back to improper use of it, particularly its constructor, which kind of acts like it takes ownership but doesn't.
> Modern C++ has made big inroads into removing memory safety issues though
Modern C++ adds new features that allow you to write code that doesn't have as many memory safety problems, but all the unsafety (is that a word?) that existed in the past still exists.
Removing features that allow memory unsafety will probably never happen, because one of the main reasons people use C++ is all the legacy code that will still be used for decades.
It's great that you can write C++ using only the newest features, and that does increase safety. But that doesn't offer the same guarantees as languages designed to simply not have these kinds of problems, or that by design flag the points in the code where safety can't be guaranteed by the compiler.
C++ has so many features that it looks like a different language, depending on who's writing it. It can still compile decades-old code written to the oldest standard, or brand-new code with the latest memory safety features.
So is there a compiler flag or external linting tool that can generate warning messages when you're using an older memory-unsafe technique where a safer and more modern technique could replace it?
You can write completely memory-safe code using nothing but C++98 features, like templated smart pointers and whatnot, which wrap around the unsafe parts of your code. You can keep that unsafe part small and get it right "by inspection"; then if it's used through the higher level primitives, it cannot be misused.
The problem is that there's no way to automatically check any of that.
If the code is small and doesn't change or doesn't have to last for too long, that's OK. But when it's changed over the years, it's pretty much impossible to guarantee that the unsafe parts are kept small and contained, especially as programmers leave and new ones come in.
Note that std::vector's at() predates C++11, whereas std::span was added in C++20. It's a myth that C++ is getting more memory-safe; in many aspects it's actually regressing.
Regardless of all that, you can make your own vector class which works exactly how you want and is completely safe.
In a greenfield project where you control every line of C++ code, you can easily achieve very good safety using nothing but ancient C++ features.
Not just memory safety. I mean, you can make your own numeric types that do overflow checks and throw exceptions or whatever. It probably won't be very fast, but it will be solid.
People have used C++ (ancient C++) to make numeric types with units, where you can't add "kilograms per second" to "meters". I remember that from some talk thing I went to in 1999.
std::span doesn't have at() because std::logic_error was a mistake. Using std::vector's at() is a mistake (you could use it as a helper method and make it behave as if it threw std::runtime_error, but it doesn't; that wasn't its intended usage).
All three major C++ standard library implementations have an option to enable assertion checks in both std::vector and std::span operator[]. Enable them unless you are in such a resource limited environment that you can't afford the checks (unlikely). The only reason the standard doesn't require the checks is to allow C++ to be usable in those resource limited environments. You could argue the checks should be enabled by default, but that's not something for the standard to decide, complain to your standard library vendor.
It's not just the indexing you have to worry about, though. It's also out-of-bounds iterators. And while, yes, you can tell your implementation to emit checks for those as well, it's so slow in practice that nobody uses it in optimized release builds (and some implementations don't even support such use).
> Modern C++ has made big inroads into removing memory safety issues though. Memory safety bugs are relatively uncommon in modern C++ compared to pre C++11 times.
I remember when people said the same thing about C++03. Plus ça change.
It was true, which makes it silly to be saying it about C++11, C++14, C++17, ...
In C++98 you could develop a program with some unsafe core of whatever resource management you wanted, and then wrap it with the right classes so that it can't be misused.
After circa 1995, anyone struggling with memory safety issues in greenfield C++ code that was completely under their control (no legacy) had to be a supreme goofball not to be leveraging the language to make that sort of thing go away.
And yet we keep seeing memory safety issues in C++ code even when that code was written greenfield long after 1995. (E.g. IIRC cloudflare's cloudbleed was in a codebase that had been started in about 2012).
Looking at this, that is not clear. The report below (courtesy of Wayback Machine) mentions some external HTML libraries, as well as the development of modules that plug into NGINX, and work on raw memory buffers (I'm guessing, dictated by NGINX). Hard to speculate without seeing the code. Not that it's an excuse; greenfield C++ code could find ways to interface in a bullet-proof way with something or anything that communicates using low-level buffers.
The "Ragel" tool they used evidently generates C that uses raw pointers:
> The Ragel code is converted into generated C code which is then compiled. The C code uses, in the classic C manner, pointers to the HTML document being parsed, and Ragel itself gives the user a lot of control of the movement of those pointers. The underlying bug occurs because of a pointer error.
The bug in these Ragel-generated parsers was somehow hidden or compensated for by some buffering strategy, which they tweaked when introducing some new kinds of parsers "cf-html". Those didn't have the bug, but the different buffering turned on for them exposed the bugs in the Ragel based parsing.
I'm looking at the Ragel State Machine Compiler user guide. Chapter 5 (Interface to Host Program) makes it quite clear what sort of thing the Cloudflare people chose to grapple with. Ragel will write code for you, planted into the middle of any C function anywhere; you must provide numerous predefined variables, under prescribed names, some of which are pointers to data, and so it goes. For some languages there are safer interfaces: for Java and Ruby there is a buffer, and instead of a pointer there is an index into it. Ragel could have been upgraded to have actual C++ support of some kind.
If C++ adopted a way to namespace safe or unsafe code (unsafe by default would keep retro-compatibility) and had the tooling needed to catch memory safety bugs at compile time, that would be enough for me.
The effort needed on tooling would be significant though. I don't see that happening and overtaking Rust.
(btw the correct spelling is Achilles, Achilleus, Akhilleus, Ἀχιλλεύς)
GNU and Clang toolchains have useful diagnostic abilities in this direction, though no single "master switch".
For example, with -Wold-style-cast you can diagnose every use of the (TYPE) EXPR casting notation, which is often seen in lower-level C-like C++ code for punning memory.
Somewhere in some commonly included header for the project you can write declarations for C functions that should not be used, marking them deprecated. Then if people introduce strcpy or malloc or whatever you don't want, that can be diagnosed (and can fail compilation, if desired).
Most programmers sadly keep using C idioms even while writing "modern" C++; just look at recent samples provided by companies with a seat at the ISO table, some of which have even produced famous security reports.
> > And C++ just... doesn't have that many real problems.
> It does not have many, but it has one and it's big. It's not memory safe.
It's not as big a deal as C++ opponents make it out to be. The ratio of memory bugs to other bugs at work is basically zero. The problems that are holding our product back have nothing to do with memory safety. Of the long list of problems we need to solve, that just isn't one of them. If the only thing a new language offered over C++ was that I don't have to think about that 0.01% of bugs anymore, but in exchange I get either a garbage collector or a borrow checker to fight with, I would not switch.
>"It does not have many, but it has one and it's big. It's not memory safe."
Enough of this already. Not going to discuss ancient code. I'm currently using modern C++ and it has enough features to keep one's code reasonably safe if that was the goal. And frankly, I do not remember any reported bugs in my production caused by "memory unsafety". Just plain old logic errors every once in a while.
The computer isn't "memory safe." If you want a language that can express every function of the hardware it runs on, you expose this fact as well.
Careful management of "unsafe" blocks might be an answer, but I suspect we're going to have to dig way deeper until we find a truly "safe" solution. At that point, would it actually matter which language you use?
I'm saying fix them. I'm also saying, I don't think the language is the appropriate level to do this at.
You can see the clear conflict in the mentality of these new languages. "Fast and, somehow, safe!" It's missing the forest for the trees, much like your retort.
> In Android 13, about 21% of all new native code (C/C++/Rust) is in Rust. There are approximately 1.5 million total lines of Rust code in AOSP across new functionality and components such as Keystore2, the new Ultra-wideband (UWB) stack, DNS-over-HTTP3, Android’s Virtualization framework (AVF), and various other components and their open source dependencies. These are low-level components that require a systems language which otherwise would have been implemented in C++.
> To date, there have been zero memory safety vulnerabilities discovered in Android’s Rust code.
> We don’t expect that number to stay zero forever, but given the volume of new Rust code across two Android releases, and the security-sensitive components where it’s being used, it’s a significant result. It demonstrates that Rust is fulfilling its intended purpose of preventing Android’s most common source of vulnerabilities. Historical vulnerability density is greater than 1/kLOC (1 vulnerability per thousand lines of code) in many of Android’s C/C++ components (e.g. media, Bluetooth, NFC, etc). Based on this historical vulnerability density, it’s likely that using Rust has already prevented hundreds of vulnerabilities from reaching production.
But sure, keep telling yourself that it’s possible to write C++ code of this quality by being more careful or using some magical sanitizer or by hiring better developers or whatever.
"And the reality is that there are no absolute guarantees. Ever. The "Rust is safe" is not some kind of absolute guarantee of code safety. Never has been. Anybody who believes that should probably re-take their kindergarten year, and stop believing in the Easter bunny and Santa Claus."
I didn't claim that Rust provides absolute guarantees. I don't know anyone who has claimed that. The only place I've heard that being claimed is people saying it's commonly claimed so by Rust advocates.
Even in my comment I quoted a passage that said "we don’t expect that number to stay zero forever". This is important. Although it has been successful so far, there will be bugs in it, even the odd memory safety bug.
That's still progress! Fewer bugs than before is progress, (mostly) eliminating a class of bugs is progress. I only addressed a person who was saying "there will still be some bugs, so there's no point tackling this at a language level". They're unable to grasp the idea of progress.
[1] https://www.youtube.com/watch?v=ELeZAKCN4tY
[2] https://www.whitehouse.gov/briefing-room/presidential-action...
[3] https://media.defense.gov/2022/Nov/10/2003112742/-1/-1/0/CSI...
[4] https://news.ycombinator.com/item?id=33560227