C++ Safety, in Context (herbsutter.com)
145 points by ingve 8 months ago | 364 comments



> All languages have CVEs, C++ just has more (and C still more); so far in 2024, Rust has 6 CVEs [1], and C and C++ combined have 61 CVEs [2]. So zero isn’t the goal; something like a 90% reduction is necessary, and a 98% reduction is sufficient, to achieve security parity with the levels of language safety provided by MSLs

The author is making a massive assumption that all CVEs are equally serious, but they aren't. Opening the Rust links shows that several of them are denial of service, including regular expression denial of service. I'm not downplaying this, but compare it to the first result in the other link, which involves potential remote code execution (RCE).

Take for example CVE-2022-21658 (https://blog.rust-lang.org/2022/01/20/cve-2022-21658.html) in Rust, related to a filesystem API. It's true, this was a CVE in Rust and not a CVE in C++, but only because C++ doesn't regard the issue as a problem at all. The problem definitely exists in C++, but it's not acknowledged as a problem, let alone fixed. That's why counting CVEs alone is meaningless.
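
To make the failure mode concrete, here's a rough Rust sketch of the check-then-act pattern behind that CVE (naive_remove_dir_all and the demo paths are made-up illustrations, not the actual std code):

    use std::fs;
    use std::io;
    use std::path::Path;

    // Check the file type first, then act on the path by name. Between
    // the symlink_metadata() check and the recursive call, another
    // process can swap the directory for a symlink, so the delete
    // follows the link and removes files outside the target tree.
    fn naive_remove_dir_all(path: &Path) -> io::Result<()> {
        for entry in fs::read_dir(path)? {
            let child = entry?.path();
            let meta = child.symlink_metadata()?; // check...
            if meta.is_dir() {
                naive_remove_dir_all(&child)?; // ...then use: race window here
            } else {
                fs::remove_file(&child)?;
            }
        }
        fs::remove_dir(path)
    }

    fn main() -> io::Result<()> {
        fs::create_dir_all("demo/sub")?;
        fs::write("demo/sub/file.txt", "x")?;
        naive_remove_dir_all(Path::new("demo"))
    }

IIRC the fix described in the blog post was to hold opened directory handles (openat-style) instead of re-resolving paths, so the check and the delete refer to the same object.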

The author concedes that CVEs are not a good metric to measure by, but implies that maybe C and C++ have too many CVEs that shouldn't actually be CVEs and that the C++ community should take more control of CVEs being filed. This is ominous, and makes me fear we might start to see fewer C and C++ CVEs because issues will be closed as "intended behaviour" or "won't fix", like the filesystem issue above.

[1] - https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=rust

[2] - https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=c++


> It's true, this was a CVE in Rust and not a CVE in C++, but only because C++ doesn't regard the issue as a problem at all. The problem definitely exists in C++, but it's not acknowledged as a problem, let alone fixed.

Can you find a link that substantiates your claim? You're throwing out some heavy accusations here that don't seem to match reality at all.

Case in point, this was fixed in both major C++ libraries:

https://github.com/gcc-mirror/gcc/commit/ebf6175464768983a2d...

https://github.com/llvm/llvm-project/commit/4f67a909902d8ab9...

So what C++ community refused to regard this as an issue and refused to fix it? Where is your supporting evidence for your claims?


Saying it was fixed in two of the three C++ standard libraries is irrelevant; the language standard itself specifies that the behavior is undefined.

This would be like saying C++ the language fixed buffer overflows because GCC added bounds checking. Most sensible C++ developers know that you should not depend on undefined behavior to write correct software, and yet your argument is that because some implementations (not all) have decided to provide semantics for this, it's now okay to use it or it's no longer a problem.


Most sensible developers develop software against compiler specifications, not the standard. Very very little useful software can be implemented strictly within what's offered by the standard.


That's not a reason to justify the critical shortcomings of a standard, especially when implementations are known to let practical safety regress simply because the letter of the standard allows it. In that context, the very culture of C++ standardizers and implementers has to change, and the introduction of this paper is a step in the wrong direction in that regard.


Realistically the only improvement to the spec is changing fs races from undefined behaviour to something less program-invalidating. But to what? Unspecified behaviour would require the standard to give a set of possible outcomes, which might not be implementable. Implementation-defined behaviour would still require the compiler to pick and document a specific behaviour, which might also not be possible to guarantee.

The only way to provide stronger guarantees is to rigorously define the behaviour of the OS, which is of course not possible. Not even POSIX does that, and of course C++ targets platforms beyond POSIX.

The reality is that there are a lot of things that are commonly done that are formally undefined (for example mmap, dlopen, OpenMP), and the user has to look for details beyond the C++ standard in other documents (other standards, the compiler manual).

The alternative is a fully defined isolated sandbox, but that would be pointless for a system language and not even Java attempts that.


> Unspecified behaviour would require the standard to give a set of possible outcomes

I don't think there is a requirement that the possible unspecified behaviors are enumerated. The current C++ draft [0] states possible behaviors are "usually" enumerated, but "usually" is not "always", and there's no explicit direction that those behaviors are the only allowable options.

There's also the definition from the C89 spec which doesn't even have that language, only stating that the standard imposes no requirements on the unspecified behavior [1].

[0]: https://eel.is/c++draft/intro.defs#defns.unspecified

[1]: http://port70.net/%7Ensz/c/c89/c89-draft.html#1.6


Or put another way: find a language, any language, that defines all possible scenarios of a file system race condition in a multithreaded, multiprocess system. It's not possible to do such a thing, and of course nobody does. They just avoid using the term "undefined behavior" even though it absolutely is.

Which makes this whole thread absolutely absurd. It's the worst possible example for a Rust vs. C++ CVE comparison, as the language doesn't get an opinion here at all in the first place.


No, it's an excellent example.

Rust issued a CVE, an immediate point release with a fix for the issue and a blog post explaining the problem and what they did.

2 C++ implementations fixed the issue, but no CVE or blog post. No point release either AFAIK.

You harp on the fact that this is undefined in all languages. Yeah sure. But some languages take the report seriously and communicate that to their users. They don't hide behind "spec says UB" or "it's the file system's fault". They take accountability and fix it. Other languages don't because that's the prevailing culture there.

That's what you're missing when you're trying to make it seem like there's no difference between Rust and C++. There's a vast difference in how seriously each community takes security. That's why it's meaningless to compare the number of CVEs in both languages. Even if C++ reduced its number of CVEs by 90%, it still would not be as secure as Rust, because 1 C++ CVE is not the same as 1 Rust CVE.


You're moving the goalposts so fast you could be competing with C++ for prioritizing performance over soundness.

Reminder that your original claim was:

> The problem definitely exists in C++, but it's not acknowledged as a problem, let alone fixed.

Now it's degraded to just "but there wasn't a CVE or blog post!" which isn't even that relevant to the broader argument of Herb's that all the language guarantees don't prevent logic bugs (hence how Rust was able to have this CVE in the first place). There's a point of "good enough" for the language itself.

Nobody is making any argument that CVE count is the best or optimal metric for anything


I didn't move the goalposts. I was going off my recollection at the time, which was a discussion around the Rust blog post and CVE. Here's a comment from me a day ago saying "I stand corrected" (https://news.ycombinator.com/item?id=39680754).

> Nobody is making any argument that CVE count is the best or optimal metric for anything

Except, you know, Herb Sutter in the article we're supposedly discussing. He's arguing that it's possible to compare Rust and C++ CVEs and that C++ would be just as good as Rust if the CVE counts were similar. If you had taken the trouble to address the substance of my comment, that Rust and C++ CVEs aren't comparable, rather than exulting in being technically correct because you found a mistake, you would have realised that.

You're welcome to assume bad faith of me, but that was the substance of my comment. I'm sorry I didn't keep in touch with every commit made to C++ compiler repos and I spoke out of turn. But I never moved the goalposts - those were always fixed on the issue that comparing CVEs between a language that takes security seriously and one that doesn't is a dumb idea.


> Except, you know, Herb Sutter in the article we're supposedly discussing

"Saying the quiet part out loud: CVEs are known to be an imprecise metric. We use it because it’s the metric we have, at least for security vulnerabilities, but we should use it with care. This may surprise you, as it did me, because we hear a lot about CVEs. But whenever I’ve suggested improvements for C++ and measuring “success” via a reduction in CVEs (including in this essay), security experts insist to me that CVEs aren’t a great metric to use… including the same experts who had previously quoted the 70% CVE number to me. "

-Herb Sutter

That's from the article we're discussing and you knew that because you also acknowledged that:

> The author concedes that CVEs are not a good metric to measure by, but implies that maybe C and C++ have too many CVEs that shouldn't actually be CVEs and that the C++ community should take more control of CVEs being filed. This is ominous, and makes me fear we might start to see fewer C and C++ CVEs because issues will be closed as "intended behaviour" or "won't fix", like the filesystem issue above.

So you know that Herb isn't arguing that CVEs are ideal. You instead took a wrong baseline assumption (that C++ didn't care about the FS issue) and turned it into some weird claim that the result will just be the C++ community refusing to fix or acknowledge CVEs in order to drive the count down.

You're arguing two conflicting ideas:

Point 1: C++ is bad because there's no one ensuring CVEs & blog posts are filed for issues

Point 2: C++ is bad because a central CVE would just let them hide the issues they're desperate to hide

Despite your only evidence for either of these being something you made up entirely, as you eventually reluctantly admitted.


Those points aren't contradictory. C++ is actually so terrible that

1. They're underreporting CVEs as of today. You acknowledge that a CVE should have been filed for the FS issue but it wasn't. I say it wasn't because the community has a cavalier attitude towards security and they didn't think it was worthy of a CVE. Nothing you've said contradicts this

2. Herb Sutter argues that there should be more control over what CVEs are filed because he's not happy with the ones being filed today, which may merely be bugs and not vulnerabilities. This may make the situation worse, with even fewer CVEs being filed. That's juking the stats.

Herb does acknowledge that CVEs aren't ideal, but the entire article is based on getting the number of CVEs in C++ down to a level comparable with Rust. He wouldn't have set that as a goal if he truly thought CVEs were meaningless.

By the way, I'm still waiting on the link for the fix in MSVC. Or did you give up on it because you saw steveklabnik's comment where he links to an MSVC maintainer saying there's no point fixing it in MSVC without changing the spec? (https://old.reddit.com/r/cpp/comments/151cnlc/a_safety_cultu...).

> something you made up entirely, as you eventually reluctantly admitted

I didn't make up anything, nor did I admit anything. My information was outdated, and I said I was wrong. You're accusing me of lying when I didn't, which makes it hard to interact with you.

I say only this. Don't harp on a mistake I made and admitted to, and instead address the substance of what I said.

1. CVEs between languages aren't comparable (you don't dispute this).

2. C++ community should have filed CVEs for this issue but didn't (you don't dispute this).

3. The lack of CVEs indicates that they have a much lower standard for security.


That sounds like the hallmark of a defective language to me.


Loads of very popular languages don't have formal specifications at all:

* Rust

* Python

* Ruby

* PHP

* TypeScript


Ruby is weirder than that: it has an ISO standard, but that standard is 100% irrelevant.


Funny, Matz mentioned it in his RubyKaigi 2023 keynote: it was made with the hope that it would get wider adoption in the industry, maybe even be used in technical exams. It didn't get the expected result, and he felt it was a waste of time and effort.


I'll have to track that down! This is a bit of a spicy take, but I do think of the Ruby ISO spec whenever people say that Rust needs to have one or else it won't get real adoption...


ISO - and more specifically SC22 of JTC1, which is where a programming language ISO committee would live - is a terrible place to develop anything like this. It's almost designed to be unsuitable, and the insane thing is that JTC1 wasn't even created until 1987. If you told me it was from the 1960s I'd say maybe they didn't know any better, but in 1987 the IETF was already running. WG21 (the C++ Standards Committee, in JTC1/SC22) took until the pandemic forced them to stop insisting on 100% in-person meetings to make decisions.

Here's my line in the sand: after the Mother of All Demos, if you're trying to agree things internationally and you are physically moving people to a location to do that, you're doing it wrong, but maybe you don't know that yet. After JIPS ceased to be a pilot project, not knowing you're doing it wrong means you're grossly incompetent.

Those two dates are somewhat arbitrary; I picked them deliberately, but I can entertain other nearby events for the same purpose. In particular, if you're in Asia or Africa the JIPS date seems very arbitrary; I pick it because in my opinion this outcome (the United Kingdom's tertiary education will use IP, not X.series standards) firmly means the Third Network will be the Internet, not the X.25 standard. X.25 can win if only the Americans do IP, but it can't once IP spreads, and that's what JIPS is.

I'm far from convinced Rust should have standardization via some SDO, but if it did need that then I'm sure ISO is the last place to go. ETSI isn't great, but it's already better than JTC1. If some corporates insist on an ISO document, mint the document somewhere else and just get ISO to put their stamp on it, have the corp. pay for that - but tell anybody who cares to ignore the stupid ISO document.

You (Steve) undoubtedly know that C++23 is a thing; they signed off on it mostly in 2022 and a little bit of 2023, but because ISO is awful, ISO/IEC 14882:2023 still doesn't exist: they are still working through the tedious process of agreeing to publish a document everybody settled on twelve months or more ago.

This process was fine for, I dunno, standardising how the plastic joints fit together on a water pipeline. Maybe it takes a few years to nail down exactly the text, you standardise it, nobody needs to revisit for decades at least. It's stupid for a programming language.


Similar to TypeScript, which has a standard. [1]

Last updated Jan 2016.

[1] https://javascript.xgqfrms.xyz/pdfs/TypeScript%20Language%20...


I could be mistaken but I believe Python has a formal spec, no?

https://docs.python.org/3/reference/index.html


Most languages don't even have a specification.


Yes! This is much under appreciated.

Usually, it’s just some core calculus of the language that is rigorously specified and the rest is hand waving.

There are some exceptions like JS.

But even Java has the problem that if you just implement what’s in the spec, you won’t be able to run anything meaningful unless you also do things exactly how the JDK would. You can find out what the JDK does by reading its code and writing test cases and I think that’s what folks do, if they want to be compatible.

UB and memory safety are orthogonal. If we specified formally and super rigorously that a pointer is an integer and that memory is an array of bytes, we could have a UB-free language but memory safety would still be on fire.
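
As a toy illustration of that point (the machine model here is entirely my own, not from any spec): give every operation fully defined byte-array semantics, and an overflow still silently corrupts a neighbouring "allocation".

    // Memory is just an array of bytes; "pointers" are plain integers.
    // Every operation below has fully defined behaviour, yet writing 9
    // bytes into an 8-byte "allocation" tramples its neighbour: defined
    // semantics, no memory safety.
    fn main() {
        let mut memory = [0u8; 16];
        let buf = 0usize;    // "allocation" A: bytes 0..8
        let secret = 8usize; // "allocation" B: bytes 8..16
        memory[secret] = 42;

        for i in 0..9 {
            memory[buf + i] = 0xFF; // one byte too many, all well-defined
        }
        assert_eq!(memory[secret], 0xFF); // B corrupted, and no UB anywhere
    }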


> If we specified formally and super rigorously that a pointer is an integer and that memory is an array of bytes, we could have a UB-free language

That's PVI (Provenance Via Integers) and it's a performance disaster. If anything in memory might be pointed to, almost all the nice modern optimisations aren't correct. It is really popular with a certain kind of "portable assembler" programmer, who typically has no idea how the machine actually works, nor how their language is defined, but is very confident that the nonsense they're writing ought to do what they wanted it to do.

So, the "bad" news is that you can't have this, your compiler vendor won't make it, and the "good" news is that you'd have hated it anyway which is why they won't make it.


I don't think it is the same as PVI, because i) a lot of non-determinism is still allowed and thus can be exploited, and ii) the specification would have to require only observational equivalence anyway, because every optimization would be invalid otherwise. It should definitely be possible to define a very precise machine without actually mandating PVI.


Everything is memory safe if your only datatype is uint8_t /s


Yup :-)


The issue isn't that C++'s specification has a flaw. The issue is that the same flaw resulted in no CVE from any C++ vendor. Also, most languages tend to have one de facto reference implementation, whereas C and C++ are quite unique in having 2-4 mainstream ones in regular use (and the C++ frontend is ridiculously complex). Java, Python and C# are the only other mainstream languages with a formal spec, and only Python, that I know of, maybe has alternate frontends (there are multiple runtime implementations for C# and Java, but I don't believe the language -> bytecode part is different).

JS is maybe closer on this front, but it's also quite old and a mess, and a lot of development has shifted to TS as the language for those reasons; TS has only one frontend and no formal spec.


There are at least two Java bytecode compilers. Though javac is obviously the "reference", there is also ecj. It's used primarily by IDEs and editor plugins (like Eclipse, from whence it came, and the Red Hat Java plugin for VS Code).

Still, if memory serves, there have been a handful of cases where ecj's implementation of the spec differed from javac's, with resulting fixes in javac itself (though I don't have sources at hand, so perhaps I misremember).


That is an even more grievous deficiency than an inadequate spec, but it hardly means that a bad spec is good.


> Saying it was fixed in two of the three C++ standard libraries is irrelevant, the language standard itself specifies that the behavior is undefined.

Where does it say that? Please point to the spec that says remove_dir is allowed to have TOCTOU security bugs in a system with multiple processes.


There is a comment below pointing to STL himself saying that it is https://old.reddit.com/r/cpp/comments/151cnlc/a_safety_cultu...

He doesn't cite it, but if there's anyone I'd trust to have correct information here, it's him.


The UB is actually much broader: the standard just says it's UB if other software touches files while you're also touching them, so it's basically always potential UB to use the C++ filesystem API on a multitasking system.

"A file system race is the condition that occurs when multiple threads, processes, or computers interleave access and modification of the same object within a file system. Behavior is undefined if calls to functions provided [...] introduce a file system race."


That's also UB in Rust, Java, C#, etc...

There's no language anywhere where a different process interacting with the file system at the same time isn't UB


How is it UB? The behaviour seems reasonably defined to me in my Rust, my Java, my C#. The people delivering popular implementations of the C++ standard library seemed to feel that not having UB here was a significant Quality of Implementation issue too. The ISO document on the other hand insists it's UB.


I stand corrected that the issue wasn't fixed (https://news.ycombinator.com/item?id=39680006). The issue was fixed, but no CVE by the C++ libraries. That reinforces the point that the author's attempt to equate 1 Rust CVE = 1 C++ CVE isn't valid.


> Can you find a link that substantiates your claim? You're throwing out some heavy accusations here that don't seem to match reality at all.

Relevant piece of the standard: http://eel.is/c++draft/fs.race.behavior#1.sentence-2

Officially it's undefined behavior.

And here's a comment from the maintainer of the STL in MSVC regarding this: https://old.reddit.com/r/cpp/comments/151cnlc/a_safety_cultu...


That statement really doesn't support your claim, and unsurprisingly Rust has the same basic caveat, because of course it does: it's not possible for a language to guarantee exclusive IO access to the filesystem.

https://doc.rust-lang.org/std/io/index.html#io-safety


The Rust docs say something substantively different about safety from what the C++ standard says.

The Rust docs are talking about the safety and soundness of keeping track of the fd ownership. The C++ standard says that FS races are UB.


FS races in Rust are UB as well; they're just scared to say that very explicitly. They hide behind this phrasing:

"Many I/O functions throughout the standard library are documented to indicate what various library or syscalls they are delegated to. "

And just think about it rationally for a second. The FS is implemented by the kernel, which doesn't give a damn what language you're using. So Rust cannot possibly guarantee anything about it. Everything about Rust's FS API is every bit as full of UB as C++'s, because it's a shared system with external processes - Rust can't guarantee shit. They just make you go look up what the syscalls are to find out what the guarantees (or lack thereof) actually are instead.


> Rust can't guarantee shit

And yet, the Rust community takes it seriously, issuing a CVE, a point release and a blog post. You won't find these 3 from all 3 major C++ implementations (clang, gcc and msvc), because they don't take the issue as seriously; their spec says they don't have to.

Rust can't guarantee shit but at least they do shit when there's a problem. They take accountability for their shit.

You keep harping on minor nitpicks, but you can't escape the fact that the C++ community did not take this issue as seriously as Rust did. Therefore it is meaningless to compare CVEs across the two ecosystems, which was the substance of the top level comment.


> because they don't take the issue as seriously because their spec says they don't have to.

And yet they all quickly fixed the issue and nobody tried to hide behind "the spec"

Should there have been a CVE? Probably. But there's no central C++ CVE committee to have done that, either, which, to Herb's point, there probably should be.

Should there have been a blog post? Probably not. People don't ship their own standard library, so the blog post is low value. What matters is whether or not distros/OS's picked up that fix promptly. Which unfortunately is hard to find out since most Linux distros suck ass at keeping their standard libraries up to date.


> they all quickly fixed the issue

I notice you linked the fixes for clang and gcc. Where's the link for MSVC?

> Should there have been a CVE? Probably.

Glad you agree. But the absence of it proves the point I'm trying to make - they didn't take it seriously enough.

> there's no central C++ CVE committee to have done that, either, which to Herb's point

It actually sounds like Herb wants to reduce the number of C++ CVEs that are filed, not increase them. He very specifically says that a bug shouldn't be enough; there should be a vulnerability. It sounds like he wants to achieve his goal of fewer CVEs by juking the stats, but who knows.


Even worse, in fact. While one can happily argue (non-)justifications for this till everyone's blue in the face, filesystem interfaces (in UNIX, but with a little magic sprinkle in Windows as well) have always allowed concurrent access to the same file via two different handles. Open the file twice, and use the two file descriptors - entirely "independent" objects for any language - to change the contents underneath each other. System behaviour here allows for things that "Rust as a language" does not (* - I know about interior mutability, but that's a different thing; even the data retrieved from a readonly-opened file can change if a second writeable open happened on it). In the end, programming languages and their standard runtimes depend on the behaviour of the operating system. I actually love that Rust exposes this via system-specific traits.
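
A minimal Rust sketch of that two-handle point (demo.txt is a made-up name; nothing here is special to Rust, it's just what the OS allows):

    use std::fs::{self, File, OpenOptions};
    use std::io::{Read, Write};

    fn main() -> std::io::Result<()> {
        let path = "demo.txt";
        fs::write(path, "old")?;

        let mut reader = File::open(path)?; // read-only handle
        let mut writer = OpenOptions::new().write(true).open(path)?; // second, writeable handle

        writer.write_all(b"new")?; // mutate the contents underneath the reader

        let mut buf = String::new();
        reader.read_to_string(&mut buf)?;
        assert_eq!(buf, "new"); // the "read-only" view observed the change
        fs::remove_file(path)
    }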

The "extreme" would be to go the "Oberon Way" - write the system for the language that implements the system written in that language. Maybe we'll get "somewhere there" with rust one day. Maybe not. Personally, I don't see the value in it, but mileage may vary.


You pointed at the I/O safety notice in Rust's documentation, but now you've pivoted to saying you were really talking about something different.

But wait a second, let's go look at that notice again. Rust is yet again delivering a safety feature C++ just does not have. This actually wasn't in Rust 1.0 - the safety notice didn't appear then because the I/O safety work landed much later. These safe I/O properties are really useful (even if they're not magic, hence the notice) and C++ doesn't have them.

What's particularly going on here is that on Unix, Rust has types named OwnedFd and BorrowedFd which represent file descriptors. These are quietly just a 32-bit integer with a niche: because -1 isn't a valid file descriptor, that bit pattern is free to mean None, so Option<OwnedFd> is the same size in memory as a 32-bit integer.

The result is that Rust gets to do the same trick with file descriptors as it does with pointers: a Rust program working with fds uses the same-size data structures as native C (or thin C++) working with file descriptors, but where that C or C++ would need to remember to write checks for the -1 file descriptor, in Rust that's seamless, because the -1 case is simply None in an Option<OwnedFd>.
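
That size claim is easy to check; a tiny Unix-only sketch:

    use std::mem::size_of;
    use std::os::fd::{OwnedFd, RawFd};

    // -1 is never a valid file descriptor, so that bit pattern is a
    // niche the compiler uses to encode None: no separate discriminant.
    fn main() {
        assert_eq!(size_of::<Option<OwnedFd>>(), size_of::<RawFd>()); // both 4 bytes
    }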

On Windows they don't have file descriptors, but they do have Handles. The Handles are much more muddled, so we can't optimise them as well (actually it's a wonder Windows manages to keep all these balls up in the air, they're juggled so frantically; there are a lot of references to Raymond Chen's blog), but again Rust provides I/O safety for these types.

This is the purview of a programming language and Rust does a much better job, I wouldn't have gone out of my way to call attention to it, but you did apparently because you mistook this feature (which C++ doesn't have) for a different feature which C++ is bad at.


Everything you just talked about is unrelated to the issue being discussed. This bug isn't a mismanagement of FD lifecycles, which, yes, Rust handles more safely out of the box than C++. This bug is a TOCTOU issue ( https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use ) and, more broadly, concurrent access to the file system by multiple processes. Rust cannot guarantee anything about that.

Just because you have an FD, and you know it's not -1, and you never forget to check that, that doesn't mean anything at all with respect to concurrent access by multiple processes to the underlying inode that the FD references.

Also it's very easy to make the equivalent of an OwnedFd in C++, eg https://cs.android.com/android/platform/superproject/main/+/...


I understand that the original topic was the TOCTOU bug, but again, Rust actually makes explicit promises about what happens here, which it delivers via the appropriate filesystem APIs, whereas what the ISO document tells you for C++ is just that you get Undefined Behaviour - absolutely anything might happen. The popular implementations do more, but that's not what the standard says.

I mentioned what I/O safety really is because you've stumbled onto it while scrabbling to justify the belief that Rust also has UB for the TOCTOU bug.

And while Android's ScopedFd is more or less what you'd do in C++ it has a number of significant differences, which I'd say are disadvantages:

1: Most importantly OwnedFd is part of Rust's standard library.

2: ScopedFd insists on behaving like an integer, which is very typical in C++ but not in Rust. If we want a File Descriptor, who cares that those are "actually" integers? Thus we can compare a ScopedFd to an integer - is this ScopedFd more than ten?

3: And so ScopedFd can represent an invalid file descriptor. We don't own that of course, and it's not really a file descriptor at all, but we can (and Android does) represent it anyway. OwnedFd only represents valid file descriptors; an Option<OwnedFd> represents the wider category of either a valid file descriptor or nothing, if that's what you meant.

4: Android provides a borrowing mechanic here, but of course doesn't have a borrow checker, so again you don't actually get the safety benefit of BorrowedFd.


Not to mention the weird conclusion that since no language has 0, that isn't the goal. I'm not sure I understand the logic that you shouldn't at least _try_ to not have any major security flaws. My interpretation of the CVE counts he mentions is that if your goal is zero, you might end up still having a few, but if your goal is just "few enough that people stop complaining about us being worse than other languages in a similar niche", you probably will end up not hitting that threshold either, which seems like a plausible explanation for how C++ is in this situation in the first place. The fact that he brings up C a bunch also seems like it could be related to this; it sort of feels like he's focusing too much on the idea of security as a competition between languages rather than something that's inherently worthwhile as a goal in its own right.


The point is that trying to hit 0 would be a huge breakage that would fragment the language, and there's no actual evidence to suggest such a thing is actually necessary.

It's like your front door's lock. It doesn't have to be unpickable, unshimmable, absolutely secure. It just has to put up more of a fight than a brick through the window.


I think the problem with this analogy for me is that while you might not care to have your home be 100% impossible to break into, you _do_ want to aim for it to be broken into zero times a year. If my house got broken into 60 times a year for five years, and my neighbors got broken into 6 times a year for the same period, I still think my long-term goal would be to have my home _never_ broken into, even if it meant moving to a different neighborhood.


Right, but security bugs don't all come from the language. They also come from logic errors. That's the "window" analogy here. If the language (door) is no longer the way people are getting into your house, then that's job done for addressing the language and you need other tools to focus on other issues that now are the problem.


right, so what you decide to do is quit your job and never leave the house so no one ever has the opportunity to break into your house.

Does the cost of doing so justify being 100% secure?

most people would say no.


Quit my job? I'm remote bb BD

The best of both worlds: performance and security. Brought to you by Rust. (Even though I actually write C++ from home...)


ok that's a scenario I didn't fully consider, lmao.

but humor aside, the point stands. safety/security is about tradeoffs.


> safety/security is about tradeoffs

I don't disagree with this, but I'm struggling to understand how aiming for zero CVEs would somehow be too onerous a tradeoff when six is reasonable. Assuming that nobody wants to have any CVEs in their codebase, the idea that ending up with six is reasonable but aiming for zero is preposterous sounds like another way of saying "it's easy to accidentally miss six future CVEs in your codebase". If that's the case, how can you have any degree of confidence that by aiming for six, you won't end up with 12 instead?


there's a reason people say things like "actions speak louder than words".

It's easy to say "safety is about tradeoffs", but when you follow it up with an insistence that no tradeoffs should be made, it kind of makes it seem like you're just saying that to appear reasonable rather than actually being reasonable.


Yep yep. A better analogy might be that the door has to eliminate quick, quiet entry to the house since that would deter a huge majority of would-be burglars just due to the vastly increased danger of getting noticed.

We have locks on windows, but rarely bars. We have alarms that trigger sirens, but rarely indoor locks. Eliminate quick and quiet, usually good enough.


Or, to stay within the metaphor, more of a fight than the door of your neighbor.


Yeah.

What I like about the 100% memory safe goal is that it’s a falsifiable goal.

Even the Rust style goal of “you’re memory safe if you do 100% rust and never use unsafe” has the nice property of being falsifiable.

98% memory safe is not a falsifiable goal. It gives the C++ designers the option of never actually fixing the problem while claiming they have, by picking a sloppy way of measuring the "%".


> Not to mention the weird conclusion that since no language has 0, that isn't the goal. I'm not sure I understand the logic that you shouldn't at least _try_ to not have any major security flaws.

He addressed that: the cost of getting to 0 would be too great (C++ would have to break backwards compatibility), so we should try to be in line with other languages instead.

I don't understand why you're acting as if he didn't make the point he made.


> He addressed that, the cost of making it to 0 would be too great (C++ would have to break backwards compatibility) so we should try and be inline with other languages instead.

> I don't understand why you're acting as if he didn't make the point he made.

My confusion is that I'd expect breaking backwards compatibility to either be completely off the table or for the amount of breakage allowed to be up for debate. If you're not willing to break compatibility at all, I feel like the goal should be to shoot as low as possible without breaking anything; if it's possible to get as low as other languages, why stop there? If you're willing to sacrifice some backwards compatibility, why not be willing to break it a little more to eliminate the last few sources of unsafety?


It's not clear to me that you read or understood the article; all of your posts certainly feel as if you didn't.

He explained why 0 isn't the goal, you continue to act as if he didn't. I don't know where else this conversation can go without you going back and better understanding his actual point.


If the discussion requires that I find his explanation convincing rather than being able to think that it's not sufficient, then yeah, I guess there's nowhere else for it to go.


The CVE/CVSS system lacks the ability to deal with soundness issues.

If the language or a library promises to catch a mistake, and it doesn’t, that’s not automatically exploitable unless the programmer has actually made that mistake. If there wasn’t a promise in the first place, there would be nothing to report.

Unfortunately, CVE can’t see the difference between reporting a bug, or reporting a theoretical possibility of having a bug that never actually happened.


Frankly, I don't think industry is going to allow soundness issues to ever achieve parity of importance with what we think of as CVEs. There's simply too much code and mindshare around languages that don't worry as much about soundness.

I'm not saying it is right, just that concerns around soundness will likely be hand-waved away. And it is to everyone's detriment.


Or you can flip it around and ask why there isn't a body in the C and C++ community funding tangible advancements in the security and safety problem space the way the Python Foundation and Rust Foundation are.


I don't know about this. There are a lot of people being paid by their companies to work on the C++ committee and on various compiler teams. The author of the post we're discussing is a member of the committee and this article is an attempt to improve the security situation in C++.

So funding isn't the issue. The issue is that the committee has a firm commitment to never introducing breaking changes. This commitment is so firm that it trumps literally any other interest, like making the language memory safe. That's why the author only suggests non-breaking changes in this article.

Lots of people are going to have opinions on whether this approach is the right one for C++'s long term success, but I think we'll only know in 5-10 years.


This:

> So funding isn't the issue.

does not follow from:

> There are a lot of people being paid by their companies to work on the C++ committee and on various compiler teams. The author of the post we're discussing is a member of the committee and this article is an attempt to improve the security situation in C++.

The people working on the C++ committee are mostly working on their own time. Specific projects directly funded by companies are actually quite rare, and those mostly focus on companies' very immediate needs.

If someone wanted to commit, let's say, $25 million over 5 years, I am sure that both the C++ standard and the major implementations would make a large jump in terms of safety.

> The issue is that the committee has a firm commitment to never introducing breaking changes. This commitment is so firm that it trumps literally any other interest, like making the language memory safe.

Yes, C++ and its committee have a very strong commitment to backward compatibility. However, that's not the reason for not wanting to make C++ memory safe: from what I understand, the committee decided that the tradeoffs are not worth the gain, and as Herb explains in this article, between tooling and reasonable defaults, it's possible to achieve a pretty good level of safety in practice.

> That's why the author only suggests non-breaking changes in this article.

Just to repeat my point here: no, Herb is suggesting non-breaking changes because of the aforementioned commitment to back-compat. Even if breaking changes were to be introduced, they would most likely NOT be to make C++ a memory safe language wholesale.


> as Herb explains in this article, between tooling and reasonable defaults, it's possible to achieve a pretty good level of safety in practice.

This is the main contention. For those who believe in the technical leadership of the committee, this feels like a reasonable way forward.

I'm sure in theory these issues can be tackled; it's just that in practice the C++ community has always chosen performance over any other concern (https://research.swtch.com/ub). In that context, where you can change the APIs but you can't change how a whole community behaves, these articles by Sutter and Stroustrup feel like a Hail Mary play to address the valid concerns raised by multiple organisations around memory safety. I think we'll find out in 5 years if their optimism was well founded.


> it's just that in practice the C++ community has always chosen performance over any other concern (https://research.swtch.com/ub).

I don't know how much credence to give to this idea. It seems to me this specific critique always comes from (designers of) languages with very different goals in mind.

If you ask a die-hard C fan, they will point to things like exception handling, constructors, and even type conversions as examples where C++ is not making design decisions purely on performance. It seems that from its inception, modelling power and type safety were paramount to C++'s design, right alongside performance.

In particular here, it's hard for me not to read the article you linked as simply saying that the author doesn't like the compromise that the C++ committee has chosen... Nothing really objective.

> you can't change how a whole community behaves, these articles by Sutter and Stroustrup feel like a Hail Mary play to address the valid concerns raised by multiple organisations around memory safety.

Maybe I have a different read of the situation. But we are talking about the same community which produced C++11, introducing a memory model, threading semantics and lambda functions, and which since then has produced pretty significant advances in the language every 3 years. The C++ community is more deliberate in its approach to solving issues, taking longer to make sure that the proposed solution actually addresses the correct problem.

> I think we'll find out in 5 years if their optimism was well founded.

True.


> That's why the author only suggests non-breaking changes in this article.

One of the proposals in the article is to change the meaning of things like "if (a != b > c)" and "if (0 <= index < max)". That changes program behavior and therefore is a breaking change (for instance, the code might accidentally be depending on the "wrong" results of these odd comparisons, and "fixing" them makes it go through an untested path which does the wrong thing).
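
For readers unsure what the current meaning is: the first comparison collapses to a bool before the second one runs. Rust refuses to chain comparisons at all, so here is the C++ parse emulated with explicit casts (function name and values are mine, purely for illustration):

    // What C++ computes today for "0 <= index < max": (0 <= index)
    // yields a bool (0 or 1), and that 0-or-1 is what gets compared
    // against max.
    fn cpp_style(index: i32, max: i32) -> bool {
        ((0 <= index) as i32) < max
    }

    fn main() {
        assert!(cpp_style(1_000_000, 10)); // "in range"?! (1 < 10)
        assert!(cpp_style(-5, 10));        // also "in range" (0 < 10)
        assert!(!(0 <= -5 && -5 < 10));    // the check the author meant
    }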


It's ok if the current behaviour is changed into a compiler error


I agree; while changing it into a compiler error could be considered "breaking" (it no longer compiles), it's not a silent break and forces the developer to fix the code. (I would only worry about developers doing the "obvious" fix to shut the compiler up without looking at the surrounding code to see if it was masking some other bug.)


That feels imprecise to the point of rendering the entire conversation useless. If every major C++ compiler shipped a copy of `rustc` renamed to `g++` or `clang++` or whatever, that would also make every breaking change a compiler error, but I don't think that's what anybody is talking about here.


On the contrary, it's the absurd claim that making "if (a != b > c)" a compiler error is somehow remotely comparable to changing C++'s syntax to be the same language as Rust that renders the entire conversation useless, and that is not what anybody is talking about here.


> On the contrary, it's the absurd claim that making "if (a != b > c)" a compiler error is somehow remotely comparable to changing C++'s syntax to be the same language as Rust that renders the entire conversation useless, and that is not what anybody is talking about here.

My point is that "it's okay to turn something into a compiler error" is vague, and I don't know where the line is actually drawn. I don't think the boundary between acceptable and unacceptable breakage is obvious, which is why I explicitly gave an example that I knew for sure was outside it. I don't think the idea that reasonable people might disagree about what level of breakage is acceptable is particularly radical, so I think it's worth not immediately assuming I'm participating in bad faith.


This is the sort of ideological thinking that holds back progress and it's something people have complained about in particular with the C++ standardization process.

You always have these purists who think that because a solution doesn't solve the problem for every single use case, we can't put forth solutions that solve the problem for 90% of use cases. The entire article that Herb Sutter is writing is really a push to fix the 90% of safety problems in C++ without coming up with an ideologically pure solution that tries to solve all of C++'s safety problems.

If someone puts forth a proposal that a fairly awkward, easily misunderstood, and error-prone expression like "bool > bool" should produce a compiler error, and your response is that if we do that, we may as well just change all of C++'s syntax so that one could rename rustc to g++ and it just works, then you are participating in bad faith as opposed to presenting a sensible argument that reasonable people can actually discuss and use to make some kind of meaningful progress.


> If someone puts forth a proposal that a fairly awkward, easily misunderstood, and error-prone expression like "bool > bool" should produce a compiler error, and your response is that if we do that, we may as well just change all of C++'s syntax so that one could rename rustc to g++ and it just works

My first comment gave an example followed by saying "I don't think that's what anyone here is talking about", and my most recent response reiterated that I considered my example as explicitly being outside the boundary of what anyone would consider acceptable. It feels like you're going to great lengths to present it as something I actually recommended when I've been quite clear that I don't think it's anywhere close to reasonable. I've been quite clear that I'm not proposing anything; on the contrary, I'm _asking_ how to decide whether a breaking change is worth it, because I'm not at all an expert in C++ and I don't pretend to be.

The only "ideological thinking that holds back progress" due from "purists" going on here is your insistence that I should be disqualified from asking questions because I happened to try to use a hypothetical example that you didn't like.


Why would you go from the proposal in the article that involved making a confusing expression like "bool > bool" a compiler error, to the absurd example of suggesting that C++'s syntax be changed to be identical to Rust?

How could talking that way possibly be conducive to having a serious discussion about the article, which is trying to eliminate 90% of the safety issues in C++?

If that is the manner in which you think a reasonable discussion can be had on this issue, then no, you are not qualified to discuss it, and your participation has done nothing and continues to do nothing but derail the topic, which is likely why no one has bothered responding to you.


that's how you get companies to stop upgrading and eventually end up sitting on a 20 y/o version of C++.

2nd and 3rd order thinking is a thing.


If a company won't update because it really needs to depend on whether one bool compares greater than another, then by all means they can stick to using a 20-year-old version of C++.

Other companies with modern engineering disciplines that don't write hacky code like that can benefit from a sane and sensible compiler instead of being dragged down.


If you can't understand how the expense of doing that may be onerous on a business then you shouldn't be let anywhere near decision making.


Too late for that, I am in a decision making position at a quant firm with very strict engineering standards and I absolutely stand by my decision that businesses that write code that compares booleans together like that should not be in a position to hold back other businesses that don't.

They can continue using 20-year-old compilers and quit making the language worse for the rest of us who have put in the effort and cost of writing modern software.


It's always easy to make a decision when you're not the one paying the cost for it, or don't imagine you will be.

In fact, one of the red flags for decision makers is the inability to understand the above tenet.


I agree, asking everyone else to pay the cost of writing error prone code because they refuse to adapt but yet feel entitled to use new compilers is a big red flag and poor technical decision making that offloads the cost on the rest of the community.

I'm glad we managed to get that out of the way.

Companies that wish to stick with their existing and deprecated coding standards can stick to their existing and deprecated compilers, allowing those of us who wish to have safe and modern tools the freedom to make progress without their baggage holding us back.


oh snap guys, do you see what he did there in his parley? The way he took my point and pretended I was saying something else and that I really agreed with him. That technique so got me that he won!

This is most definitely the paragon that should be helping us decide which large swathe of people to fuck over.


The example given here is a different interpretation of a statement.

The source would stay the same for correct and incorrect use; there is no way for the compiler to catch this and send an error message.


It's certainly possible to make an expression of the form "bool > bool" a compiler error and require that developers rewrite it as "x && !y"
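
And the rewrite is semantics-preserving; a quick exhaustive check (sketched in Rust, where bool happens to be orderable too):

    // For booleans, x > y holds exactly when x is true and y is false.
    fn main() {
        for x in [false, true] {
            for y in [false, true] {
                assert_eq!(x > y, x && !y);
            }
        }
    }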


But they do break stuff.

I think the commitment is more that you can't do anything to the language that would force some compiler hacker who also serves on the committee to remove their pet optimization, regardless of whether that optimization is worth much (or anything).

Goals and messaging matter. I like that the Rust community aims for safety as a P1 goal. C doesn’t, so C doesn’t get it.


> I don't know about this. There are a lot of people being paid by their companies to work on the C++ committee and on various compiler teams.

The standards document isn't the same as a C++ implementation. The implementations are actually behind the standards, at least given current contribution levels.

There are relatively few people making significant contributions to C and C++ compilers. Funding for middle- and backend compiler features is a lot easier to justify because it looks like a straightforward optimization payoff.

A lot of the effort put into C++ implementation work goes to keeping up with the pile of features added to the C and C++ standards every three years or so.


His 98% figure sounds about right, doesn’t it? Eliminate 98% of C++ CVEs and that’s enough to compete with memory safe languages.


Is aiming for exact parity the best way to achieve that, though? I'm skeptical that memory-safe languages have 98% fewer CVEs than C++ because their goal was "have 2% of the CVEs that C++ has" and they succeeded, rather than "try to have no CVEs whatsoever" and they fell short by a small amount.


The most recent data I could find, from 2019, shows that 17% of CVEs are in PHP code, 12% in Java, and 11% in JavaScript - all memory-safe languages.

Memory safety bugs are only one class of security vulnerabilities, and there's nothing magical about memory safeness that causes a developer using that language to suddenly become expert at writing secure code.

For that matter, some vulnerabilities actually seem much more common in memory-safe languages - try searching the CVE database for "SQL injection c++" (287 results) vs javascript (3420), Java (2160), or PHP (11300).


Except the CVEs in Rust are far more likely to be low severity issues that C++ would never acknowledge to begin with.


It would still be worthwhile to greatly reduce the number of vulnerabilities coming out of new C and C++ code, which are likely to be with us for a long time still. At the very least as updates/fixes to existing codebases.


Yes, no doubt - reducing the number of vulnerabilities is a good thing. What I'm worried about is that they merely reduce the number of CVEs, and call it a win for their safety initiative. It becomes a PR exercise more than technological improvement.


Well, pay attention and hold them accountable.

But Microsoft (for instance) certainly has incentives to avoid being the next Boeing or Volkswagen with respect to being excellent box checkers that end up missing the mark on the outcomes those checkboxes are supposed to protect against. It doesn't matter if C and C++ have fewer CVEs as such if Microsoft tools and platforms gain a reputation as being insecure or unsafe.


Sounds about right, because he is using the number of eliminated CVEs, not the number of remaining CVEs. Compare 98% with 99.8%: even if we had been tracking all possible CVEs, 98% elimination leaves 10x more remaining CVEs than 99.8% elimination. (Of course both are much better than 0% elimination, i.e. the status quo.) I feel the true figure would be somewhere between 99.8% and 99.98%, especially when a massive undercount from existing C/C++ projects is accounted for.


He addresses this point in the article as well, discussing the risk that the language community gets too fixated on one kind of safety while attackers shift to other concerns like supply chain, code injection, or leaked credentials.


That sounds like a situation I'd describe as "a good problem to have". Almost completely eliminating a common attack vector and having to figure out how to pivot to stamp out another type is far better than not making any significant progress at all. If the concern were that they were already making progress on some of those other security concerns and that switching focus would risk gains they expect to make soon in those areas, I think that would be reasonable, but that doesn't seem to be the case. Is there any evidence that this concern is anything other than hypothetical? It comes across more as an attempt to justify not spending effort improving memory safety rather than something that anyone is actually concerned about.


Of course, but memory unsafety is one of the biggest enablers of security vulnerabilities. Attackers shifted to other concerns when PHP became widespread enough, for example, but PHP alone cannot be used to create a vulnerability common in C/C++ because of its memory safety (which is not even that good!); you need some other systems to combine multiple issues into a single coherent attack, and it's likely that C/C++'s memory unsafety has played a big role somewhere in between. He doesn't fully acknowledge this multiplicative aspect of memory (un)safety.


That seems far fetched to me. The vast majority of zero-days are still memory safety issues, and it would be an absolute miracle if the C++ community can get that ship turned around in under 20 years. Even 50 seems unreasonable honestly.


But how do you know that the language has eliminated 98% of vulnerabilities?

98% of what? CVEs? Something else?


Absolutely not. Software that has 98% fewer memory safety issues is…still exploited. People just look harder and the costs go up a bit.


No, there is in fact a qualitative difference between a program where the expected number of CVEs is 1, and one where the expected number of CVEs is 0.02.


Yes, there are fewer CVEs. So?


Are you being purposely dense?

If the mean number of CVEs is low enough, some proportion of software has 0 exploitable flaws, and is invulnerable regardless of how much attackers spend.
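
To put rough numbers on that (assuming, purely as my own toy model, that exploitable-flaw counts are Poisson-distributed with mean m, so the zero-flaw share is e^-m):

    fn main() {
        for m in [1.0f64, 0.02] {
            // share of programs with zero exploitable flaws under the model
            println!("mean {m}: {:.1}% have zero flaws", 100.0 * (-m).exp());
        }
    }

That prints roughly 36.8% for a mean of 1 and 98.0% for a mean of 0.02: a 50x drop in the mean moves you from "most software is exploitable" to "almost none is".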


I consider that most software people use is sufficiently complex that it will not fall in this bucket.


What if you normalize for the amount of code in the wild? Relatively little software I use day to day has significant amounts of Rust in it (only Firefox and my phone OS, probably...). Even in Firefox, there is far more C++ than Rust...


You’d also have to define what “use” means. Your traffic that goes through Cloudflare hits Rust code: does that count as a “use”? Sites you use that rely on AWS almost certainly hits Rust code, does that count as a “use”?

I think it’s kind of stretching it, but these are services where issues in them could impact you, so I don’t think it’s a total non sequitur.


This is true - and it also misses one of the points Herb Sutter's article is making. I don't disagree at all with what seems to be the general sentiment here about the importance of memory safety. I also, though, don't disagree with Herb Sutter that there are other safety-relevant aspects in both programming and software deployment which aren't helped/prevented "merely" by using memsafe languages.

Say, the "typical" rust laziness ... just unwrap() because well we know for sure it can't possibly be None, right ? Do that in a form of crit code path, and while that may not open you to an exploit, it'll still down your service and damn you to a crashloop.

Yes, we should be using memsafe languages. Yes, we should be a little humble about the bugs we may create. As important as it is to entirely eliminate one "critical" class, it is just as important to realize that even with it gone, bugs/issues/security problems will remain.


> As important as it is to entirely eliminate one "critical" class, as important it is to realize even with that gone, bugs/issues/security problems will remain.

Sure. Nobody believes that memory safety is the sole security issue. Or at least, no serious people, or any of the organizations doing advocacy around this issue.


And just to be clear, in C++ a "bad unwrap" could become an exploitable gadget enabling arbitrary code execution, bypassed security checks, or literally any other really serious issue. Also, the same issue would show up as any one of N failures and might even be missed.

In Rust it would manifest as a DoS attack vector, and the issue would be blamed on the bad unwrap 100% of the time.

So in the C++ case the possible failure modes are arbitrarily serious vulnerabilities with poor observability that are difficult to find. In Rust that failure mode is a mild, generally non-exploitable vulnerability* that fails in the exact same way 100% of the time, making monitoring & detection trivial.

* Yes, it could be a single stage of an exploit where you take down 1 service which opens up another vulnerability. But that's still a more expensive exploit (in terms of $ to discover) than if you had this problem in C++.


> Say, the "typical" rust laziness ... just unwrap() because well we know for sure it can't possibly be None, right ?

Which is safe.

> Do that in a form of crit code path, and while that may not open you to an exploit,

Oh, so you realize that it is safe.

> it'll still down your service and damn you to a crashloop.

So there isn’t an argument here.

Next.


"memory-safe" != "safe".

Make a fair assessment of what it means to your app / service when you have a, however controlled/contained, reproducible exit that is unexpected by the developer. Or what it means to use a hardcoded default (unwrap_or). Or what it means to pass up an Err via "?". Or to map an Err to None.
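
A minimal sketch of those three options; ConfigError and read_port are hypothetical stand-ins:

  struct ConfigError;

  fn read_port() -> Result<u16, ConfigError> {
      Err(ConfigError) // pretend the config file is missing
  }

  fn port_or_default() -> u16 {
      read_port().unwrap_or(8080) // hardcoded fallback: runs, but on which port?
  }

  fn port_propagated() -> Result<u16, ConfigError> {
      Ok(read_port()?) // `?` pushes the decision to the caller
  }

  fn port_maybe() -> Option<u16> {
      read_port().ok() // Err becomes None; the cause is discarded
  }

  fn main() {
      println!("{} {:?}", port_or_default(), port_maybe());
      let _ = port_propagated();
  }

Each choice is "safe" in the memory sense, yet each silently changes what your service does under failure.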

My argument is simple: the memory safety of Rust is no reason to become arrogant as a programmer. It should, in fact, make you more humble -- think of what you learned about your own typical mistakes as you learned to write Rust, tackled the borrow checker, and decoded clippy's extensive litany of sins in your source. And consider that Rust, as a prime example, exists because people learned from mistakes: namely, from those design flaws in C/C++.

Assuming you make mistakes as well is likely to turn you into a better person. Definitely into a better programmer. No matter which language you use. Hopefully Rust (on that I fully agree).


I will try to be more humble. Thank you.


A DoS attack is still a security vulnerability, but as I describe above, it's a better failure mode than the one you get with C++.


> it'll still down your service and damn you to a crashloop.

Nope. Your service will be structured such that it is subdivided into tasks, each task being wrapped in a `catch_unwind`[1], such that a panic merely kills the task it occurs in, not your service.

[1]: https://doc.rust-lang.org/std/panic/fn.catch_unwind.html
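
A minimal sketch of that structure, with a hypothetical request handler:

  use std::panic::{self, AssertUnwindSafe};

  // Imagine a buried unwrap() somewhere inside a task.
  fn handle(input: &str) -> String {
      input.parse::<u32>().unwrap().to_string()
  }

  fn serve(input: &str) -> String {
      // The panic is contained to this task instead of taking
      // the whole service down with it.
      panic::catch_unwind(AssertUnwindSafe(|| handle(input)))
          .unwrap_or_else(|_| "500 internal error".to_string())
  }

  fn main() {
      println!("{}", serve("42"));   // "42"
      println!("{}", serve("oops")); // "500 internal error"
  }

(Runtimes like tokio similarly contain a panic in a spawned task, surfacing it as a JoinError.)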


that's totally fair, certainly there are web servers I use that are probably using rust somewhere, but I suspect it's still a relatively small amount of code (even if widely used!).


> Take for example CVE-2022-21658 (https://blog.rust-lang.org/2022/01/20/cve-2022-21658.html) in Rust, related to a filesystem API. It's true, this was a CVE in Rust and not a CVE in C++, but only because C++ doesn't regard the issue as a problem at all.

That is just plain wrong. Simply wrong. And I hope it is not a deliberate lie.

The C++ community acknowledged the issue as soon as the Rust team posted the problem, and a fix is already deployed in the major compilers [^1] [^2]

It does not have a CVE associated since the issue was spotted within the Rust stdlib first.

This is the exact kind of FUD and zealotry that makes people hate the Rust community. I wish the community would mature a bit on this front.

[^1]: https://github.com/gcc-mirror/gcc/commit/ebf6175464768983a2d...

[^2]: https://github.com/llvm/llvm-project/commit/4f67a909902d8ab9...


> It does not have a CVE associated since the issue was spotted within Rust stdlib first.

I don't see why this is true. Are you saying that people with affected code would have seen a Rust CVE and then updated their C++ toolchains? There seems to be no reason this shouldn't have been a C++ CVE other than the fact that the C++ community has different standards for what constitutes safety. The lack of a CVE associated with the fixes you pointed out supports the original assertion rather than refuting it.

In fact, I'll tell you why there was no CVE for C++ - concurrent access to filesystem APIs is undefined behaviour in C++ (https://en.cppreference.com/w/cpp/filesystem).

Reasonable people can disagree on this though, so I can see where you're coming from. There's no reason to immediately fling around accusations of lying and zealotry. It makes the writer look immature.


how would you define concurrent access to a filesystem?

That's a serious question: if I open a file for reading and another process writes to it, exactly how is the C++ standard supposed to protect against that?


> There seems to be no reason this shouldn't have been a C++ CVE other than the fact that C++ community has different standards for what constitutes safety

A security report was filed for both compilers and action was taken in both major toolchains. This is a sign of the mature security processes used by both of the major C++ compiler implementations.

CVEs are one among many ways to address security vulnerabilities, and one that is currently under heavy criticism [^1]

> In fact, I'll tell you why there was no CVE for C++ - concurrent access to filesystem APIs is undefined behaviour in C++

The vulnerability reported (even in the case of the Rust CVE) has nothing to do with concurrent usage of the API. I think you do not really know what you are talking about here.

> Reasonable people can disagree on this though, so I can see where you're coming from. There's no reason to immediately fling around accusations of lying and zealotry. It makes the writer look immature.

My definition of immaturity includes throwing false statements around the internet, getting them refuted with sources and quotes included, and still standing by them. That is a sign of immaturity.

[^1]: https://portswigger.net/daily-swig/cvss-system-criticized-fo...


The issue falls squarely under that undefined behaviour. The problem is a race condition involving symlinks -- a race condition implies a change, which is concurrent access, and the C++ standard clearly states any change to the filesystem by another program leads to undefined behaviour.

Sure, the "concurrent access" is another program, but the standard says another program changing an object you are accessing leads to undefined behaviour -- which makes writing basically any C++ program that does filesystem access on a modern OS with other programs running completely impossible according to the letter of the standard, so I'm really not sure why it's written that way.


No it isn't, it's just a bug and it was fixed, just like it was for Rust. Nobody hid behind UB for this, and the time from reporting the issue to fixing it was about 2 weeks for both libstdc++ and libc++.

https://bugs.chromium.org/p/llvm/issues/detail?id=19

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104161

Absolutely zero mention or attempted defense of "hurr durr but UB says we can do this!!!"


C++ is the standard. Not the implementations. The bug is fixed in the implementations, but remains in C++.


Where is the bug in the C++ standard? std::remove_all states:

> 1,2) The file or empty directory identified by the path p is deleted as if by the POSIX remove. Symlinks are not followed (symlink is removed, not its target).

> 3,4) Deletes the contents of p (if it is a directory) and the contents of all its subdirectories, recursively, then deletes p itself as if by repeatedly applying the POSIX remove. Symlinks are not followed (symlink is removed, not its target).

Nowhere in there does it say "lol but FUCK YOU if you're on a multiprocessing system lololololol" or anything remotely close to that.


> 29.11.2.3 File system race behavior [fs.race.behavior]

> 1 A file system race is the condition that occurs when multiple threads, processes, or computers interleave access and modification of the same object within a file system. Behavior is undefined if calls to functions provided by subclause [fs.race.behavior] introduce a file system race.

> 2 If the possibility of a file system race would make it unreliable for a program to test for a precondition before calling a function described herein, Preconditions: is not specified for the function. [ Note: As a design practice, preconditions are not specified when it is unreasonable for a program to detect them prior to calling the function. — end note ]

There you go. "Behavior is undefined" is essentially "fuck you, you won't get any error you'll just get garbage at runtime". It does thankfully allow the implementations to make it an error at runtime (as they did), but does not require it, so the standard still has the bug.


The answer to that is here: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n40...

It is mainly wording around specifying that the result of a concurrent access cannot be guaranteed. Rust is no different here; it just does not have a specification for its stdlib (yet).


I disagree. While Rust doesn't have a formal specification, its developers would consider any crash in safe code caused by parallel filesystem access to be unacceptable, while for years the C++ committee has been happy to say "You fool, you invoked undefined behaviour. Game over". I don't see any evidence from looking at the standard that this bit of undefined behaviour is somehow "less undefined" than any other bit of undefined behaviour.


> I think you do not really know what you are talking about here.

Yeah, maybe.

The reference says

> The behavior is undefined if the calls to functions in this library introduce a file system race, that is, when multiple threads, processes, or computers interleave access and modification to the same object in a file system.

My understanding is that the vulnerability happens because a different process interleaves access to the same object in the file system between the time of check (TOC) and the time of use (TOU), leading to the TOCTOU bug. And this isn't a vulnerability in C++ because it's considered UB by the spec.

I'd be really interested in hearing you explain why this section of the spec isn't relevant to the TOCTOU bug. I'd learn something new if that's the case.

I know you're frustrated by the discussion of CVEs but please understand that the entire article and this thread is based on understanding C and C++'s security record based on CVEs. Herb Sutter compared C++ with Rust, pointing out 61 CVEs in C++ and 6 in Rust. Therefore it's really relevant to compare the attitudes of both communities about filing those CVEs. The fact that the exact same vulnerability warranted a Rust CVE but not a C++ CVE is quite telling.

I'm not happy with the CVE system, no one is. You're not either. All I'm saying is, comparisons of CVEs across languages, like Sutter has done, aren't helpful or useful. He claims that C++ will be equally safe as Rust if CVEs were reduced by 90%. But as long as the C++ community doesn't file CVEs when Rust does, that's not correct. And it's likely to be more incorrect if the community follows through with Herb's suggestion to take control of the CVE filing mechanism.


> My understanding of the vulnerability happens because a different process interleaves access to the same object in the file system between the time of check (TOC) and time of use (TOU) leading to the TOCTOU bug. And this isn't a vulnerability in C++ because it's considered UB by the spec.

The definition as Undefined Behaviour in the spec is quite unfortunate and a mistake. We do agree on that.

> I'm not happy with the CVE system, no one is. You're not either. All I'm saying is, comparisons of CVEs across languages, like Sutter has done, aren't helpful or useful.

On this I do agree too. And saying that a lot of communities have things to learn from the Rust core team about security issue handling is a completely fair statement.

What infuriated me was your point:

> but only because C++ doesn't regard the issue as a problem at all. The problem definitely exists in C++, but it's not acknowledged as a problem, let alone fixed

Which tends to imply that the C++ community took no action regarding this exact problem. When, in fact, it is already patched and released in all major implementations (exactly as Rust did).


Yeah you're right, I originally said no action was taken because I was going off of my recollection of the original issue 2 years ago. When Rust released this blog post (Jan 20th 2022) the same issue had been fixed already in Python. When I asked around about C++, people pasted the reference link and said it didn't need to be fixed because it was defined as UB. I stand corrected, they did fix it. And good on them for fixing it and not closing it as "spec says its fine".

It's still bad they didn't file a CVE for it, knowing that other languages did so. It reduces trust in their ecosystem.


> It's still bad they didn't file a CVE for it, knowing that other languages did so. It reduces trust in their ecosystem.

Curiosity question: do you know if Python also has a CVE for this exact same problem? I am not able to find it in their git history.


In my understanding, no. I believe it was bpo-4489 [1], and I couldn't find a matching advisory from the PSF's database [2] which should contain most historical advisories as well (it does seem to miss earliest advisories like PSF-2005-001 and PSF-2006-001 though).

[1] https://github.com/python/cpython/issues/48739

[2] https://github.com/psf/advisory-database/


I was pointed multiple times by people to the C++ standard, which clearly states (when introducing the filesystem library):

"The behavior is undefined if the calls to functions in this library introduce a file system race, that is, when multiple threads, processes, or computers interleave access and modification to the same object in a file system."

and was told that made this bug not a compiler issue, but just undefined behaviour, exactly as if you'd written an array out of bounds or dereferenced an invalid pointer; the compiler can do anything it likes if another program changes the filesystem while your program runs.


> The behavior is undefined if the calls to functions in this library introduce a file system race, that is, when multiple threads, processes, or computers interleave access and modification to the same object in a file system.

It is a safe bet that this was added for portability reasons. The POSIX atomicity guarantees on file operations are not provided on every system.

The facts are: when this issue came up, it was treated as it should have been. This is, once again, a sign of mature security processes and behaviour on the part of the compiler implementers.


> It does not have a CVE associated since the issue was spotted within Rust stdlib first.

That is blatant nonsense. Even if the vulnerability is similar or identical, CVEs are submitted for every affected project. If that were not the case, there would not have been a bounds-check CVE in the last 35 years. The only situation in which that might not happen is if the vulnerability is in an upstream library, but even then you often get a CVE in both upstream and downstream (or a CVE shared between multiple products), e.g. the libwebp 0-day from late 2023 got a CVE for Apple’s various OSes (two in fact) and a shared CVE for libwebp and Chrome; Mozilla used that as their upstream CVE when emitting a security advisory.

CVE-2022-21658 only covers the Rust standard library, which is not upstream of either libc++ or libstdc++, and neither fix references it anyway.

The GP might have gone a hair too far in saying that C++ “does not consider it a problem at all”, but they’re correct that C++ compiler/stdlib maintainers do not consider it a vulnerability.


> The GP might have gone a hair too far in saying that C++ “does not consider it a problem at all”, but they’re correct that C++ compiler/stdlib maintainers do not consider it a vulnerability.

No this is also just plain wrong.

It was reported to both compilers through channels dedicated to reporting security vulnerabilities and was fixed as such.

The fact it did not make its way into a CVE is mainly related to how CVE naming and reservation work, nothing more.


> The fact it did not make his way through a CVE is mainly related to how CVE naming and reservation works, nothing more.

Still... that little detail skews any comparison of the numbers of CVEs for Rust and C++ by an enormous amount.


> Many of the most damaging recent security breaches happened to code written in MSLs (e.g., Log4j) or had nothing to do with programming languages (e.g., Kubernetes Secrets stored on public GitHub repos).

I’m surprised Herb is so defensive here. He normally strikes me as level-headed but he’s arguing in bad faith here. There’s no way a language can prevent arbitrary code execution if the programmer intentionally wants to allow it as a feature & then doesn’t think through the threat model correctly, or fails to manage infrastructure secrets (the latter, btw, is mitigated by Microsoft’s own efforts with GitHub secret scanning, although there should be more of an industry effort to make sure that all tokens are identifiable).

But C/C++ is a place where 60-80% of the vulnerabilities are regularly basic things the language can mitigate out of the box. No one is talking about perfection. But it’s disappointing to see Herb stuck arguing “there are other problems, and even if memory safety is an issue, Rust has problems too”. The point is that using Rust is a step-function improvement that provides programmers with the right tools so they can focus on the other security issues. A Rust codebase will take more $ to exploit than a C/C++ one because it will be harder to find a memory vulnerability, which is easier to chain into a full exploit than attacking higher-level stuff that is more application-specific.

EDIT: And language CVEs are a poor way to measure the impact of switching to Rust because it’s the downstream ecosystem CVEs that matter. I’m really disappointed this is the attitude from the C++ community. I have a lot of respect for the people working on it, but Herb’s & Bjarne’s response feels like an unnecessarily defensive missive to justify that C++ is still relevant instead of actually fixing the C++ ecosystem problems (namely, their standards approach is making them move too slowly to ever shore up their weaknesses).


Their defensiveness makes sense in the current context where in the last few weeks organisations like the White House [1] and Google [2] are explicitly calling out the importance and imminent need of moving away from memory unsafe languages like C and C++. If everyone focussed on this one issue, it is possible that we might actually start moving away from C and C++ in the next 5-10 years.

Sutter pointing out that memory safety isn't the only vector for system vulnerabilities would have the effect of spreading cybersecurity efforts and budgets across all of them. In that case memory safety isn't the foremost problem it's being portrayed as, and it isn't worth migrating away from C and C++.

[1] - https://www.whitehouse.gov/wp-content/uploads/2024/02/Final-...

[2] - https://research.google/pubs/secure-by-design-googles-perspe...


The White House reports also acknowledge that memory safety isn't the only issue, but instead, is a large one that movement can be made on. From page 8 of that report:

> To be sure, there are no one-size-fits-all solutions in cybersecurity, and using a memory safe programming language cannot eliminate every cybersecurity risk. However, it is a substantial, additional step technology manufacturers can take toward the elimination of broad categories of software vulnerabilities.

And of course, Google as well.

Not even the most fervent memory safety advocates believe that it is the sole thing that will make software secure, so arguments like this come across a bit strange.


No, I completely agree, fixing memory safety is not the only thing that needs doing, far from it. I agree with the White House report in particular, which spends time talking about the responsibilities of C-suite execs in ensuring their software is vulnerability-free. That's good, actionable advice.

I called out those two reports because for the first time in forever, there's actual impetus to move away from C and C++. That challenges the standards committee's usual stance that the status quo is acceptable. That's why we see Herb Sutter actually engaging with the issue of memory safety here. Compare that with Bjarne Stroustrup's earlier glib dismissal of these concerns, where his talk started with "The Case Against Switching Languages". Kinda shows where his priorities lie.


But he’s not engaging with the issue of memory safety here.

> Since at least 2014, Bjarne Stroustrup has advocated addressing safety in C++ via a “subset of a superset”:

> As of C++20, I believe we have achieved the “superset,” notably by standardizing span, string_view, concepts, and bounds-aware ranges. We may still want a handful more features, such as a null-terminated zstring_view, but the major additions already exist.

Sounds like Herb too believes that C++ is making good progress and that it’s a library issue. This is problematic when the default `[]` API that everyone uses has no bounds check. So then you change the compiler to have an option to always emit a bounds check. But then you don’t have an escape hatch when performance is important.

Herb is always defending against switching away from C++ and that C++ will solve the problems in a back compat way. They’ve been disrupted and they’ve taken a classical defensive approach instead of actually trying to compete which would require a massive restructuring of how C++ is managed as a language (e.g. coalescing the entire ecosystem onto a single front-end so that new language features only need to be implemented once). They need to be more radical in their strategy but that doesn’t gel with design by committee.


Fedora & downstream build with -D_GLIBCXX_ASSERTIONS, which enables bounds checking for many of those operator[] calls (including std::vector). For tight loops, GCC can often eliminate the bounds checks, at least if you use size_t (or size_type) for the loop iteration variable, not unsigned.


> This is problematic when the default `[]` API that everyone uses has no bounds check.

The default [] API can be replaced with C++ classes that do bounds checks. C++ 20 provides the std::array class to do precisely that, and std::span to implement fat pointers.

All that's missing to implement the subset-of-a-superset mode is a compiler option to disable native arrays in C++ code (but not in extern "C" code).


> The default [] API can be replaced with C++ classes that do bounds checks

Which means you subtly break the performance guarantees of code which makes migration to a new version more annoying.

> C++ 20 provides the std::array class to do precisely that

What? https://en.cppreference.com/w/cpp/container/array/operator_a...

> Returns a reference to the element at specified location pos. No bounds checking is performed

> option to disable native arrays in C++ code

Yeah, no, that's not the only place where bounds checking shows up. Lots of places use pointers as iterators because the language lets you. So even if you shut off those avenues, code that uses pointers as iterators would remain exploitable. Of course it's a step improvement, but there's just no way to close the barn door for C++ unless you sacrifice performance to such a degree that the obvious question becomes "why restricted C/C++?", which still has a bunch of footguns, is slow, and has a really inconsistent API and language surface.


He’s suggesting adding bounds checks automatically by the compiler, which is vastly more than Stroustrup was recommending. He reckoned merely running sanitizers was sufficient. He wasn’t even taking the problem seriously, as if it’s a given that the world will continue using C++ no matter what.

The fact that Sutter is willing to sacrifice performance for safety means at least he has woken up to the reality that the future may hold less C++ code than the past.


But without a way to recoup the performance when you need it, C++ potentially becomes as slow as things like Go or Java with extra footguns and slower developer speed. That's why Rust has `unsafe` and `unchecked` API methods that you can use in unsafe code to bypass bounds checks. And it's an extremely consistent API surface to deal with (not to mention a much better thread safety story, which Herb hand-waves away as "not important because other languages also have thread safety issues" even though he admits no one is as bad as C++ here).
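
For example, a sketch of the two slice-indexing APIs in question:

  fn sum(xs: &[u64], idxs: &[usize]) -> u64 {
      // Safe default: each access is bounds-checked and panics
      // (rather than reading out of bounds) on a bad index.
      idxs.iter().map(|&i| xs[i]).sum()
  }

  fn sum_unchecked(xs: &[u64], idxs: &[usize]) -> u64 {
      // Hot-path opt-out: sound only if the caller guarantees
      // every i < xs.len(); otherwise this is UB, as in C++.
      idxs.iter().map(|&i| unsafe { *xs.get_unchecked(i) }).sum()
  }

  fn main() {
      let xs = [1u64, 2, 3];
      println!("{}", sum(&xs, &[0, 2])); // 4
      println!("{}", sum_unchecked(&xs, &[0, 2])); // 4
  }

The point being that the checked form is the default and the unchecked form is greppable.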


I feel someone asserting this is the first time there's been an impetus coming from the government to move away from C and C++ must be entirely unfamiliar with the history of Ada.


People citing Ada's failure are unfamiliar with the history of Ada; otherwise they would acknowledge that the reasons it did not take off outside the DoD weren't technical at all, but rather: the hardware requirements (Rational started as an Ada Machine company), the price of the compilers, it not being part of the UNIX SDK from the UNIX vendors that sold Ada compilers (it was an additional expense), the hacker culture against bondage languages (as usually discussed on Usenet), ...


While they are similar, they are different: the move towards Ada was scoped purely to the Department of Defense. This situation is one where the government is also trying to work with and encourage practices in general industry.


I wonder how well that's going to work out. The software industry isn't exactly noted for taking technical advice from the White House...


In the request for comments before this was published, there was broad support from wide swaths of industry. Many organizations you've heard of are on board. Or at least, the public positions of their companies are; I don't know how well that translates to the rank and file.

I have to write up a post about this...


Please post it here when you do. I'd love to read it.


In this case the problem's a little harder than that. The advice is good advice that the industry is already happy to give to itself, but less happy to actually apply when there's some cost or education required.


Rust has an industry and a hobbyist ecosystem, whereas I’m not sure if Ada ever had the hobbyists on board.


It's perhaps a small community, but there sure are hobbyists using Ada (it's not a bad choice of a language for certain applications). With GNAT, Ada is quite accessible even. See, e.g. https://pico-doc.synack.me/


Only after GNAT came to be. Note that besides AdaCore, there are six other vendors still in business, with the typical defence contract prices; hardly easy to get hobbyists that way.


> Not even the most fervent memory safety advocates believe that it is the sole thing that will make software secure

You must be new to HN, yes?

;0


I sure see a lot of people claim that others do advocate that, but I rarely if ever see anyone actually advocate for that, and if they do, it's not someone who's representing any of the organizations advocating for this issue. It's a "make up a guy to get mad about" kind of situation.


The Whitehouse report should scare the shit out of anyone invested in better software. People think the report is a step in the right direction, it's not.

You do not want the blob of D.C. putting their eyes on anything that looks like a bottomless money pit for consultants and self-proclaimed experts. Once that happens and is codified to some degree in law, it's very difficult to change or remove.

This potentially affects every government system in existence, and those systems are already some of the most legacy systems today. The US Navy still pays Microsoft to maintain support for Windows XP, so the idea this will happen in any less than a 25 year horizon is absurd. And even then, the dates can be extended. Why put a stop to the gravy train when it can keep going -- it's not like the public is even aware of just how enormous the federal government really is. Once you understand this is an opportunity to extract billions of dollars from large organizations, you then have to ask what their lobbyists will do to change the laws in their favor to completely neuter the legislation codified into law.

I haven't seen anyone even consider this highly likely, if not almost certain, outcome.


I've read the report and it seems balanced and fair. I liked that they tackled how improvements could be made on many fronts, taking different approaches in each one. They didn't go overboard with any assertion or recommendation.

On one hand you're saying the problem is intractable and it'll take 25 years to solve. Then why are you criticising an effort to get the ball rolling?

You're frustrated by the Government using old, outdated and possibly insecure software. Then surely the White House exhorting the Federal government to fix these issues and procure software without issues is a good thing?

Of course any change is an opportunity for consultants to make money, but that doesn't mean the change isn't needed or that the White House is wrong for starting it.


Because there is no clear objective success criterion.

Further, once consultants are being paid, they're disincentivized to actually accomplish this incredibly broad and nebulous goal. It allows politicians to campaign on more secure computing while never actually accomplishing anything except profiting from their spouse being one of these consultants. And by never solving the problem, it continues, which only further justifies spending more money to "fix" the problem.

You might call this cynical; it's not, it's realistic. I challenge you to find an example where this level of corruption isn't taking place, and to explain what makes you so confident that will be the case for this specific issue, drawing specifically on areas of contrast.

If you can overcome the government corruption, you still have to overcome the lobbyists. You can't do both except in situations where the corporations are in cahoots with the government.


You are so terrified of government, and yet you don't recognise how powerful government actually is. You're scared of some overreaching law and excessive waste, but actually that's not what's happening here. This is the White House using its Bully Pulpit to effect change. That is at once more effective in this particular case (because a law forcing the use of a language would be unconstitutional) and less harmful (because people can choose to ignore it).


This is a gross mischaracterization. I would encourage you to re-read both of my replies so as to best respond to the points about how government involvement does not actually solve problems, but instead perpetuates problems because of institutional corruption.

Ad hominem attacks are not a counterargument to these inescapable facts, they're also against the community guidelines and do very little to persuade anyone to your position: https://news.ycombinator.com/newsguidelines.html


I’ve read far more coherent anti-government polemics than the one you’ve written. Those didn’t convince me, and I doubt re-reading yours will. They’re so greyed out I can barely make them out anyway.

What you’ve mistaken as an ad hominem attack was me trying to tell you - even though you’re concerned of what the government may do here by passing laws, they’re doing much more with much less effort. I’m surprised that you were unable to grasp that.


What exactly do you think this report accomplishes?

It's not legally binding like Congressional legislation or an executive order.

Nobody has been prevented from using memory safe languages prior to the report being published. I'm sure there are plenty of instances where consultants and contractors have been required to use C or C++ because it's written into hundreds of thousands of pages of antiquated government contracts, but this report isn't a magic wand that's going to change those contracts. You have to convince the most stuffy lawyers imaginable to change them, which is an expensive endeavor that no reasonable business is going to undertake unless it affects their bottom line, and that only happens after Congress passes a law. And prior to a law, there's going to be an army of lobbyists ready to carve out a waiver system, rendering any hope of improving software quality moot.

I'm of the opinion it's merely a clarion call for D.C. parasites to invent ever-more creative ways to waste tax dollars. Without clear objective success criterion defined by the government, the problem will persist indefinitely. And if you're a bureaucrat with friends and family making millions consulting on this problem, you're disincentivized to solve anything. Why cut off the hand that feeds you.

I'm eager to hear your thoughts about these undeniable problems.


This is what I mean. I said you had no idea how the government gets things done and you took it to heart, quoting the HN guidelines and everything. The government doesn't need to pass laws to get its way. Think on that for a second - we're taught that changes can only be made by laws, and yet the government is doing something here that involves no law being passed, no regulation issued. A simplistic libertarian who distrusts government might view this as a simple waste of time, but it's actually an effective way to get things done.

Like I tried to tell you, this is jawboning. Here's an example of various elected officials using it against a social media company (https://knightcolumbia.org/blog/jawboned). It really works, which should scare you more.

Next, I'll assume you've read the report [1] in full, every page, like I did. But I'll add relevant excerpts that demonstrate that this isn't about starting some "War on Memory Unsafety" (my words), but rather encouraging the software industry to adopt better practices, at no cost to the taxpayer.

- Building new products and migrating high-impact legacy code to memory safe programming languages can significantly reduce the prevalence of memory safety vulnerabilities throughout the digital ecosystem.xi To be sure, there are no one-size-fits-all solutions in cybersecurity, and using a memory safe programming language cannot eliminate every cybersecurity risk. However, it is a substantial, additional step technology manufacturers can take toward the elimination of broad categories of software vulnerabilities.

- Formal methods can be incorporated throughout the development process to reduce the prevalence of multiple categories of vulnerabilities. Some emerging technologies are also well-suited to this technique.xxvi As questions arise about the safety or trustworthiness of a new software product, formal methods can accelerate market adoption in ways that traditional software testing methods cannot. They allow for proving the presence of an affirmative requirement, rather than testing for the absence of a negative condition.

Then it talks about the role the CTO, CIO and CISO can play in an organisation to improve cybersecurity readiness.

- The CTOs of software manufacturers and the CIOs of software users are best leveraged to make decisions about the intrinsic quality of the software, and are therefore likely most interested in the first two dimensions of cybersecurity risk. In the first dimension, the software development process, the caliber of the development team plays a crucial role. Teams that are well-trained and experienced, armed with clear requirements and a history of creating robust software with minimal vulnerabilities, foster a higher level of confidence in the software they produce.xxxvi The competence and track record of the development team serve as hallmarks of reliability, suggesting that software crafted under their expertise is more likely to be secure and less prone to vulnerabilities.

- A CTO might make decisions about how to hire for or structure internal development teams to improve the cybersecurity quality metrics associated with products developed by the organization, and a CIO may make procurement decisions based on their trust in a vendor’s development practices.

- The CISO of an organization is primarily focused on the security of an organization’s information and technology systems. While this individual would be interested in all three dimensions of software cybersecurity risk, they have less direct control over the software being used in their environments. As such, CISOs would likely be most interested in the third dimension: a resilient execution environment. By running the software in a controlled, restricted environment such as a container with limited system privileges, or using control flow integrity to monitor a program at runtime to catch deviations from normal behavior, the potential damage from exploited vulnerabilities can be substantially contained.

So you, unlike the people who never read the report, would know that this report was all about educating firms on ways that they can become more secure. At no point does it talk about what the federal government might or might not do. It doesn't involve any spending, any corruption, any laws, any lobbyists, anything that people scared of Big Government might worry about. Not a single dollar spent.

And already it is having results. A few days later, Google published a report that broadly agrees with everything the White House is saying, and talking about their implementation plan. Especially for the millions of lines of C++ in the most used software among regular people - Android and Chrome. [2]

[1] - https://www.whitehouse.gov/wp-content/uploads/2024/02/Final-...

[2] - https://security.googleblog.com/2024/03/secure-by-design-goo...


Virtually everyone wants to improve software and make it more secure. We're approaching it from different angles and it's being confused for disagreement on the topic.

There's a lot of value in having good faith discussion from each perspective so we can mitigate downside risk while enhancing the upside goal. I'm eager to hear your thoughts on the undeniable problems restated in my previous replies.

Google is not a good example -- they're technically competent and were already doing work in Rust. It seems like focusing on small software shops still writing C++89 code would be a better use of mental energy. Are there any examples of those kinds of businesses using this WH report to steer their roadmaps or technology directions?


I can explain it to you but I can't understand it for you.

I've tried to show you how jawboning works, but you're still steeped in a mindset where the government "undeniably" coerces through legislation and regulation.

> I'm sure there are plenty of instances where consultants and contractors have been required to use C or C++ because it's written into hundreds of thousands of pages of antiquated government contracts

Could you show some instances of these contracts? That's on you, I can't prove a negative.

You're imagining that there must be legislation requiring C++ and therefore it's impossible to get a change away from C++ by just talking.

> there's going to be an army of lobbyists ready to carve out a waiver system, render any hope of improving software quality moot.

Now you're beginning to understand why they didn't go with a coercive law or regulation. When people are coerced, they demand carve outs. When you ask nicely, like the White House have here, they may consider it. And there's nothing wrong with carve outs per se. For example, thousands of ships/planes and other systems are going to use Sqlite as their database and that's written in C. No sense in demanding a database in Rust because frankly, Sqlite is proven software, deploying on billions of devices in use today. It deserves a carve out.

> Without clear objective success criterion defined by the government, the problem will persist indefinitely.

Why would you think this is the last you're ever hearing of this? This problem would take a decade to solve, at the most optimistic. Why are you demanding a perfect solution on day one? All they've done so far is pointing out ways the industry can do better. Maybe next year they change the procurement criteria for some defence contracts. Maybe the year after that they change the procurement criteria for all government contracts. They can try different things, iterate on them.

They can look at the success of industry initiatives like https://memorysafety.org in a couple of years and see if that's something they should invest in themselves.

> It seems like focusing on small software shops still writing C++89 code would be better thing to focus mental energy on. Are there any examples of those kind of businesses using this WH report to steer their roadmaps or technology directions?

Are you asking if there are some small shops who have responded to this 3 week old report, completely changed the direction of their business and published a report about it? Even if they had, how would I have heard about it? Google's report reached the front page of HN and that's where I saw it, a small company would struggle to reach that kind of exposure.

Your clear distrust of government makes you unable to see that what they've done is a small, effective step in the long march towards improving software security. That's why you set impossible standards for them ("objective success criterion", "proof of small shops adopting it") and then immediately think you're correct when you see they fail to meet those standards. I can't change how you feel about government, so there's not much left to say. If you feel your "undeniable" points haven't been addressed, I'm not going to attempt it again.


Politicians are not campaigning on this at all. This is a niche topic that only impacts software developers. Nor is this setting out milestones for switching. It’s just advice saying “hey guys, consider other alternatives to C/C++”. It’s a social pressure - there’s no force of law behind this yet. And at most the government can only compel what their own vendors do.


So what else do you propose?


The table stakes here is automatic bounds checking. This is something that pretty much every newer language does already, and even several older languages figured out how to do well.

The problem in C/C++ is that pointers don't inherently communicate their bounds, so your options for adding automatic bounds checking are a) fat pointers, and consequently a (severe) ABI break; b) some sort of shadow memory to store bounds info (ASan, generally considered inadvisable to use in production); or c) changing the language to communicate what the bounds of a pointer are. The good news is that most interfaces will provide the bounds of a pointer as another member of the struct or another parameter of the function it's part of; the bad news is that actually communicating that information requires a scope lookup change that is hard to get through the committees.
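
For contrast, a sketch of what option (a) looks like when the language carries it from day one, as Rust's slices do:

  fn main() {
      let xs = [10u8, 20, 30];
      let s: &[u8] = &xs;
      // A slice reference is a fat pointer: the data pointer and the
      // length travel together (two words on common platforms), so
      // the compiler can always emit the bounds check.
      assert_eq!(std::mem::size_of::<&[u8]>(),
                 2 * std::mem::size_of::<usize>());
      println!("{}", s[1]); // checked: s[9] would panic, not corrupt
  }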


Things like CHERI, Fil-C, and CCured make pointers just carry their bounds.

It’s not an unfixable problem.

I wish we were talking about fixing it, not making excuses.


The problem with fixes on things this low-level is that they carry the potential to break lots of code. Since broken code has to be fixed, you then get into the "why not just rewrite it in <insert new hotness here>?" argument, which is headed off by just not fixing it.

C/C++ maintainers knew this and didn't want to see their lives' work made less significant. Now the issue's been forced by (among other things) one of the world's most influential software customers, the US Federal Government, implying that contract tenders for software written in languages like Rust will have an advantage over those written in languages that don't take memory safety as seriously.


CHERI claims that the number of changes required is exceedingly small.

Fil-C is getting there.

So, C has a path to survival.

> The problem with fixes on things this low-level is that they carry the potential to break lots of code. Since broken code has to be fixed, you then get into the "why not just rewrite it in <insert new hotness here>?" argument, which is headed off by just not fixing it.

“Lots” is maybe an overstatement.

Also, if there was a way to make C++ code safe with a smaller amount of changes than rewriting in a different language then that would be amazing.

The main shortcoming of CHERI is that it requires new HW. But maybe that HW will now become more widely demanded and so more available.

The main shortcoming of Fil-C is that it’s a personal spare time project I started on Thanksgiving of last year so yeah


> CHERI claims that the amount of changes are exceedingly small.

Oh, man. Yes, they do. Many people have been claiming that for decades.

When can we expect one of them to claim it's done?

(To be fair, the amount of changes required has been diminishing through those decades.)


I think the hardest part about CHERI is just that it's new HW. That's a tough sell no matter how seamless they make it.


CHERI has hardware in the form of ARM Morello and CHERI RISC-V running FreeBSD, making it easy to check their claims.


CHERI is effectively a mix of options a and b in my categorization, necessitating hardware changes, ABI changes, and limited amounts of software changes. I'm not familiar with the other options in particular, but they likely rely on a mix of ABI changes and/or software changes, given the general history of such "let's fix C" proposals.

ABI breaks are not a real solution to the problem. When you talk about changing the ABI of a basic pointer type, this requires a flag day change of literally all the software on the computer at once, which has not been feasible for decades. This isn't an excuse; it's the cold hard reality of C/C++ development.

There is no solution that doesn't require some amount of software change. And the C committee is looking at fixing it! That's why C23 makes support for variably-modified types mandatory--it's the first step towards getting working compiler-generated bounds checks without changing the ABI and with relatively minimal software change (just tweak the function prototype a little bit).


Wouldn’t you have to recompile all your dependencies or run into ABI issues? For example, let’s say I allocate some memory & hand it over to a library that isn’t compiled with fat pointers. The API contract of the library is that it hands back that pointer later through a callback (e.g. to free or do more processing on). Won’t the pointer coming back be thin & lose the bounds check?


Compile everything memory safely and then no problem.


Fil-C sounds like an amazing project!

Do you have any guesses on whether it could easily target WebAssembly? I'd imagine many people would like to run C code in the browser but don't want to bring memory unsafety there.

link: https://github.com/pizlonator/llvm-project-deluge/blob/delug...


How much code out there does stuff to the effect of

  union MyObject {
    void* ptr;
    unsigned long data;
  };
  (...)
  MyObject obj;
  obj.ptr = (void*)some_function;
  (...)
  store_context(obj.data);  /* the pointer leaves as a plain integer */
And what would happen to such code if pointers are suddenly fat?


CHERI handles that by dynamically dropping the capability when you switch to accessing memory as int.

Fil-C currently has issues with that, but seldom - maybe I've found 3 such unions while porting OpenSSL, maybe 1 when porting curl, and zero when porting OpenSSH (my numbers may be off slightly but it's in that ballpark).


The reason they don't communicate their bounds is also a performance optimisation. You can certainly do it in C++: use a std::vector, for example, and use the .at() method to index into it, and it'll throw an exception on an out-of-bounds access unless you disable that with a compiler flag.

The thing is, it's fine to take that risk if you're writing HPC simulation software, but it's much less fine if you're writing an operating system or similar.


The performance and power use cost to checking bounds is trivial!

Apple has tested this, on mobile devices even, when working on -fbounds-safety. From the slides:

    System-level performance impact
    • Measurement on iOS
    • 0-8% binary size increase per project
    • No measurable performance or power impact on boot, app launch
    • Minor overall performance impact on audio decoding/encoding (1%)
    • System-level performance cost is remarkably low and worth paying for the security benefit
Some more specific synthetic benchmarks suites reported ~5% runtime cost for bounds checking.

https://www.youtube.com/watch?v=RK9bfrsMdAM https://llvm.org/devmtg/2023-05/slides/TechnicalTalks-May11/...

Bounds checking being omitted due to performance is mostly a myth; the only time this should ever be believed is in very specific circumstances, such as performance-critical code where the impact has actually been measured!


Whether it's trivial or not depends totally on the workflow. A 5% runtime cost can be enormous - when I was in academia I was running thousands of simulations on big clusters like ARCHER, some of which could take up to a fortnight to run. In those cases, a 5% cost can add a whole other working day to the runtime!


> Whether it's trivial or not depends totally on the workflow.

People here are talking about language defaults, and that the default should be safe, and while, yes, technically you can construe a workflow they're not going to work for, they work for most.

That doesn't prevent your ARCHER simulation from calling — hopefully only at sites that profiling indicates need it — .yolo_at(legit_index_totes) (or whatever one might call the method) & segfaulting after burning a few days worth of CPU time away.


Do you believe that is a common case, or an exceptional one?


I don't think it's particularly exceptional for the sorts of people that are still using C++ (and making a conscious decision to do so over Rust, for example).

If you're writing 'standard' C++ these days, you're probably already making use of std::array, std::vector, etc. anyway. The only areas where I haven't seen so much of that in modern codebases are HPC stuff and embedded.


Yeah, “also” a performance optimization.

It’s also just legacy. We’ve always done it that way so we still do it that way for ABI compat and because it’s hard to find a compiler that does it any other way.

Imagine if the story was: “you totally can have a bounds on your ptrs if you pass a compiler flag and accept perf cost”.

I bet some of us would find that useful.


> You can certainly do it in C++

You can do it in C as well, although it's a lot clunkier. I've been doing so for decades when the effort is appropriate to the task.


The problem is fixable in C++. std::span is the fat pointer; std::array is the checked array. All that's missing is a compiler option that gives warnings/errors when the legacy native [] features are used.

C is probably unfixable. But that's a different language.

Presumably compilers would allow conversion of spans to native pointer arguments when calling methods declared as "extern 'C'".


Visual Studio does exactly that, yet most devs don't care until the government steps in.


The problem is existing practice. GCC solved this problem for function parameters a long time ago with parameter forward declarations, but other compilers did not copy this GNU extension, and nothing else really emerged... This makes it hard to convince the committee to adopt it.

In structs there is no existing extension, but a simple accessor macro that casts to a VLA type already works quite well, and existing code can be refactored to use it.

There are still some holes in UBSan, but otherwise I think you can write spatially memory-safe C more or less today without problem. The bigger issue is temporal safety, so the bigger puzzle piece still missing is a pointer ownership model as enforced by the borrow checker in Rust.
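
For reference, the kind of temporal-safety bug that borrow checker rejects at compile time (this sketch intentionally does not compile):

  fn main() {
      let first;
      {
          let v = vec![1, 2, 3];
          first = &v[0]; // borrow of `v`
      } // `v` is freed here while `first` still points into it
      // rustc refuses: error[E0597]: `v` does not live long enough
      println!("{}", first);
  }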


> There are still some holes in UBSan, but otherwise I think you can write spatially memory-safe C more or less today without problem.

I wouldn't call it a solved problem until gcc and clang have an auto-inserts-bounds-check flag that does the equivalent of a Rust panic on every out-of-bounds array access, is considered usable on production code [1], and works on most major projects (that care enough to change their source to take advantage of this flag). Overall, the problem isn't so much that we don't know how to write safe C code; it's that the compiler doesn't quite have enough information to catch silly programmer mistakes, and the current situation is juuuuust bad enough that we can't feasibly make code that doesn't tell the compiler enough error out during compilation.

> The bigger issue is temporal safety, so the bigger puzzle piece still missing is a pointer ownership model as enforced by the borrow checker in Rust.

Temporal safety is interesting in part because it's not clear to me that there currently exists a good solution here. The main problem, like existing partial solutions for spatial memory safety, is that the patterns to make it work well are known, but programmers tend to struggle to apply all of the rules correctly. Rust's borrow checker is definitely a step up from C/C++, but at the same time, there are several ownership models that it struggles to be able to express correctly, even if you ignore the many-readers-xor-one-writer rule that it also imposes. Classic examples are linked lists or self-referential structs, but even something like Windows' IOCP can trip up Rust's lifetime system.

Although, at the very least, a way to distinguish between "I'm only going to use this pointer until the end of the function call" and "I'm going to be responsible for freeing this pointer you give me, please don't use it any more" would be welcome to have, even if it is a very partial solution.

[1] Don't get me wrong: the development of the sanitizers is an important and useful tool for C/C++, and I strongly encourage their use in test environments to catch issues. It's just that they don't meet the bar to consider the issue solved.


Sanitizers without runtime, i.e. -fsanitize=bounds -fsanitize-trap=bounds, can be used in production? And I think it can be used on existing projects by refactoring. Catching this at compile-time would be better, but Rust also can't do it and it is not needed for memory safety. And I think the solution C converges to (dependent types) actually would allow this is many cases in the future, while this is difficult without them. I fully agree about your other points.


The defensiveness is entirely understandable. There's a very vocal contingent of the industry who is increasingly hostile to anyone who dares to say that C/C++ isn't pure evil. Defensiveness is the natural reaction to that sort of thing.


Instead of defensiveness, why not talk about the ways in which the C++ committee is changing how they're operating (or even leaving ISO) and changing their culture to shore up these things.

Look at past Scott Meyers talks (e.g. [1]). He highlights how the committee has arbitrary set of principles that can be applied to justify or reject any proposal and the inconsistencies in the language are a reflection of this.

This isn't a problem of the language itself but rather that it's designed by committee with 4 major front-end implementations (Intel, MSVC, Clang, GCC, although I believe Intel is standardizing on the Clang front-end at least). Organizational issues like that are tough to spot, but it's become clear now for several years that Rust is going to beat C++ silly if the C++ committee doesn't clean up its act and steer the C++ community a bit better (e.g. still no ABI specification, no improvements on macros, no standardized build system, modules are a joke, no standardized package system, etc. etc.). They're not effective stewards, not least because they can't even take good, impactful ideas from Rust and copy them.

[1] https://www.youtube.com/watch?v=KAWA1DuvCnQ


C and C++ could both easily add this feature:

https://www.digitalmars.com/articles/C-biggest-mistake.html

which is adding slices to the native language. This will eliminate buffer overflow errors (if the user uses that feature). D has had this from the beginning, and while it doesn't cover everything, that feature alone has resulted in an enormous reduction in memory safety errors.

BTW, D also has a prototype ownership/borrowing system.


C++20 added std::span, which is essentially that.


Without bounds checking.


Hey. Be reasonable. That's coming in C++26 using the `.at` interface that no one actually uses because `[]` is more natural, shorter (2.5x shorter), and convenient.


Yeah, that is exactly the problem. :)


> although there should be more of an industry effort to make sure that all tokens are identifiable

There is, BTW: https://datatracker.ietf.org/doc/html/rfc8959

Getting people to use the standard is another matter.


Sutter correctly points out that when using pointers in safe Rust, one is generally limited to tree structures, but that one can express cyclic structures in safe code using other means (such as reference counting or integer indices):

> One reason is that Rust’s safe language pointers are limited to expressing tree-shaped data structures that have no cycles; that unique ownership is essential to having great language-enforced aliasing guarantees, but it also requires programmers to use ‘something else’ for anything more complex than a tree (e.g., using Rc, or using integer indexes as ersatz pointers); it’s not just about linked lists but those are a simple well-known illustrative example.

But then later on, seems to ignore the safe alternatives and commits a non-sequitur:

> That’s because a language’s choice of safety guarantees is a tradeoff: For example, in Rust, safe code uses tree-based dynamic data structures only. This feature lets Rust deliver stronger thread safety guarantees than other safe languages, because it can more easily reason about and control aliasing. However, this same feature also requires Rust programs to use unsafe code more often to represent common data structures that do not require unsafe code to represent in other MSLs such as C# or Java, and so 30% to 50% of Rust crates use unsafe code, compared for example to 25% of Java libraries.

In other words, Sutter acknowledges the safe alternatives at one point, but then ignores them later. Ignoring them allows Sutter to draw a conclusion as to why some percentage of Rust code uses `unsafe`. And even if ignoring those alternatives was appropriate here, I still see no reason to believe that 30%-50% of Rust crates use `unsafe` precisely because of the limitations around cyclic structures. There are many more reasons to use `unsafe`. [1]

[1]: https://thenewstack.io/unsafe-rust-in-the-wild/


> 30% to 50% of Rust crates use unsafe code

This is particularly misleading for two reasons:

1) There are a lot of `-sys` crates which link to C libraries, so the "unsafe" code in these crates comes from the binding to C rather than some limitation of Rust. Often there are 100% safe Rust alternatives to these C libraries as well.

2) Of the remaining crates which use "unsafe", the unsafe code is often contained to a tiny percentage of the code, so if we're looking at the overall amount of unsafe code, you're going from 100%, to a fraction of a percent.


Correct.

By any fair measure, unsafe Rust is a tiny fraction of total Rust code.

In my experience, the community eschews it with a fervor bordering on religiosity.


> 1) There are a lot of `-sys` crates which link to C libraries, so the "unsafe" code in these crates comes from the binding to C rather than some limitation of Rust. Often there are 100% safe Rust alternatives to these C libraries as well.

Sure... but it's still unsafe code, with the consequences of unsafe code. It also doesn't explain the 30% to 50% vs 25% in Java. My guess would be that Java just cares less about perf. I mean, if blaming a language's "unsafeness" on C is an option, I submit that a lot of the unsafe parts of C++ were imported wholesale from C.

> 2) Of the remaining crates which use "unsafe", the unsafe code is often contained to a tiny percentage of the code, so if we're looking at the overall amount of unsafe code, you're going from 100%, to a fraction of a percent.

Three points:

First, code volume is a good proxy only when the code samples are uniformly distributed: while the unsafe code might be a tiny percentage of the code base, it might represent a large percentage of runtime and/or a large part of the algorithmic complexity of the whole project.

Second, going from 100% to a tiny fraction implies that C++ is 100% unsafe... and that's not the case. There is a safe subset somewhere deep in there.

And third, "unsafe Rust" is not safer than "regular C++". Mixing those three factors makes the final "safety" tabulation much more complicated than just counting the number of unsafe regions.


> Of the remaining crates which use "unsafe", the unsafe code is often contained to a tiny percentage of the code, so if we're looking at the overall amount of unsafe code, you're going from 100%, to a fraction of a percent.

I dislike this argument because Rust unsafe code is typically placed into a module with safe code around it protecting it.

Guess how good C++ code is written?

Exactly. The unsafe keyword certainly helps but is not a panacea nor a guarantee given that a bug in the safe code that's protecting the unsafe code could be the root cause for security issues, even if it manifests itself in the unsafe code.


C++ can't generally encapsulate safety. Rust can generally encapsulate safety. That's the essential difference. It's true that the boundaries of safety in Rust extend to the scope of the module in which `unsafe` is used, but C++ has no boundaries at all.

> but is not a panacea

Do you have a source to literally anyone credible saying Rust is a panacea?

> nor a guarantee given that a bug in the safe code that's protecting the unsafe code could be the root cause for security issues, even if it manifests itself in the unsafe code.

This just in. Rust doesn't guarantee bug-free code. Holy shit. What a revelation! Like, really? That's your problem with the argument? That it doesn't live up to the standard that bugs can't exist?

The value proposition of Rust has been, is, and always will be that it can encapsulate a core of `unsafe` usages in a safe API with no or very little overhead. The promise of Rust is that this power of encapsulation will lead to less undefined behavior overall. Not that it literally makes UB impossible, because, well, yes, you can have bugs in `unsafe`!

To head off the pedants, yes, not everything can be safely encapsulated in Rust. File backed memory maps are my favorite example. And yes, bugs in not just `unsafe` blocks but bugs in the language implementation itself can lead to UB.

And yes, Rust achieves this through a trade off. As Sutter mentioned. None of this should be controversial. But what Sutter misses is a fair analysis of this trade off IMO. He does a poor job at representing its essential characteristics.


I would have guessed most uses of unsafe dealt with c abi bindings. People do like to create interesting data structures in rust, but I don't see any of that in my cargo tree, and I suspect most people don't.


> But then later on, seems to ignore the safe alternatives and commits a non-sequitur

I think it's possible to read the argument in the way that make sense.

I don't think that Herb is making the point here that Rust code uses more unsafe code specifically "because" of the tree-based dynamic data structure restrictions. It seems that he took this simply as an illustration of a larger point.

The point he is making is that programming language safety is always a trade-off. Rust and Java are making different trade-offs, which results in a different level/kind of safety.

We should also note that even if there are "safe" alternatives in Rust, it doesn't mean those alternatives are viable in terms of code complexity, performance, cache behavior, etc. So the existence of safe alternatives doesn't negate the need to use unsafe if one doesn't like the trade-offs those safe alternatives imply.

I don't think it's a particularly big problem that more parts of Rust are unsafe vs Java; it probably reflects Rust's focus on native performance more than anything else.

But for the purposes of Herb's conversation here, this fact is important to note.


It's definitely a trade off. But Sutter is extremely selective in the analysis put forward on that trade off. Selective enough that I find the wording pretty misleading overall. And that particular paragraph is still a non-sequitur. The language used is pretty clearly "because foo, bar happened." Emphasis mine:

> However, this same feature also requires Rust programs to use unsafe code more often to represent common data structures that do not require unsafe code to represent in other MSLs such as C# or Java, and so 30% to 50% of Rust crates use unsafe code, compared for example to 25% of Java libraries.

This isn't a carefully considered statement about trade offs. This is sloppy reasoning. For example, if the reality were that 99% of that `unsafe` code were just FFI interactions with programming languages that are memory unsafe by default, then that would kind of undercut the entire point here. That is, it wouldn't really be Rust's fault. It would be the fact that we've lived in a memory unsafe hegemony in systems languages for the last several decades. That's just reality.

Of course, I'm sure it isn't 99%. But I'm also sure it isn't 1% either. The reality is so much more interesting than "30%-50% of all crates have `unsafe` in them." And that reality could very well undercut the entire point Sutter is making in this part of the article.


Perhaps you should try the steelman technique rather than interpreting his words through a lens of negativity.


What's the overall steelman here? Something like, "Rust isn't a silver bullet. Let's make a subset of a superset of C++ that's safe." Okay, cool. Sounds great, and the attempt at making C++ safer without breaking existing users certainly seems like an uncontroversially good thing to try. I don't really have any response to that.

I don't need a "lens of negativity" to criticize the details of Sutter's argument. This part of the article in particular would be a lot weaker overall if Sutter presented the reality instead of some sloppy non-sequitur. And that actually matters for his argument, because it cuts to the heart of just how big of a trade-off Rust really is. If the trade-off isn't as big as he seems to be suggesting, then Rust's value proposition gets stronger. It's a critical flaw in his reasoning.

Steelmanning is great and we should try to do that. But I don't actually see a stronger version of Sutter's argument here. It's not just a phrasing issue, although the phrasing is certainly what jumped out to me. And I could just as easily say that the problem with Sutter's article is that he isn't doing a very good job of steelmanning the case for Rust. Whoop de do.


[flagged]


"do as I say, not as I do"


> When using pointers in safe Rust, one is generally limited to tree structures.

For the sake of completeness, it's possible to express certain cyclic graph patterns without overhead in safe Rust. But it's usually too much of a hassle to bother with. https://docs.rs/ghost-cell/latest/ghost_cell/


Thanks. I had forgotten about `ghost-cell`.


"How do I get to Dublin?" "Well I wouldn't start here"

I've been a c++ dev for 30 years. I'll be retiring in a decade (or perhaps keep going), and I doubt I'll ever see a 'safe' C++ in my working life.

If the committee were truly serious they'd adopt epoch releases and feature flags in the language spec (see Circle, doing it for real), starting in C++26, the next release, and adopt tooling as part of the standard like every other language. As usual, C++ is the outlier here.

Then meaningful progress might start to be made in a measurable and testable way.

Instead I'll read another dozen articles like this over the next 5-10 years, and any talk of tooling or epochs will be kicked down the road for another few years, as usual.

But it's been a good lesson for all the other languages for better and worse.


IMHO, Herb Sutter was spot on in his talk (https://youtu.be/fJvPBHErF2U) that it is impossible to walk back features of a language once they are in the wild. His cppfront is an interesting and refreshing approach. I think his "create a TypeScript for C++" idea, i.e. generating safe, easy-to-understand, conforming C++, is the optimal path for where we are. Dealing with legacy codebases to evolve the language into something else is monumentally more difficult. And I tend to argue that maintaining legacy C++ systems with decades of features and bug fixes is okay for the right applications today.


That's Hyrum's law, basically.

With feature flags and epochs, it is possible to evolve C++ in the right direction with a minimum of friction and make it opt-in in a consistent fashion. Circle already proves this is a feasible approach, and that's a one-man project. There's also clang tooling that modernizes old C++, and huge amounts of work on LSPs and the like for refactoring, so it is possible to provide tooling that mechanically fixes up a large portion of old code.


As much as I appreciate the intentions of those working on the successor languages, they have invariably already sunk the ship before even setting sail, in my opinion. They are already too fragmented: Carbon, Cppfront, Circle and probably more. By fragmenting, they destroyed the only viable future I saw for them. If you want the goals of those languages, there is Rust today, which gives you that and more. I'm not alone in that opinion; I know several people on the C++ standards committee who think this way. Even fluent interop is a race they are losing.


My leading comment - "How do I get to Dublin?" "Well I wouldn't start here"

All of the alternatives you (and I) mention - Carbon, Cppfront, Circle, aren't really fragmentation. They're all experiments at the moment, and all useful in testing out ideas in various different independent directions.

The only comment I'll make about Rust is that it's going to do a C++, and inspire another generation of languages (e.g. Hylo) that will take lessons from it and improve: perhaps a nicer syntax, or better type inference and generics, or more formal proofs baked into the language. It's not an end-point.

Can C++ be truly safe? I don't know, to me it seems totally at odds with where the language started, choices made and baked into the language, and where the very broad community and committee currently are.


I agree, Rust will certainly not be the end-all language. The original language creator Graydon Hoare has interesting insights into what's next: https://graydon2.dreamwidth.org/253769.html

Regarding successor languages, I fail to see how they are innovating, though. Their sales pitch, as far as I know, boils down to "Look Ma, we have Rust at home", which I don't find very compelling.


How did Carbon, Cppfront, and Circle sink the effort for safer C++?

Did TypeScript, Flow, Closure, Elm, ReScript sink the effort for safer JS?


While your examples might sound like a similar situation, I'd argue they differ in significant ways. Web was and is the largest software delivery platform. From that perspective it's more akin to x86 machine code. For a long time it was the format you needed to output to partake in that ecosystem without alternatives. A more fitting comparison in my eyes is Kotlin for Java. And if there would have been a single project the community backed, I could have seen a similar situation arising. But as it stands, I don't see that happening.


Kotlin, Scala, Groovy, Clojure

(None of these looking to make Java safer per se, but rather more usable/less verbose.)


With cppfront, Sutter is explicitly not trying to create a successor language. His goal (like TypeScript to JavaScript) is to provide a complementary, parallel language which provides complete forwards/backwards compatibility with C++. His hope is that (like TypeScript) the languages and communities evolve together. Any other technology that involves developing new tools, migrating code, or lacks completely frictionless interop will be inherently fragmentary. It reminds me of the classic XKCD comic: https://xkcd.com/927/


That is like saying C++ and Objective-C aren't successor languages to C, because their original implementations compiled to C.


That's a nice goal, and if the large majority of the community would back this one attempt it might work. But the presence of Carbon and Circle make that a pipe dream IMO.


In recent versions of C++, circa C++17 but especially C++20, it has become possible to build an alternative implementation of the language from within C++ that makes a different set of guarantees and tradeoffs. This comes at the cost of no longer being able to use the standard library, particularly the vast legacy parts.

This is how I expect the issue to be addressed because there is too much invested in backward compatibility. Some C++ development, usually around high-performance high-reliability systems, already operates this way to a significant extent, using an alternative re-spin of the language primitives from the ground up that make stricter guarantees and enable greater composability and verification. C++20 in particular makes a nice base for building this.

You won't be able to force people to do things in a rigorous way, but providing a complete alternative goes a long way toward enabling safer code. I've seen alt standard C++ reduce bugs by an order of magnitude in real code bases (generally, since memory safety bugs are pretty rare in modern C++ in my experience).


C++20 doesn't stop someone writing new/delete/malloc/free, or using pointer arithmetic. For me to be sure your code is 'safe', I'd have to audit it visually, and that's the worst of all worlds.

The rigor can be put in the tooling, at the compiler level, as a flag. Then there's no 'forcing', it's just not possible to do certain things. Which is what is being advocated.

Hoisting it a level further, to the build system, means downstream dependencies can (also) be compiled 'safely', or required to be safe, by default.

Sticking a [[safe]] attribute into the language for regular source files would be a disaster. What's the point of one safe function in a library when the rest of it isn't safe?

Having said all that, I don't disagree with anything you've written regarding idiomatic c++20 reducing safety issues, it's just that it's a problem that can be and should be addressed mechanically in the tooling. I'm just really, really, doubtful that C++ will ever get anywhere near 'safe' in the next decade, or, really, next couple of decades.


Any advice on how to write modern C++20 while avoiding all of the usual pitfalls, if that option is already available to C++ devs as of today? Is there a "C++ in 2024, the good parts" type of resource one could learn from?


It saddens me that smart people like Herb and Bjarne have good ideas on how to make C++ a much safer language, but the actual output of the standards committee is so far behind.

Herb mentions std::span as a safety improvement, but the [] operator doesn't do bounds checks and the .at() method isn't even there yet!

Shameless plug but I discuss this issue on my blog: https://btmc.substack.com/p/memory-unsafety-is-an-attitude-p...

Herb and Bjarne have the right attitude, sadly it doesn't seem enough C++ devs or people in the standards committee do. Same applies to C.


At $work, the standard solution to ASAN reporting use-after-free issues is to... not run ASAN builds. The fact that builds in CI exhibit random inexplicable crashes regularly every week doesn’t seem to make anyone have any second thoughts. A colleague once claimed that there is nothing wrong with out-of-bounds accesses as long as we don’t crash. The same bunch is also religiously opposed to using debuggers and regularly break debugger builds by removing debug information from builds at multiple levels, blocking ptrace debugging in system images through sysctl... This is all so toxic.


> A colleague once claimed that there is nothing wrong with out-of-bounds accesses as long as we don’t crash.

I need to find the source, but someone pointed out that the safety advantages of Rust are, in part, cultural and I increasingly agree. People use Rust because they care about memory safety and that care is reflected in the programs they write.


I keep getting code reviews with manual new/delete calls despite unique_ptr being 11 years old.

Weirdly, I see this most often from programmers who started college less than 11 years ago. Our academics are not helping.
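A minimal before/after of the pattern in question:

    #include <memory>

    struct Widget { /* ... */ };

    void manual() {
        Widget* w = new Widget;
        // any early return or exception between here and the delete leaks w
        delete w;
    }

    void raii() {
        auto w = std::make_unique<Widget>();
        // destroyed automatically on every exit path
    }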


The optimist in me would like to delude themselves into thinking that most of the people smart/experienced enough to make the jump from new/delete to unique_ptr realized this is closing a porthole on the Titanic, and made the jump to something that isn't C++.


I am not sure. C++ is a tool. I use what my company and companies in my domain use. I wouldn't mind using Rust, but there's just very little momentum. So meanwhile I do my best with what we have.

Personally, I care more about what I do than which tool I am using.


> I wouldn't mind using Rust

Rust is not the only thing that “isn’t C++”. Go is not appropriate for every domain either but you can bet your bottom dollar that it has taken market share from C++ - which I think the world is on the overall balance better off for, and I don’t particularly like Go as a language. Someone is making gobs of money off of OCaml.

> I care more about what I do than which tool I am using.

I don’t agree with the implication that these are independent factors.

And I do get it. The Rust ecosystem founders in many areas and the RIIR meme crew on forums is annoying. That doesn’t forgive the failings of the C++ ecosystem.


We have a lot of existing C++. I still cannot figure out how to mix it with anything else. Nothing else wants to deal with std::vector, for instance.


My personal opinion is that if companies would get real punishments for all of these goddamn security breaches, and the CTO's head was on the line, you'd see a shift in attitude real quick.


This in fact happened to Cloudflare with Cloudbleed and their decision was to switch to Rust.

There is a human factor at work here. Rationally speaking, Herb is right that a 98% reduction is sufficient. But when the CTO's head is on the line, they won't listen; they'll switch to Rust.


Not all pieces of software are created equal. A desktop CAD application that doesn't do any networking and doesn't manipulate sensitive user data isn't worthy of binary exploitation. If there is adequate security at the system OS layer, at worst it will corrupt a user's file.

Infrastructure network code that runs on millions of servers worldwide is a completely different story. Being able to send a sequence of bytes that unlocks funny side-effects of a network service can be weaponised on a mass scale.


> Not all pieces of software are created equal. A desktop CAD application that doesn't do any networking and doesn't manipulate sensitive user data isn't worthy of binary exploitation. If there is adequate security at the system OS layer, at worst it will corrupt a user's file.

That software is almost certainly running on a network-connected machine, though, and likely has email access etc. A spear-phishing attack with a CAD file that contains an RCE exploit would be an excellent way to compromise that user and machine, leading to attacks like industrial espionage, ransomware, etc.


If you've fallen victim to phishing you're hosed anyway as a malicious process can read and write to the address space of another process, see /proc/$pid/mem, WriteProcessMemory(), etc.


There's a spread of things that can happen in phishing; I would expect that it's a lot harder to get a user to run an actual executable outright than to open a "data" file that makes a trusted application become malicious.


In order to read or write /proc/pid/mem your process needs to be allowed to ptrace() the target process. You can’t do that for arbitrary processes. Similar story for WriteProcessMemory().


Above your security context, no, but you can definitely WriteProcessMemory any other process that is in your same security context or lower (something similar holds for ptrace, although remember that SUID/SGID binaries are not running at the same security context).


Those are increasingly rare. Nowadays you have all these apps requiring subscriptions and expecting users to login and what not.

But I agree it depends heavily on exactly what application are we talking about. Is it running on server? Definitely needs to be security conscious. Is it a library that might at some point be used by an application running on a server? Needs to be more hardened than a nuclear bunker.


They are all pieces to a puzzle. If you can add or modify a CAD file to a location a user of the desktop software will access, a defect in file parsing could provide user level remote access to you. Then if any other application or library has a privilege escalation, you have rooted the box. And even if there is no privilege escalation on that box, how many more CAD files can that user modify to spread the remote access?


You're assuming they have security breaches due to C++.

I'm betting here they don't; if they have security breaches it's due to '1q2w3e' being the password to their world-accessible PHP admin panel, and not because of C++ code.


Using C++ doesn’t mean you must have security issues. It means that you have to do more things right in your other work to avoid them, and we have several decades of experience showing that even very good teams struggle with that. The more separate concerns teams need to manage, the more likely it is that someone will make a mistake with one of them – and since time is finite, the attention spent on memory management is taking away time which could be spent on other classes of error.


For every 1 security breach due to C++ memory management, there are at least 100000 due to shitty PHP code that doesn't escape strings or uses plaintext passwords that never change. (This is a conservative estimate.)


Can you cite your sources on that analysis? Be sure to include the relative affected numbers so we don’t count an exploit in Chrome the same as a PHP exploit affecting a dozen people using someone’s obscure WordPress plugin.

Another way of thinking about this, why are all of the browser teams who have some of the best C++ developers in the world and significant resources adopting memory-safe languages? Nobody at that level is doing that because it’s cool, so there might be something to be learned from their reasoning.


> why are all of the browser teams who have some of the best C++ developers in the world and significant resources adopting memory-safe languages?

They aren't. Even Mozilla abandoned their Rust-in-Firefox project.


PHP (the language) has long since moved past awful practices like that, and we can definitely tell people to stop doing that and use the provided safe alternatives instead. In fact, the PHP docs do just that. PHP is no longer to blame here.

Also that number is greatly exaggerated. It's simply not true anymore, check the CVE website if you don't believe me.


Here is Dennis Ritchie's proposal for fat pointers in C.

https://www.bell-labs.com/usr/dmr/www/vararray.pdf

It is a culture thing: eventually the original authors no longer have the last word, if they let the community rule the language design and their voice is equally one vote.


> This is a version of a paper published in Journal of C Language Translation, vol 2 number 2, September 1990

This says it all really. Nothing more needs to be said. Unfortunate.


I haven't read the linked paper, but both CPU speed and RAM available have increased about 100x since 1990, and nobody then had uttered the words "threat model". Some approaches that are sensible now were reasonable to overlook in 1990 for being too heavy.


Check when Morris worm came out.

And by the way,

"A consequence of this principle is that every occurrence of every subscript of every subscripted variable was on every occasion checked at run time against both the upper and the lower declared bounds of the array. Many years later we asked our customers whether they wished us to provide an option to switch off these checks in the interests of efficiency on production runs. Unanimously, they urged us not to--they already knew how frequently subscript errors occur on production runs where failure to detect them could be disastrous. I note with fear and horror that even in 1980 language designers and users have not learned this lesson. In any respectable branch of engineering, failure to observe such elementary precautions would have long been against the law."

-- C.A.R Hoare's "The 1980 ACM Turing Award Lecture"


The Morris worm affected around 2000 VAX machines a couple of years previously, and was the first ever such incident on that scale. In other words, almost nobody in 1990 had been affected by a computer security incident. It didn't make sense in 1990 to prioritise this security threat over efficiency concerns.

Insisting on memory safety back then would be like insisting on code being accompanied by checkable formal proofs of correctness now: It's a technique that can be applied right now and that does improve safety, but it comes at such a cost that the tradeoff only makes sense for a handful of niche applications (aerospace, automotive, medical devices).


Yeah, that is why we didn't have to buy anti-virus software, duh.


Viruses in 1990 propagated by people running .EXE files they copied from somewhere, or booting floppy disks they found somewhere.

Tell me how bounds checks on array accesses would have prevented that.



> 01 JUN 2004

Got anything relevant?


Yes, but the same story kept repeating over the years. C89 had a good excuse. C99 was iffy with the VLA stuff instead of proper slices. What excuse did C11 have?


> but the actual output of the standards committee is so far behind.

That criticism misunderstands the actual way the C++ committee works. They're not a supreme legislative group that can dictate what a new C++ should be while everybody else is required to just obey. In contrast, Apple can dictate what the next version of Swift will do. Microsoft can do the same with C#.

Instead, what happens in C++ is that somebody or some company makes a proposal with some concrete working code for others to evaluate and vote "yes" on. So in reality one of the teams from MSVC, gcc, clang, Intel C++, etc. has to take the lead with a Safe C++ alternative that convinces everybody else to implement it.

To Herb Sutter's credit, he did make a "C++2" implementation: https://github.com/hsutter/cppfront

But his side project at Microsoft didn't gain traction with gcc, clang, etc and everybody else in the industry. So at this point, the C++ committee will be perceived as "so far behind" ... because there's nothing for them to vote on.

Similar situation happened with "breaking ABI compatibility". Google wants to break ABI but others didn't.


And that is why, while C++26 is being discussed, C++20 modules are still full of warts, working on Visual C++ (kind of) and not really anywhere else, as ISO C++ has long since stopped being about standardizing stuff with actual field experience.


>And that is why [...] C++20 modules are still full of warts, working on Visual C++ (kind of), and not really anywhere else, as ISO C++ has long since stopped being about standardizing stuff with actual field experience.

Yes, but your complaint about flawed C++ standards or incomplete implementations is orthogonal to what I was writing about.

I'm just explaining that the C++ committee doesn't have the power to impose changes that some people think it does. Basically, I'm saying a "the sky is blue" type of statement. The C++ committee is a reflection of what the members' compiler teams want to collectively do. The committee doesn't have unilateral supreme power to dictate standards to compiler teams not interested in them. Instead, they collect feedback from people sharing proposals in papers and put things to a vote. The compiler teams have the real power, not the committee. (What I've stated in this paragraph should be uncontroversial facts, but the downvoters disagree, so I'd love to hear them explain exactly what power and agency they think the C++ committee actually has.)

If one understands the above situation, then the following different situations shouldn't hold any mystery as to cause & effect:

- How did the std::chrono get added to the standard library? Because Howard Hinnant made a proposal with working code, and others liked it enough to vote it in.

- Why is there no _standard_ cross-platform GUI or audio library in C++? Why is there no standardized Safe Syntax C++ like Rust? Because nobody has made a proposal that convinced enough of the other members to agree to a cross-platform GUI or audio framework.

- Why does the C++ committee add features and prioritize things I don't care about? Because the <$feature$> you cared about wasn't proposed for them to discuss and convince others enough to vote it in.

But yes, I do understand the "warts" complaint you're talking about. It's frustrating that the std::regex with bad performance got approved. As a similar example, N. Josuttis has complained in multiple conference videos about the problems with ranges and views. He says it was wrong for the C++ committee to approve them. (footnote: The original author of the proposal tried to explain his technical rationale: https://old.reddit.com/r/cpp/comments/zq2xdi/are_there_likel... )

To reiterate, I'm not trying to explain away bad language standards. New features that will have flaws will continue to happen in the future whether it's created by a singular corporation like Apple(Swift) or a cooperative group like the C++ committee.

I'm just explaining why some "wishlist desirable C++ feature" isn't going to be in the standard if there's no proposal that convinces the other members to vote it in.

EDIT to reply: >When we complain about the "committee" [...] the things they choose to propose and vote for.

The C++ committee members are not static but the webpage has list of names : https://isocpp.org/wiki/faq/wg21

Clicking on various PnnnnR.pdf proposals that motivated each feature in the conformance table shows most authors are not from the actual committee members: https://en.cppreference.com/w/cpp/compiler_support

Using the above workflow to address your complaint about std::span and at(), I found this comment from the original author Tristan Brindle who proposed it and why he thinks the committee voted no:

2019-10-18T22:55:30z https://old.reddit.com/r/cpp/comments/djqdu2/why_is_stdspan_...


I can't edit anymore so sending a second reply.

That reddit link is actually showing the problem to be worse. It's not that someone forgot; it's that the committee are absolute goddamn clowns.

Incredible.

Since the committee is a reflection of the larger C++ community, it's not even a case of a few bad apples spoiling the bunch, it's more like there are a few really good apples that are being bombarded with fungal spores on a daily basis by the rest.

Their justification for not having .at() makes absolutely no sense! Contracts, had they made it in, would have been for fixing []. Since that didn't happen, .at() was pretty much mandatory to have (and the clowns are adding it in C++26).

Severe attitude problem.


When we complain about the "committee" we're not complaining about some amorphous entity but rather the people that make it up and the things they choose to propose and vote for.


Not a C+++?


Even if not mandated by the standard, concrete standard library implementations do provide bounds checking on span (and vector, and optional, etc.), but, even when meant for production use, it is disabled by default.

And I don't see a big push in the community to enable them. I think the committee is just an expression of the community on this front.


> Even if not mandated by the standard, concrete standard library implementations do provide bounds checking on span (and vector, and optional, etc.), but, even when meant for production use, it is disabled by default.

That's a choice though. You can enable these in your production builds if you want (with libstdc++ at least) and some Linux distributions have chosen to do just that.

The thing though is that these checks are NOT free and the overhead is not justified for all use cases so forcing them on everyone is not appropriate for C++.
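For example, with libstdc++ the lightweight hardening checks can be switched on per build via a macro (a sketch; the macro is libstdc++-specific):

    // g++ -O2 -D_GLIBCXX_ASSERTIONS main.cpp
    #include <vector>

    int main() {
        std::vector<int> v{1, 2, 3};
        return v[3]; // aborts under _GLIBCXX_ASSERTIONS instead of
                     // silently reading out of bounds
    }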


> The thing though is that these checks are NOT free and the overhead is not justified for all use cases so forcing them on everyone is not appropriate for C++.

Well, that's why they should be a flag. The question is whether it should be enabled by default or not.


It should be enabled by default, and if you want to index without bounds checking you should have to write something like a.unsafe_at(i)
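In sketch form (unsafe_at is a hypothetical name, not an existing API):

    #include <cstddef>
    #include <stdexcept>

    template <typename T>
    struct checked_view {
        T*          data;
        std::size_t size;

        T& operator[](std::size_t i) {  // safe by default
            if (i >= size) throw std::out_of_range("checked_view");
            return data[i];
        }
        T& unsafe_at(std::size_t i) {   // explicit, greppable opt-out
            return data[i];
        }
    };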


> Herb mentions std::span as a safety improvement, but the [] operator doesn't do bounds checks and the .at() method isn't even there yet!

You mean this implementation? https://en.cppreference.com/w/cpp/container/span/at

To quote: "Returns a reference to the element at specified location pos, with bounds checking.

If pos is not within the range of the span, an exception of type std::out_of_range is thrown."


You missed the Std column => (C++26)


> ...the actual output of the standard's committee is so far behind

The committee just publishes documents. It is actually far ahead of C++ implementations.

The committee would probably move faster if there were more attention (and funds and volunteer work) spent on advancing C++ implementations. This especially seems true for safety and security concerns as they tend to have more tangible problems to solve than the other kinds of standards proposals.


You're not far ahead if you're running in the wrong direction.

The standard is prioritizing the wrong things. It's normal that implementations are struggling when they need to implement something as complicated as C++ modules for example. There's no excuse for the .at() method being missing from std::span.

On the C side of things the problem is more egregious, it took over 30 years to standardize typeof after every compiler ever had already implemented something of the sort. GCC's __attribute__((cleanup)) should have definitely been standardized ages ago with so many libraries and codebases relying on it.

What does the C standard give us instead? _Generic. It's just silly at this point.


> There's no excuse for the .at() method being missing from std::span.

The issue is that there are two camps. One believes that precondition failures should not be recoverable and should abort the application and thus think that 'at' is an abomination. The other believes that throwing exceptions on the face of precondition failure is appropriate.

Hence what goes into the standard depends on how many people on each camp are present at each committee vote. This is also one of the reasons why the contracts proposal is not yet in the standard.

On a more practical note, .at does not help in any way to bounds-check the hundreds of billions of existing lines of C++.


std::span came out in C++20; by that logic it didn't help in any way either...

Personally, I think operator[] should abort by default, because otherwise it is redundant with .at().


Of course aborting in span::operator[] wouldn't be enough. But bound checking in operator[] for vector, deque, std::array and normal arrays would help (I think it is infeasible to do it for arbitrary pointers).


> (I think it is infeasible to do it for arbitrary pointers)

I think there's a viable path that could solve it well enough for a safe compiler mode to be feasibly mandated in secure coding standards.

Pointer values can come from one of several operations: pointer offsetting, int-to-pointer, address of a known object (including especially array-to-pointer decay), uninitialized values, function parameters, struct members, and globals. Safe bounds information is automatic for uninitialized values and addresses of known objects, and pointer offsetting can trivially propagate known bounds. If you had annotations ("the size of this array may be found in variable X"), you could usually get pretty reliable information for the last three categories.

The only truly difficult case is int-to-pointer, but from what I've seen in other contexts, it's likely that int-to-pointer in general is just an inherently cursed operation that auto-safe code just shouldn't support.
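Recent compilers have in fact started shipping this kind of annotation; for instance, Clang (and recent GCC) accept a counted_by attribute on C flexible array members, which gives -fsanitize=bounds the size information for the struct-member case (a sketch of the C idiom):

    #include <stddef.h>

    struct buffer {
        size_t len;
        // tells the compiler/sanitizer that data holds len elements
        int    data[] __attribute__((counted_by(len)));
    };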


Well, the point is safety for existing code. If you can annotate pointers to match them with their bounds, you can just as easily replace them with a span and avoid the need for compiler heroics.

Edit: unless you absolutely need the change to be ABI stable, but even then there are ways around that.


> Shameless plug but I discuss this issue on my blog:

> “First make them care, then make it easy for them to do the right thing.”

I would say first make it easy to do the right thing; making them care will be an Eternal September.


Sadly even if you make it easy to do the right thing, without the right attitude to match it matters little. Some people still concatenate unsanitized input with raw SQL strings despite the abundance of libraries that make creating safe queries easier.
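For instance, with SQLite's C API the safe form is barely more work than string concatenation (a sketch):

    #include <sqlite3.h>

    void find_user(sqlite3* db, const char* name) {
        sqlite3_stmt* stmt = nullptr;
        sqlite3_prepare_v2(db,
            "SELECT id FROM users WHERE name = ?1;", -1, &stmt, nullptr);
        // the input is bound as data, never spliced into the SQL text
        sqlite3_bind_text(stmt, 1, name, -1, SQLITE_TRANSIENT);
        while (sqlite3_step(stmt) == SQLITE_ROW) { /* ... */ }
        sqlite3_finalize(stmt);
    }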

At this point I think heads need to roll before people take the problem seriously.


I wasn’t clear: we do agree that attitude is the crucial piece, I just disagreed on the order in which it should be done.

I think that implementing in compilers the mechanisms for doing the right thing can be done first, and relatively quickly. Which now that I think more about it, would need the right attitude on the part of the compiler vendors and standards committee.


Being able to use foreach loops is a decent improvement.

And since you now have a proper access API, you could in theory enable bounds checks via a compiler flag, even for operator[].
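E.g., the foreach form sidesteps indexing entirely:

    #include <vector>

    int sum(const std::vector<int>& v) {
        int total = 0;
        for (int x : v) total += x; // no index, no bound to get wrong
        return total;
    }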


> not all code can be easily updated to conform to safety rules (e.g., it’s old and not understood, it belongs to a third party that won’t allow updates, it belongs to a shared project that won’t take upstream changes and can’t easily be forked).

I have to say, this sentence annoys the heck out of me.

Old code that can't be understood needs to be rewritten anyway. And since you're rewriting, you can apply the safety rules -- which would be better anyway.

If the code belongs to a third party and they don't want to update, it means the third party is playing against you -- or, worse, that they control how you're going to move forward. It's in your best interest, in this case, to rewrite the third party dependency as soon as possible, and since you're rewriting, you can apply the safety rules -- which would be better anyway.

If the code is shared and the upstream does not accept patches and -- worse -- doesn't support forks, then you have an issue with the upstream anyway and -- guess what -- you'd be better rewriting it anyway.

All those issues are managerial issues, not software issues. Management decided that it is better to be stuck than moving forward. And they can happen with any languages, not just C++.

Adding these things as "we can't be totally safe" is like saying "I can't jog every afternoon 'cause bears/alligators/wildlife may jump out and attack me". It's a pure excuse to NOT do things.


> I have to say, this sentence annoys the heck out of me.

> Old code that can't be understood needs to be rewritten anyway.

I don't think it's that old code can't be understood, you can always understand what code is doing mechanically.

It's a question of whether you can predict the consequences. For example, if I rename this database column, what in our systems that have been built over the last 30+ years will explode? That's data rather than code, but the underlying idea is the same.

What happens is the very act of rewriting it puts you at risk of adverse effects.


I just can't help but have the feeling that the ship is sinking and this article busies itself with arguing over how much spackle is sufficient to plug one of the many holes. There is too much UB allowed in the C++ spec to be worth saving at this point. Focusing only on CVE reduction still punts on all the money and time wasted with buggy software because the language cannot do much to prevent data races, for example.


There is a huge amount of large C++ code bases out in production that won’t be rewritten in a different language anytime soon. Improving C++ from within there is a worthwhile strategy.


A lot of these improvements sound like the code will need to be rewritten anyway. Granted, it might allow for a more gradual rewrite.


Gradual and targeted rewrites are at least an order of magnitude more affordable.


There isn’t too much UB for it to be fixed.

Fil-C fixes it (albeit just for C, for now). CHERI fixes it (and it works great for C++). There are other systems that fix it, too.


Unfortunately, Intel and AMD keep shooting themselves in the foot with regard to memory tagging implementations.


Yep. The parallel universe in which this could have been fixed in a sound way would have had C fixed first (bounded pointers, semantics of UB made similar to JVM, etc) and C++ would have adopted that.


C++ also brings the baggage of dogshit syntax and extremely verbose and confusing types and error messages, which new languages do not suffer from. I get why this guy, who has built his career on C++, is defensive of it, and certainly existing codebases cannot be migrated overnight or perhaps ever, but I would not start a new project in C++ if I could help it.


[flagged]


> You sound like you don't have much programming experience.

Based on aesthetic judgments? That makes so little sense that one can make certain guesses about the motivation for such a comeback—best left unsaid.


The cognitive burden of developing in C++ these days is excessive, but that appears impossible to recognize without completely stepping away from the language for a few years.

I loved the language; I even used it for web development for a couple of years and it was enjoyable to do so; but I did step away, and at this point I'd rather use Rust (a language I neither love nor find enjoyable).

Not sure the C++ committee have the capacity to "fix" this (whatever that may mean), nor whether the industry would let them, even if they were able to recognize the cognitive burden as a problem.


There is a certain masochism and macho attitude in C++. See the C++ frameworks of the 1990's like OWL, VCL and CSet++, where C++ was as expressive as you could expect from a .NET or Java framework a decade later, something that lives on only in C++ Builder and Qt/QtCreator.

Both are not really welcomed in most C++ circles, where naturally one codes close to C, with a thin abstraction layer wrapping OpenGL/Vulkan/DirectX in imGui for the ultimate performance, while the same authors use an Electron-based application to write their code.

This is what eventually made me move away to managed compiled languages. C++ isn't really the same as it used to be; my next favourite programming language is similar in spirit to Object Pascal, as in the last century of desktop GUI frameworks.


We're lucky that the LLM stuff came about after C++ was already in decline. Or could there be a resurgence of C++ if we can teach LLMs to tackle what humans can't?


Comparing CVEs is unfair; in the past there have been CVEs in Rust for things which in C++ were just marked "won't fix" (this one: https://blog.rust-lang.org/2022/01/20/cve-2022-21658.html )


Links to where C++ marked it "won't fix", please (hint: you won't find them; both libstdc++ and libc++ treated the issue with severity & promptness and resolved it quickly)


But still, there was no CVE filed, so comparing Rust to C++ on that front is apples-to-oranges. Rust has a lower bar for what can be considered a serious flaw.


The libstdc++ & libc++ bugs both just pointed at the Rust CVE as a "this impacts us"; it doesn't seem like anyone bothered to file a CVE because they were too busy just fixing the issue. I don't see any evidence of this being some sort of standard for CVEs, especially since C++ doesn't have a CVE committee at all.


Having recently had to pick up C++ again after over a decade in Ruby/Python/Clojure/Haskell/TypeScript land, it was enlightening to see how much the language is a window into a previous era of programming, full of boilerplate and footguns and dreams of programming patterns that ended up not panning out.

It feels like the programming community can do so much better, having seen the other side of it, but there's an immense amount of legacy code that's holding us back. Especially if you're going into more serious game dev territory, it seems practically unavoidable.


I think it is remarkable how important ABI is to C++. I think 'making it easier to enable them', i.e. enabling the sane defaults, went down the drain once they locked in on not breaking ABI.

I am sure that Herb wrote this with good intentions, but no concrete measures were proposed beyond trying to standardize compiler tools and, to some extent, downplaying Rust.

I agree there is a misconception around programming safety, but this reads the same as when big industry says it wants to focus more on climate change.


This post contains a number of statements that mislead the reader into believing that we are overreacting to memory safety bugs. For example:

> An oft-quoted number is that “70%” of programming language-caused CVEs (reported security vulnerabilities) in C and C++ code are due to language safety problems... That 70% is of the subset of security CVEs that can be addressed by programming language safety

I can’t figure out what the author means by “programming language-caused CVE”. No analysis defines a “programming language-caused CVE”. They just look at CVEs and look at CWEs. The author invented this term but did not define it.

I can’t figure out what the author means by “of the subset of security CVEs that can be addressed by programming language safety”. First, aren’t all CVEs security CVEs? Why qualify the statement? Second, the very post the author cites ([1]) states:

> If you have a very large (millions of lines of code) codebase, written in a memory-unsafe programming language (such as C or C++), you can expect at least 65% of your security vulnerabilities to be caused by memory unsafety.

The figure is unqualified. But the author adds multiple qualifications. Why?

[1] https://alexgaynor.net/2020/may/27/science-on-memory-unsafet...


After using modern C++ for more than 10 years, I don't believe the "safe subset" is sufficient to solve the security problems in C/C++. It does help, but not enough to keep pace with modern software complexity. Some localized anti-patterns can be prevented, but non-trivial memory bugs that cross team boundaries won't be.


Rust started at Mozilla. Their products were written in C++, and despite all the effort put into good practices, they kept running into the same issues over and over. They designed a new language that would prevent those issues, and that became Rust.

Personally, I think C++ has too much cognitive load. Over multiple decades, many features were added and very few were removed. The result of that process is having code that reads like mixing Latin with Ancient Egyptian hieroglyphs and then followed by TikTok jargon.


> Personally, I think C++ has too much cognitive load.

I also think the same thing about Rust.

RAII adds a humongous cognitive load to a language and I'm not really sure what you do about that.

Zig is a step in the right direction with a bunch of the sharp corners in C filed down--slices and default non-null pointers are a big improvement.

However, there are some things (like reference counted data structures) that are really annoying to implement in C/Zig that are really easy to implement in C++/Rust.


> RAII adds a humongous cognitive load to a language

How does it add cognitive load? Compared to manual resource management it certainly removes a lot of cognitive load. And what is the alternative?


The RAII from C++ is a much larger burden.


In what way is RAII a burden?

Genuinely curious, since I find it quite easy to work with.


When deinit occurs far away from init, RAII starts adding a lot of cognitive burden.

Once a thing "escapes" from where it was created/initialized, it suddenly has a life cycle. Everything in that thing shares in that life cycle. It can cross multiple threads during that lifecycle. When that thing completes its life cycle is unbounded. Consequently, you can run out of intermediate resources (memory, file descriptors, database transactions, etc.) even though you would have enough if they could be reclaimed right now but you can't prove that you can do so.

This is one of the reasons why Rust exists and why it defaults to move semantics. Everything that you need to deallocate a thing is present at all times--you own the thing. If you borrow the thing, you cannot deallocate it. Life is good.

Sorta ...

Sometimes somebody else owns and controls the thing. Sometimes you initialize once and then everything is read-only from that point forward--synchronizing on a single runtime event gate. Not everything is memory and has bounded reclamation time. Sometimes things want different allocators and allocation strategies--you will reclaim some things every 16ms and some things very rarely. Sometimes you want to allocate on one thread and deallocate on another.

A lot of these don't fit RAII all that well because there is a time dimension to them rather than just space.

But then you have things like "reference counting" that's just an absolute nightmare to do without some mechanism like RAII. It's easy to always increment/decrement the reference count, but your performance is terrible--you need compiler elision of those for performance. I tried implementing a reference counted Scheme in C and Zig, and I wanted to blow my brains out from hunting all the reference counting bugs while C++/Rust would have been a breeze. Perhaps I just made a terrible architecture. ¯\_(ツ)_/¯

I'm not sure what the solution should be.

Rust took a very hard line about not paying runtime costs for things you don't use--that means lots of compile time effort, debug runs that are glacially slow, and blocking certain standard idioms behind "unsafe".

In reality, I'm actually willing to pay more at runtime than I thought as long as my runtime is deterministic. I'm willing to pay a bit at runtime to get a compiler that's two orders of magnitude faster whose debug code is only 25-50% slower than standard. I'm willing to pay at runtime for null and bounds checks. I've got a zillion cores doing nothing--giving up 10% to get a nicer language is perfectly acceptable to me.


Normally you allocate on the stack, so the local function owns the object. You pass (unowned) references to any function you want to call. Those functions are not concerned about RAII for those references, since they don't own them.

If you want to pass ownership to a different function/thread, you move the object. It's the owner's responsibility to run all the destructors once the caller deletes the object, which RAII does for you. Granted, this can get non-deterministic with the reference-counted shared_ptr, since only the last owner of the reference will actually delete the object.
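A minimal sketch of that ownership transfer:

    #include <memory>
    #include <utility>

    struct Connection { /* ... */ };

    void consume(std::unique_ptr<Connection> c) {
        // consume now owns c; its destructor runs when c leaves scope
    }

    void caller() {
        auto c = std::make_unique<Connection>();
        consume(std::move(c)); // ownership moves; c is now empty
    }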

I actually really like RCU for shared access: https://en.m.wikipedia.org/wiki/Read-copy-update Deallocation is always done in a background thread and there is no synchronization when reading. I'm not sure what public libraries are available, unfortunately (I've only used our internal rcu library).

If you want to make efficient use of your zillion cores, you need to make sure single threaded performance is excellent ;) See Amdahl's law.

But I agree, development velocity is an important factor to consider when choosing languages.


> I actually really like RCU for shared access: https://en.m.wikipedia.org/wiki/Read-copy-update Deallocation is always done in a background thread and there is no synchronization when reading. I'm not sure what public libraries are available, unfortunately (I've only used our internal rcu library).

I do like "eventually consistent" data structures like this where you either see the old one or the new one consistently.

However, at that point, you've basically created garbage collection as you've lost your deterministic behavior. (It will get collected--sometime, maybe).

They mention the Linux kernel and reference counting specifically, so I'll look at this more in depth.

> If you want to make efficient use of your zillion cores, you need to make sure single threaded performance is excellent ;) See Amdahl's law.

Erm, that's NOT what Amdahl's law says.

"The overall performance improvement gained by optimizing a single part of a system is limited by the fraction of time that the improved part is actually used"

Cores mostly spend their time waiting--so improving single-core performance isn't a great benefit. In fact, going backwards to lots of simpler cores but lots more cache per core is probably a better bet.

Apple got this. They made the memory system a giant cache and got a huge performance boost.


Mozilla also has struggles staying competitive with its core products. The main criticism of Rust as a solution to safety concerns is about that kind of outcome.

Obviously Mozilla's challenges are multifaceted and don't boil down to user-invisible code rewrites. But, at best, switching to Rust was an orthogonal engineering concern that has opportunity costs.


Firefox is competitive; their struggle is primarily that their competition is Microsoft, Apple, and Google, who have practically endless wells of money, control over the computing platforms, and vastly more mindshare.

There's no technical solution to that problem.


Yeah, that's a valid take. But if that's true, then Rust isn't giving them an edge (yet?). So it isn't the most important engineering task.

If you're saying there's no technical path to success for Mozilla... well, that might be true, but I could see that being demotivating for Mozilla engineers.


It's interesting to compare these opinions with those of the Google Security team that were released a little over a week ago [1].

Key quotes from the Google paper:

> We see no realistic path for an evolution of C++ into a language with rigorous memory safety guarantees that include temporal safety.

> In our experience, it is not sufficient to merely make safe abstractions available to developers on an optional basis (e.g. suggested by a style guide) as too many unsafe constructs, and hence too much risk of bugs, tend to remain. Rather, to achieve a high degree of assurance that a codebase is free of vulnerabilities, we have found it necessary to adopt a model where unsafe constructs are used only by exception, enforced by the compiler.

It seems like Herb Sutter at least partially agrees with the second point in his TL;DR:

> I just want C++ to let me enforce our already-well-known safety rules and best practices by default, and make me opt out explicitly if that’s what I want.

[1]: https://security.googleblog.com/2024/03/secure-by-design-goo...


> Rather, to achieve a high degree of assurance that a codebase is free of vulnerabilities, we have found it necessary to adopt a model where unsafe constructs are used only by exception, enforced by the compiler.

Yeah, that sounds like most security researchers I have talked to or listened to. They see the problem as a pure safety-maximization problem, while in the real world there are 50 other constraints that a language design needs to navigate.


> We see no realistic path for an evolution of C++ into a language with rigorous memory safety guarantees that include temporal safety.

The point Herb was making is that "rigorous memory safety" isn't the only bar, nor should it be. Saying there is no way to make C++ have rigorous memory safety is not the same as saying C++ can never be made safe.


I have tremendous respect for Herb Sutter and I can understand where he is coming from, but things like

"All dereferences are null-checked. The compiler injects an automatic check on every expression of the form *p or p-> where p can be compared to nullptr to null-check all dereferences at the call site (similar to bounds checks above). When a violation happens, the action taken can be customized using a global null violation handler; some programs will want to terminate (the default), others will want to log-and-continue, throw an exception, integrate with a project-specific critical fault infrastructure."

make me just feel sad about the future of C++. The problem is not only CVEs; the problem is that the language makes doing the right thing hard and doing the wrong thing easy. All larger C++ projects I've worked on already use an enormously complicated and fragile combination of compiler-warnings-as-errors, static analyzers, and linters to fix these problems. Standardizing these is a worthwhile effort, both to reduce differences between compilers and tools and to establish a bare minimum standard everybody can agree on. The problem is that most projects won't adopt these until the 2030s or 2040s, and even then they will only set a bare minimum bar of safety and convenience, when new projects could just use Rust and get a very well-thought-out safety model and other benefits (no inheritance, const-by-default, etc.) at little extra cost.
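For concreteness, the null-check injection quoted above amounts to something like this hypothetical user-space version (the handler name is made up; the real thing would be compiler-injected, not hand-written):

```
#include <cstdio>
#include <cstdlib>

// Stand-in for the customizable global handler the proposal describes.
[[noreturn]] inline void null_violation_handler(const char* where) {
    std::fprintf(stderr, "null dereference at %s\n", where);
    std::abort();  // proposed default; a program could log-and-continue instead
}

template <typename T>
T& checked_deref(T* p, const char* where) {
    if (p == nullptr) null_violation_handler(where);  // the injected check
    return *p;  // what `*p` or `p->` would conceptually be rewritten into
}
```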

The type system and metaprogramming capabilities of C++ are really nice (when compared to Rust), but at this point the language is only relevant for the existing (thriving) codebases that use it and for historic reasons; no sane person would choose it for a new project anymore, which makes me sad. I don't see how C++ can reinvent itself and stay relevant without a radical break that takes the last 20 years of language development into account. Also, finally put some coroutine-compatible event loop in the standard library FFS, it's 2024.


“We must make our software infrastructure more secure against the rise in cyberattacks (such as on power grids, hospitals, and banks)”

How about not connecting your critical infrastructure directly to the Internet.


We have to work with the world that is, not the world that ought to be. Critical infrastructure is going to continue being accessible, so the people in charge of implementing those systems should have the appropriate tools to deal with the challenges it entails.


While other communities are already taking direct steps towards safety, the C++ community is still trying to define what safety means. I think it's funny and sad at the same time!

I didn't read the article (just browsed it), but here's the TL;DR from the article itself:

``` tl;dr: I don’t want C++ to limit what I can express efficiently. I just want C++ to let me enforce our already-well-known safety rules and best practices by default, and make me opt out explicitly if that’s what I want. Then I can still use fully modern C++… just nicer. ```

As is normal in C++, the defaults are wrong. Developers should "opt in" for unsafe instead of "opt out" of it!


> Developers should "opt in" for unsafe instead of "opt out" of it!

Why? C++'s guiding principle is zero-cost abstractions.


It's "zero cost abstractions over what you would write by hand". If you argue that anyone doing array access should be doing bounds checks when in doubt, a C++ compiler performing bounds checks would still be considered zero(additional)-cost.


Well, when you are not in doubt you don't want unnecessary bounds checks.


If you can communicate to a human that a bounds check isn't necessary, you can communicate it to a compiler.


I'm all for better tools to help the compiler figure things out. Here is an example where I can't communicate the invariants to the compiler:

```
std::vector<int> v;
...
v.push_back(2);
std::sort(v.begin(), v.end());
// no need to check i < size because we know we will find value 2 somewhere in v
for (int i = 0; i < v.size(); ++i) {
    if (v[i] == 2) return i;
}
```

Note that in C++ you can manually mark code after the loop as unreachable, which would indeed skip the size check. But that's as bad as not checking bounds in the first place.
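Concretely, the escape hatch just mentioned looks like this (C++23 std::unreachable; older compilers have __builtin_unreachable):

```
#include <cstddef>
#include <utility>  // std::unreachable (C++23)
#include <vector>

int find_two(const std::vector<int>& v) {
    // Unchecked precondition: v contains the value 2.
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (v[i] == 2) return static_cast<int>(i);
    }
    std::unreachable();  // UB if the precondition is ever violated -- this is
                         // what lets the optimizer drop the exit path, and why
                         // it's "as bad as not checking bounds".
}
```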


No...


> As is normal in C++, the defaults are wrong. Developers should "opt in" for unsafe instead of "opt out" of it!

Isn't this exactly what he is saying?


Safety as in no undefined behavior in a (safe) subset of the language, or safety as in memory safety? Because Rust tries to do more than memory safety, and it's quite amazing. Integer overflow isn't an unheard-of problem: the 787 had an integer-overflow problem in C about a decade ago, and the most recent drama with the entire FCU going dark seems eerily similar to what you might expect when a watchdog reboots it.


Another day, another essay from a C++ guy who is salty that the government called it insecure and thus seeks to compare the best possible version of C++ with his misunderstanding of what everyone else is doing, and doing a poor job at that too.

The stuff Herb mentions as solving "98% of the problems" does not do that. First of all, it doesn't do that now, because it isn't on by default. At least he accepts this, but people have been claiming that C++ is great now because all these things are easy to do, and the fact is that nobody actually does them. Every other language is shipping this as table stakes and C++ hasn't even gotten its pants on yet. Second, the surfaces that do actually care about this (think web browsers) are putting a huge amount of time and money into solving it regardless of what the language does by default. We're talking automatic initialization, bounds checks enabled by default, all sorts of smart pointers and "production" sanitizers. They are still being exploited. Yes, your average script kiddie isn't taking down Chrome these days. But nation states still do it. They don't care that you got to 98%! They find a bug, they exploit it.

The rest of the post is just the usual talking points against every other language that exists, and a lot of it is wrong. I'm not going to dwell on the whole "Rust takes CVEs that C++ doesn't" argument, because I think it misses the point (Rust has a different safety model, so problems under that safety model are just as much problems with Rust; you can't say "oh, we are safer by default" and also "oh, actually this makes things a lot harder for us, so don't judge us when things are broken").

However, a lot is just wrong. Languages that do not corrupt their VMs on races are better in many ways than C++, because in C++ anything of that sort is basically a direct path to code execution; in other languages you have to work for it. In Rust people use unsafe, but that doesn't mean it's automatically as unsafe as C++--it just means people write a little bit of code using it and then build on top of it. In garbage-collected languages the runtime gets upset if you hold on to references when it thinks you shouldn't, but that is a performance issue and in some cases a resource leak; it is not comparable to, say, a UAF in C++.

Basically, you can't just go "oh, every other language has [problems I looked up on Google], this means they are also bad", because in C++ the problems almost invariably lead to a foreign government taking control of your phone. This is not true for other systems.


So rewrite the browser in Rust!

Rust is 10 years old this year, yet there are no production-level browsers or operating systems written in it!

I’m talking about Google. They come up with article after article about the failings of C++, yet they prefer starting their own programming language, Carbon, and continue using C/C++ for Chrome, Zircon, and Android.


Google is investing in bringing up Rust in the latter. Carbon is by another team, of which there are many at the company.


> If there were 90-98% fewer C++ type/bounds/initialization/lifetime vulnerabilities we wouldn’t be having this discussion.

I disagree. We shouldn’t aim for 90-98%. We should aim for 100%.

Reason: unlike logic errors, memory safety bugs are exploitable more often than not. Logic errors are usually just some useless glitch and only sometimes exploitable.

> rust has 6 CVE’s

Couldn’t quickly check if any of those six were memory safety or not.

It’s sad that rust still has memory safety issues and if that’s true then my conclusion is that rust isn’t safe enough.


> It’s sad that rust still has memory safety issues and if that’s true then my conclusion is that rust isn’t safe enough.

Let's assume all the CVEs are memory safety issues (some are not, some are, in reality.) Even then, they're things like "a bug in the standard library" that was then fixed. All software has occasional bugs. It is not possible to never have bugs.


In workplace safety, at least in my country, there is a commonly accepted level to aim for: No accidents.

It makes sense in a way. If you don't aim for zero, what do you aim for? A few hands lost in a month? A few crushed fingers?

In this viewpoint I'd argue that aiming for a 98% reduction is a bit absurd: "Yeah we had an issue but it was the only one this month so it's within our budget."

Of course, goals are not the same as realised results. Getting a 98% reduction should be considered a huge, huge win. The next step should then be "what can we do to get rid of the rest?" Or at least the alternative seems odd: "We got this far, job done, nothing useful to do anymore."

Of course, the means will change. If C++ guarantees full memory safety next month, then the next steps after that won't be more memory safety but something else.


> there is a commonly accepted level to aim for: No accidents.

But you also balance that goal against practical reality. For example, you could end workplace accidents by outlawing work and having everyone starve to death, but that's not done because the costs are too high.


That's a really good way of stating what I believe. :-)

If C++ had memory safety next month then the next step would be to add stronger types. Once memory safety is a thing, you can start to add types that describe contracts and then you can ensure lots of interesting logical safety properties that aren't memory safety.
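A minimal sketch of what a contract-carrying type could look like (the name is made up for illustration):

```
#include <stdexcept>
#include <utility>
#include <vector>

// The invariant "contains at least one element" is checked once, at the
// boundary; every function that accepts a NonEmpty can then rely on it
// without further runtime checks.
template <typename T>
class NonEmpty {
    std::vector<T> v_;

public:
    explicit NonEmpty(std::vector<T> v) : v_(std::move(v)) {
        if (v_.empty()) throw std::invalid_argument("NonEmpty: no elements");
    }
    const T& front() const { return v_.front(); }  // safe: invariant holds
    const std::vector<T>& items() const { return v_; }
};

// e.g. double average(const NonEmpty<double>& xs);  // dividing by size() is safe
```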


I think "bug in stdlib" is forgivable, and I wouldn't fault Rust for it, exactly for the reason you say. It would be a great outcome if Rust's compiler and stdlib became a trusted compute base and folks had to be extra careful there.

It's not a great outcome if there are memory safety bugs arising from how some Rust programmer did some stuff.

So, I would revise my statement to: "It would be sad if Rust had memory safety issues that any user of the language could run into, and if that was true, then my conclusion would be that Rust isn't safe enough."


We C++ developers like that C++ is easy to fuck up, because we think we know better, and we think that means job security.


Why not just use Go?


Go is a great language, but there also needs to be a language that you can program in with zero overhead.


too slow...


Fast enough for the large majority of projects on the CNCF project landscape, where C and C++ are a minority.


Also Unsafe for those under the age of 18 /s


Replacing circular structures with IDs is memory-safe, but it can still blow up if there is no item with that ID. It may be safer, but it's not guaranteed correct or anything like that. In these cases I'm not sure what is gained by Rust over C++.


That depends entirely on the implementation and use case. An append-only arena, like those used in compilers, never has invalid IDs by construction: you get the ID back after inserting into the arena and have no way of removing the value after that. For a use case where arbitrary removal and editing of existing values is needed, generational arenas are used, so the handle encodes the "generation" of the value. If on access the generation doesn't match, it means the same as a null pointer and you won't get a value back (in a language with sum types, a None).

Disregarding entirely the memory-safety aspect of turning a pointer access into an index access, there are benefits to using arenas (or structs of arrays), like better cache locality on access, avoiding memory fragmentation, and bunching values that don't outlast each other into a single free operation.
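A minimal generational-arena sketch (illustrative, not any particular library):

```
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// A stale handle -- one whose generation no longer matches the slot's --
// behaves like a null pointer: lookup returns nothing.
struct Handle {
    std::uint32_t index;
    std::uint32_t generation;
};

template <typename T>
class Arena {
    struct Slot {
        std::optional<T> value;
        std::uint32_t generation = 0;
    };
    std::vector<Slot> slots_;

public:
    Handle insert(T v) {
        // For brevity this always appends; a real arena reuses freed slots,
        // which is exactly when the generation check earns its keep.
        Slot s;
        s.value = std::move(v);
        slots_.push_back(std::move(s));
        return {static_cast<std::uint32_t>(slots_.size() - 1),
                slots_.back().generation};
    }

    void remove(Handle h) {
        if (h.index < slots_.size() &&
            slots_[h.index].generation == h.generation) {
            slots_[h.index].value.reset();
            ++slots_[h.index].generation;  // invalidates outstanding handles
        }
    }

    T* get(Handle h) {
        if (h.index >= slots_.size()) return nullptr;
        Slot& s = slots_[h.index];
        if (s.generation != h.generation || !s.value) return nullptr;
        return &*s.value;
    }
};
```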


The... memory safety? It will reliably blow up; that's quite an improvement already, when the alternative may be remote code execution.

There's also of course everything else, like the ergonomics, freedom from data races and so on, but I'll skip that.


I'm actually surprised their example wasn't the accidental use of a valid ID rather than an invalid one. An invalid one is easily catchable, but a stale-yet-valid one is akin to a bad pointer. The resulting behavior will slip by the Rust compiler, as we're basically using integers as pointers, and you can get all sorts of shadow-clone versions of malignant pointer behavior, like use-after-free, in such a scheme.

That said … there are ways in Rust to do circular data structures that don't have the problems that using indexing into an array has, and are memory safe, and are within the safe subset of Rust. (E.g., Rc/Arc.)


Whether software blows up or not if an ID is "bad" depends on how it's written. Writing a token to a database without cleaning it up could lead to a security hole that has nothing to do with memory exploitation. I worry the fixation on memory is creating tunnel vision.


Herb Sutter makes that point, though it's somewhat hidden amongst all the apologetics... let's not forget there are security vulnerabilities that don't originate in memory-access violations. "Easy" config mistakes, bad secrets protection, and human-factor exposure are out there and don't depend on memory safety. Being humble about that would be only fair.


"Blowing up" immediately instead of corrupting random memory locations is pretty much what's desired though. Everything else is just icing on the cake.



