So I'll turn my attention toward addressing some of the crashing/reliability issues that I've been neglecting. If anyone is finding this a problem right now, it may be better in the next few days or weeks. Again, feel free to post an issue in the repository if you're running into something specific you want addressed.
You figured out all the letters of the acronym. I'm kinda impressed :) I should probably state it in the documentation, but I made the names kinda comically verbose and ugly to imply that, while these names will always be valid, presumably shorter nicer aliases will be more commonly used. Choosing nice names isn't really part of my skill set. I may be subconsciously hoping the user community will at some point come up with a set of nicer aliases that will just be adopted as de facto standard :)
"rsv" stands for "requires static verification" (to ensure safety). A lot of elements in the library get their safety enforcement via the type system and aren't really dependent on an analyzer like scpptool for their safety. The ones that are generally go in the "rsv" namespace.
I can't remember what "mse" stands for right now. I'm pretty sure the "s" stands for "safe" or "safety". At one point, and possibly still, the idea was to support entities maintaining their own version of the library if they wanted (to completely eliminate dependency risk). In that case one might want to change all the "mse" namespace references to their own namespace name. scpptool is coded to allow you to specify the base namespace of your version of the library.
> There are some other practical issues with the project, such as inconsistent licensing annotations
Any licensing issue is just me being careless/lazy I assume. I don't intend for there to be any restrictions. I think I just tried to make everything original to the project use the boost license, which seems totally permissive to me. If I need to do something to fix the licenses, let me know.
> the fact that it seems to depend on specific Clang versions (and thus will probably bitrot if it stops being maintained)
The only reason there is any dependency on the clang version is that the clang libraries keep making breaking changes in successive versions. It used to be really bad, but these days the breakages are generally minor. At this point, if someone doesn't want to wait for scpptool to be updated to a newer clang version, it's pretty much just a matter of doing a search-and-replace for the version number in the build script and makefile, and then fixing whatever annoying thing the new version broke. A lot of the time it's them slightly reorganizing the (large) set of clang libraries scpptool needs to build against (which requires modifying the makefile). One of the goals of this project is to be as small a dependency risk as possible. I think it compares favorably to some of the alternative solutions in that regard.
Yeah. The provided build script produces a DEBUG executable (only). The NDEBUG version isn't regularly tested yet. But performance isn't really an issue, so the NDEBUG version has sort of been de-prioritized in favor of feature completion. The fact that it crashes at all in any mode is concerning (and a little ironic for this tool), but the crashes are generally due to the use of the (libtooling) clang library. Its (not-super-well-documented) interface seems to be designed for maximum speed, not safety. And I've found that some of the perhaps lesser-used elements (including some used for static analysis) are just buggy, and the bugs sometimes change from version to version. So yeah, the tool is not yet in a well-tested or polished state. But I figure that even in a not-well-tested state it can only improve the safety of the code it analyzes/enforces. I mean, even if bugs or shortcomings in the tool prevent it from fully achieving the design goal of ensuring that the analyzed code is completely safe, it still shouldn't result in the analyzed code being any less safe than it would be without the tool, right?
p.s.: Despite my complaining about the clang library and trying to deflect blame onto it, it is of course an amazing library without which none of this would be practically doable.
p.p.s.: Please don't hesitate to post any problems you encounter as an issue in the repository.
p.p.p.s.: At this point, volume is low enough that any feedback at all is welcomed as an issue in the repository.
p.p.p.p.s.: I didn't post this item. I was rather surprised to stumble upon it this morning.
> I didn't post this item. I was rather surprised to stumble upon it this morning.
Hey, I posted this because I've been following your repo for quite a while. Together with Circle's borrow checker and Cpp2, I think this is a promising approach to making C++ safer, although I wish the committee would be more involved in these sorts of experiments (something that probably won't happen anytime soon).
Hey thanks for your support! Committee interest might be nice, but I think what would be most helpful at this point is just additional resources in general, whether as a result of committee interest or otherwise. :)
Here's a feature comparison table that would go on the scpptool retail packaging. (HN supports mono-space font right?):
  | feature                                     | scpptool | circle | cpp2            |
  |---------------------------------------------|----------|--------|-----------------|
  | addresses lifetime safety                   | Y        | Y      | N               |
  | addresses iterator safety                   | Y        | Y      | N               |
  | supports auto-conversion of legacy C++ code | Y        | N      | probably doable |
  | reasonable support for cyclic references    | Y        | N      | not safely      |
  | works with any C++ compiler                 | Y        | N      | Y               |
Sure, that's certainly what such a feature comparison table would look like: a list of what you claim is great about your product, with check marks next to it, while everybody else's product doesn't get as many check marks.
But of course the reason you see these "comparison tables" on such products is that that's all they functionally are, a list of what you, the product's creator, think is great about your product, not actually a meaningful comparison.
Hi pizlonator, I'm working on a solution with similar goals (I think), but a bit of a different approach. It's a tool that auto-translates[1] (reasonable) C code to a memory-safe subset of C++. The goal is to get it reliable enough that it can be simply inserted as an (optional) build step, so that the source code can be maintained in its original form.
I'm under the impression that you're more of a low-level/compiler person, but I suggest that a higher level language like (a memory-safe subset of) C++ actually makes for a more desirable "intermediate representation" language, as it's amenable to maintaining information about the "intent" of the code, which can be helpful for optimization. It also allows programmers to provide manually optimized memory-safe implementations for performance-critical parts of the code.
The memory-safe subset of C++ is somewhat analogous to Rust's in terms of performance and in that it depends on a non-trivial static checker, but it imposes less onerous restrictions than Rust on single-threaded code.
The auto-translation tool already does the non-trivial (optimization) task of determining whether or not each (raw) pointer is being used as an array iterator. But further work is needed to make the resulting code more performant. The task of optimizing a high-level "intermediate representation" language like (memory-safe) C++ is roughly analogous to optimizing lower-level IR languages, but the results should be more effective because you have more information about the original code, right?
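To make that concrete, here's a rough illustration of what the iterator-vs-non-iterator classification implies for the translated code. This isn't the tool's actual output; std::span is just a stand-in for whatever bounds-aware replacement type the real translation would use:

    // Illustrative only -- not the auto-translation tool's actual output.
    #include <span>

    // Original C: 'p' gets indexed/advanced, so it is classified as an array iterator.
    //   long sum(const int* p, size_t n) {
    //       long s = 0;
    //       for (size_t i = 0; i < n; ++i) { s += p[i]; }
    //       return s;
    //   }
    // Conceptual safe translation: the pointer gets paired with its bounds.
    long sum(std::span<const int> arr) {
        long s = 0;
        for (int v : arr) { s += v; }
        return s;
    }

    // Original C: 'obj' is only ever dereferenced, never advanced, so it is classified
    // as a single-object pointer, which can be replaced with a checked non-iterator
    // pointer type (or left as-is if its lifetime can be verified statically).
    void bump(int* obj) {
        if (obj) { *obj += 1; }
    }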
I think this project could greatly benefit from the kind of effort you've displayed in yours.
My plan for Fil-C is to introduce stricter types as an optionally available thing while preserving the property that it's fast to convert C code to Fil-C.
C++ is easiest to describe, at the guts, in terms of C-style reasoning about pointers. So, the easiest path to convincingly make C++ safe is to convincingly make C safe first, and then implement the C++ stuff around that. It works out that way in the guts of clang/llvm, since my missing C++ support is largely about (a) some missing jank and glue in the frontend that isn't even that interesting and (b) missing llvm IR ops in the FilPizlonatorPass.
> the easiest path to convincingly make C++ safe is to convincingly make C safe first
Yeah, with all the static analysis, I did end up straying from the easy path. Ugh :) But actually, one thing C++ provides that I found made things easier is destructors. I mean, I provide a couple of raw pointer replacement types that rely on the ("transparently wrapped") target objects checking, when they get destroyed, for any (replacement) pointers still targeting them.
As you indicated in another comment, you explicitly chose to expose/require zalloc() because you didn't want to make malloc() too "magical" (by hiding the indirect type deduction). In that vein, one maybe-nice thing about the "safe C++ subset" solution is that it exposes the entirety of the run-time safety mechanisms, in the sense that it's all in the library code and you can even step through it in the debugger. (It also gives you the option to catch any exceptions thrown by said safety mechanisms. You know, if exceptions are your thing. Otherwise you can provide your own custom "fault handling" code, if you want to log the error, dump the stack, or whatever.)
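For what it's worth, here's a minimal sketch of that destructor-based scheme (not the library's actual implementation; checked_target, checked_ptr and fault_handler are made-up names): the wrapped target object checks at destruction whether any replacement pointers still target it, and hands any violation to a user-replaceable fault handler.

    // Minimal sketch of the idea only -- not the library's actual implementation.
    // 'checked_target', 'checked_ptr' and 'fault_handler' are illustrative names.
    #include <cstdlib>
    #include <functional>
    #include <iostream>
    #include <utility>

    // User-replaceable fault handler: log the error, dump the stack, throw, whatever.
    inline std::function<void(const char*)>& fault_handler() {
        static std::function<void(const char*)> handler = [](const char* msg) {
            std::cerr << msg << '\n';
            std::abort();
        };
        return handler;
    }

    template <typename T> class checked_ptr;

    // "Transparently wrapped" target object.
    template <typename T>
    class checked_target {
    public:
        template <typename... Args>
        explicit checked_target(Args&&... args) : m_value(std::forward<Args>(args)...) {}
        ~checked_target() {
            // This is where C++ destructors help: if any checked_ptr still targets
            // this object, report the violation before the storage goes away.
            if (m_outstanding != 0) {
                fault_handler()("checked_target destroyed while still being pointed to");
            }
        }
        T& value() { return m_value; }
    private:
        friend class checked_ptr<T>;
        T m_value;
        int m_outstanding = 0;  // number of checked_ptrs currently targeting this object
    };

    // Raw-pointer replacement that registers/deregisters itself with its target.
    template <typename T>
    class checked_ptr {
    public:
        explicit checked_ptr(checked_target<T>& target) : m_target(&target) {
            ++m_target->m_outstanding;
        }
        checked_ptr(const checked_ptr&) = delete;  // non-copyable to keep the sketch small
        checked_ptr& operator=(const checked_ptr&) = delete;
        ~checked_ptr() { --m_target->m_outstanding; }
        T& operator*() const { return m_target->value(); }
    private:
        checked_target<T>* m_target;
    };

With something like this, a dangling (replacement) pointer surfaces as a deterministic fault at the target's destruction rather than as undefined behavior at some later dereference, and the whole mechanism is ordinary library code that you can step through in a debugger.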
> There's a ton of literature on ways to make C/C++ safe. I think that the only reason why that path isn't being explored more is that it's the "less fun" option - it doesn't involve blue sky thoughts about new hardware or new languages.
I can't think of any other reason that makes sense either. Anyway, the first thing is to dispel the notion that C and C++ cannot be safe, and it seems like your project is likely to be the first to demonstrate it on some staple C libraries. I'm looking forward to it.
Some solutions aren't that well publicized. Here is an example of an open source png encoder/decoder written in C (mostly) being auto-translated to a memory-safe subset of C++:
If you're planning that far ahead, it may not be an either-or situation. That is, in the future C/C++ may also have an enforced memory-safe subset. In which case the issue becomes: how do you write your code today so that it will conform to the safe subset? There are conformance tools in the works that can already give you a sense of the restrictions that will be imposed [1].
Please don't use promotional accounts on HN. We're here for curious conversation, not promotion. I appreciate that you've been posting these links in threads where they're mostly relevant, as opposed to blanket-spamming the site, but it's still not in the spirit of HN to have single-purpose accounts or to use this place primarily for promotion. It's ok to post your own work occasionally, as long as it's part of a diverse mix of posts that are motivated primarily by intellectual curiosity. That's the value we're optimizing for here.
I think the answer is yes. There's a complementary project [1] that aims to enforce a slightly more restricted memory safe subset of C++ than the lifetime profile checker does (or eventually will). The restrictions do not manifest as limitations on the code, but rather as extra run-time checks in some scenarios.
The C++ language maps one-to-one to the safe subset, so auto-conversion of (reasonable) existing C++ code to the safe subset should be a straightforward (if tedious) undertaking. The issue will be the performance of the converted code, which will depend on how pointers/references are used in the original code.
Pointers that are expected to never point to (in addition to never dereference to) a destroyed object can be converted to "safe" pointers with little run-time overhead [2]. Otherwise the pointer would need to be converted to one with more overhead [3]. Pointers that can be verified (by the static analyzer) to conform to "scope lifetime" rules (akin to Rust's) can remain zero-overhead pointers.
New code written in the safe subset would generally have performance in the ballpark of traditional C++ [4].
Unfortunately, documentation [1] is still kind of lacking. Like Rust (and arguably modern C++ conventions?), the safe subset doesn't support pointer arithmetic directly. You would have to convert the pointer to an iterator (and its target to an appropriate container).
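As a rough illustration (using a standard container and its iterators as stand-ins for the safe subset's actual replacement types):

    // Illustrative stand-in: std::vector and its iterators rather than the safe
    // subset's actual container/iterator replacement types.
    #include <vector>

    // Original C-style code using pointer arithmetic:
    //   int* p = buf;                          // buf points to an int array of length n
    //   while (p != buf + n) { *p = 0; ++p; }
    //
    // In the safe subset, the target becomes a container and the pointer becomes
    // an iterator over it (with bounds checking supplied by the replacement types):
    void zero_all(std::vector<int>& buf) {
        for (auto it = buf.begin(); it != buf.end(); ++it) {
            *it = 0;
        }
    }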
Auto-conversion of code that uses pointer arithmetic is challenging, but has been demonstrated to be solvable in the general case [2].
[1] https://github.com/duneroadrunner/scpptool