Rustacean Station had a great podcast episode with Quentin Ochem from AdaCore and Florian Gilcher from Ferrous Systems. They do a great job explaining what "safety-critical" means and the work that goes into validating software for such applications. I work in a slightly-less regulated field (medical devices) and found the discussion really interesting.
I don't see MISRA C as comparable to SPARK in this space. MISRA C is more like a style guide for writing less awful C.
There are huge swathes of MISRA which forbid things that not only aren't possible in Rust or SPARK, but would probably make a lot of real-world C programmers shake their heads in disbelief too. For example, MISRA tells you not to put switch labels within blocks started by another switch label. Most C programmers probably had no idea they could even do that, which is fine, because they shouldn't. In experiments these MISRA rules don't trigger - nobody is actually writing the sort of C that's forbidden.
There are also a bunch of MISRA rules which just seem like style preferences. In experiments these don't seem to contribute to improved software correctness: the "bugs" identified by these rules don't interfere with correct program behaviour. Enforcement of such rules tends to breed resentment from programmers.
MISRA isn't useless, but that's actually an indictment of our industry: this very low bar is still tougher than a lot of code people are actually using. It's like finding out that your country's fire regulations only require a nightclub with 500 drunk people inside to have one working fire exit... and then discovering that most clubs in the country don't meet even that inadequate standard anyway.
You might say that following it without understanding the purpose of it, or understanding it and looking at code that doesn't follow it, can both make you MISRAble.
I agree! I think this is an exciting opportunity to reinvigorate the safety-critical programming landscape. Adding a popular and modern language with a great development experience could make safety-critical programming much less of a slog.
This specification is written based on the current behavior of rustc. The page even says that the specification will be updated as rustc is updated:
> If there is a mismatch between what the FLS says and how the compiler behaves, the specification will be updated.
So, rustc is not written to this specification, but rather this specification is written to match rustc.
So if I am writing my own compiler against this specification, do I have to worry about the specification changing if a regression is introduced to rustc and the specification is updated to cover the regression?
Mostly I don't understand. I'm sure someone could explain this and it will make sense to me.
If some old IE version did things a certain way, even the most modern browser will want to do things in a similar way to remain compatible. Therefore, the standards bodies will try to reverse-engineer the existing behaviors and then create standards based on those. That way, modern code can simply follow the spec and remain compatible.
The HTML5 parsing algorithm is an example of this. Old browsers tried to "fix" broken HTML by guessing where things like missing closing tags were supposed to go. The HTML 4 specification never described this logic, yet it was there in the wild. The new HTML5 specification made a point of reverse-engineering the repair algorithms and actually documenting them, so now everyone can be compatible going forward, both with each other and with legacy. Just follow the spec.
The specification is versioned. So your compiler would need to be updated to work with the new specification, both with new features and with new bugfixes.
An analogy from the C++ world: In late 2016, there was an evolving C++17 specification, and the latest versions of gcc and clang were starting to implement C++17 behaviors. But you could read the C++14 specification to understand "this is what everyone agreed on in 2014", and invoke your compiler with the -std=c++14 flag to get that behavior. If you already had older software written against the C++11 spec, invoke your compiler with -std=c++11.
I'd just like to see the tooling and compilers improve for Ada. Alire is fantastic, but it's still a huge struggle to import C headers (gcc -fdump-ada-spec is the best thing so far) and to find the required linker flags for a library.
> One of the requirements for qualifying such a toolchain is to describe how the compiler should behave when compiling source code, which means a specification of the programming language.
Doesn't the reference implementation of the compiler already qualify as such a specification?
In my understanding, this alone isn't good enough. You need to:
1. Say what you expect an implementation to do (the spec)
2. Check that the tool (the implementation) does what you say in step 1
Part 2 on its own is not enough. The reference implementation is an implementation, but you still need an explicitly different reference to check said implementation against.
That it's a reference implementation means it's an appropriate thing to derive 1 from, which is of course what's happening here.
And rustc's massive test suite is very, very helpful in demonstrating 2.
Loosely speaking, yes -- and the specification is going to follow what the reference compiler does.
But the compiler mixes together spec-relevant details with implementation details (what bits go where). A written formal spec would only have the spec-relevant details.
Especially because the whole point is to qualify the compiler. You can't say "well, this code compiles under the reference compiler, but in production we compile it with this other compiler", because qualification is about qualifying the tools that are actually being used.
> Translation validation can be used even with a compiler that sometimes generates incorrect code, as long as this incorrect behaviour does not manifest itself for a given program. Depending on the input program, the translation validation can fail (because the generated code is wrong or the translation validation technique is too weak to show correctness). However, if translation validation succeeds, then the compiled program is guaranteed to be correct for all inputs.
I don't remember where I read this, but I think there are examples in practice where, instead of proving the correctness of the optimizations in an existing C compiler (a codebase that keeps evolving), correctness was ensured using program equivalence checking. So you'd "freeze" the reference compiler and the equivalence checker (which would be qualified), and pair that with an upstream compiler with sophisticated optimizations. As long as the equivalence checker keeps passing, you're golden.
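To make that workflow concrete, here's a toy Rust sketch of its shape (my own illustration, not any real qualification tooling): the two functions stand in for the output of the frozen reference compiler and of the optimizing compiler, and the "equivalence checker" is just brute-force comparison over a deliberately tiny input domain. Real translation validation reasons symbolically rather than exhaustively, but the accept/reject structure is the same.

```rust
/// Toy "equivalence checker": exhaustively compares two versions of a function
/// over every possible u8 input. Only feasible because the input domain is tiny;
/// real translation validation uses symbolic techniques instead.
fn validate_translation(reference: impl Fn(u8) -> u8, optimized: impl Fn(u8) -> u8) -> bool {
    (0..=u8::MAX).all(|x| reference(x) == optimized(x))
}

// "Reference" code: straightforward multiplication by 7.
fn mul7_reference(x: u8) -> u8 {
    x.wrapping_mul(7)
}

// "Optimized" code: the same value computed via a shift and subtract,
// the way an optimizer might rewrite it.
fn mul7_optimized(x: u8) -> u8 {
    x.wrapping_shl(3).wrapping_sub(x)
}

fn main() {
    if validate_translation(mul7_reference, mul7_optimized) {
        println!("translation validated: outputs agree on every input");
    } else {
        println!("validation failed: do not ship this build");
    }
}
```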
> I don't remember where I read this, but I think there are examples in practice where, instead of proving the correctness of the optimizations in an existing C compiler (a codebase that keeps evolving), correctness was ensured using program equivalence checking.
This is what CompCert does (or at least did when I last looked inside CompCert) for register allocation; the graph colouring algorithm is unverified, but a verified validator is used to check the colouring is correct.
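A very simplified sketch of that "untrusted solver, trusted checker" pattern (in Rust rather than CompCert's Coq, with names I've made up): the allocator can use any heuristic it likes, because its result is only accepted if the small checker signs off on it.

```rust
use std::collections::HashMap;

// Interference graph: each entry is a pair of virtual registers that are
// live at the same time and therefore must not share a machine register.
type Interference = Vec<(u32, u32)>;

/// Untrusted "solver": produces a colouring by some arbitrary heuristic.
/// This part can be as clever (and unverified) as you like, because its
/// output is checked afterwards.
fn colour_greedy(num_vregs: u32, edges: &Interference, num_colours: u32) -> HashMap<u32, u32> {
    let mut colouring = HashMap::new();
    for v in 0..num_vregs {
        let used: Vec<u32> = edges
            .iter()
            .filter_map(|&(a, b)| {
                if a == v { colouring.get(&b).copied() }
                else if b == v { colouring.get(&a).copied() }
                else { None }
            })
            .collect();
        // May produce a wrong answer (e.g. when spilling would be needed);
        // that's fine, because the checker below will reject it.
        let colour = (0..num_colours).find(|c| !used.contains(c)).unwrap_or(0);
        colouring.insert(v, colour);
    }
    colouring
}

/// Trusted checker: small and simple enough to verify (or at least review).
/// Accepts the colouring only if every node has a colour in range and no
/// interfering pair shares a colour.
fn check_colouring(num_vregs: u32, edges: &Interference, num_colours: u32, colouring: &HashMap<u32, u32>) -> bool {
    let all_coloured = (0..num_vregs).all(|v| colouring.get(&v).map_or(false, |&c| c < num_colours));
    let no_conflicts = edges.iter().all(|&(a, b)| colouring.get(&a) != colouring.get(&b));
    all_coloured && no_conflicts
}

fn main() {
    let edges: Interference = vec![(0, 1), (1, 2), (0, 2), (2, 3)];
    let colouring = colour_greedy(4, &edges, 3);
    // The allocator is untrusted: only use its answer if the checker accepts it.
    println!("checker accepts: {}", check_colouring(4, &edges, 3, &colouring));
}
```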
Whilst this is relatively straightforward for something like graph colouring, it is substantially more difficult for comparing the outputs of two different compilers, which I think is what you are suggesting here?
There was some work to do a translation validation for the entire RTL -> LTL pass of CompCert, but I'm not sure how much progress was made or if this is currently used.
I guess I see that as isomorphic; you're still qualifying the output, even if there's an intermediate step involved. The path towards qualifying the output of the sophisticated compiler may be more indirect, but you're still doing it.
You could produce final binaries for QA & shipping with the qualified one, and develop with something faster? Sort of analogous to 'optimisations' and debug symbols.
Sure, but given that the qualified one here still needs to be good enough to produce production ready code, you're not really saving yourself any work.
Especially given that many use cases for this sort of thing are embedded or close to it, and that Rust really relies on optimizations to produce good code, you need more than a bare-minimum "it works" compiler to make the whole thing go.
My team's embedded project sets opt-level 1 even in debug builds, because building with no optimizations at all is a non-starter flash-size-wise, and we're not even working with particularly small MCUs.
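For anyone curious, that's just a Cargo profile override, something along these lines (the exact level is obviously project-specific):

```toml
# Cargo.toml: keep the dev profile, but apply basic optimizations
# so the debug binary still fits in flash.
[profile.dev]
opt-level = 1
```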
Depends on the details of how you're formalizing your verification and what tools you have at your disposal. It could be immensely useful as a stand-in for a specification since it reduces a large class of verification to "does it do the same thing as the reference compiler."
E.g. it's a huge amount of work to exhaustively formally specify the Rust type system (as opposed to just specifying it in English). However, if you already have an assumed-correct but slow type checker, then your verification effort can boil down to just "does my compiler have a type error if [and only if, if you care about more than just safety properties] the reference compiler has a type error?" That's still an enormous amount of work to verify, but it is comparatively trivial to specify.
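As a toy illustration of that "iff" obligation (every name here is made up, not real tooling): run both type checkers over a corpus and require their accept/reject verdicts to agree in both directions.

```rust
/// Hypothetical stand-in for the slow, assumed-correct reference type checker.
fn reference_accepts(source: &str) -> bool {
    // Placeholder logic; imagine this shells out to the reference compiler.
    !source.contains("type_error")
}

/// Hypothetical stand-in for the compiler under verification.
fn candidate_accepts(source: &str) -> bool {
    // Placeholder logic; imagine this calls the new compiler's type checker.
    !source.contains("type_error")
}

fn main() {
    // A corpus of test programs; in reality this would be rustc's test suite
    // plus fuzzed inputs, not three hard-coded strings.
    let corpus = ["fn main() {}", "fn main() { type_error }", "struct S;"];

    for program in corpus {
        let reference = reference_accepts(program);
        let candidate = candidate_accepts(program);
        // The "if and only if" obligation: the verdicts must agree on every
        // program, in both the accept and the reject direction.
        assert_eq!(reference, candidate, "verdict mismatch on: {}", program);
    }
    println!("candidate agrees with reference on the whole corpus");
}
```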
From a QA standpoint, one could argue that a set of reference unit tests should qualify as a specification, but certainly not the implementation itself.
I'm personally pretty excited to see where this goes. It could be the best way for gccrs to version itself. There are a few aspects of the spec I'm immediately interested in:
1. Method resolution
2. Unstable features?
In particular, is it going to define lang items?
3. DSTs
Rust has strange things like:
```rust
let a: &str = "str";
let b: &str = &"str";
```
This is valid since str is a DST. Slices and dyn Trait are DSTs too, but there are stricter rules around them.
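For comparison, a quick illustration (mine, not anything from the spec) of how slices and trait objects behave: they are also DSTs, but you only ever meet them behind a pointer via an unsizing coercion.

```rust
use std::fmt::Debug;

fn main() {
    let s: &[i32] = &[1, 2, 3]; // &[i32; 3] unsizes to &[i32]
    let d: &dyn Debug = &42;    // &i32 unsizes to &dyn Debug

    // There's no literal form for a bare `[i32]` or `dyn Debug` value;
    // the following would not compile because the type is unsized:
    // let bad: [i32] = [1, 2, 3][..];

    println!("{s:?} {d:?}");
}
```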
https://rustacean-station.org/episode/067-quentin-ochem-flor...