
> You can sort of think of MIR as "core Rust", in that it's the final, desugared form of everything.

Though I don't want anyone to get the impression that MIR is a source-compatible subset of Rust; it's a pretty different thing in its own right. (Worth clarifying because one could imagine a "fully-desugared" maximally-explicit subset of Rust, where e.g. all method calls are maximally disambiguated via UFCS, all types are explicitly annotated, no lifetimes are elided, all macros are expanded, etc.)
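To make the "maximally-explicit subset" idea concrete, here is a rough sketch of the same function written twice: once the way people normally write it, and once with the lifetime spelled out, every local annotated, and every call written as a fully qualified path. The function names are made up for illustration, and this is only my guess at what such a subset might look like, not anything rustc emits.

```rust
// Everyday form: elided lifetime, inferred types, method-call syntax.
fn first_word(s: &str) -> &str {
    s.split(' ').next().unwrap_or("")
}

// Hypothetical "maximally explicit" form of the same function:
// no elided lifetimes, no inferred local types, no method-call sugar.
fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    let mut iter: ::std::str::Split<'a, char> = <str>::split::<char>(s, ' ');
    let first: ::std::option::Option<&'a str> =
        <::std::str::Split<'a, char> as ::std::iter::Iterator>::next(&mut iter);
    ::std::option::Option::<&'a str>::unwrap_or(first, "")
}

fn main() {
    // Both forms should agree on every input.
    println!("{}", first_word("hello world"));
    println!("{}", first_word_explicit("hello world"));
}
```

Both versions are still ordinary Rust source that rustc accepts, which is the difference from MIR.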




That is what I imagined when the Karger/Thompson attack came up with regard to Rust. The simplest cheat would be mapping low-level Rust to a safe subset of C: convert the Rust compiler's source, automatically or by hand, then run that through CompCert. Csmith-style generated programs run through both versions of the Rust compiler might also catch errors in one or both. One might also use the C tooling to find errors in the Rust compiler or apps. And so on.

The first necessary step would be converting the Rust to its lowest-level form. That sounds like the "fully-desugared" form you describe.


I admit that I'm quite curious about determining exactly what features could be left out of maximally-explicit Rust, because it would determine the obvious MVP for an alternative compiler, one that could technically compile all Rust code with the caveat that you would need to first losslessly and automatically transform the source (which theoretically shouldn't be too hard to add to the Rust compiler, e.g. it already has the capability to print Rust source post-macro expansion). Without macros, you'd need neither an implementation of pattern macros nor syntax extensions; without inferred types, you'd need neither a trait resolution engine nor anything of Hindley-Milner; with every identifier fully qualified you wouldn't need any name resolution rules... and we've already established that a borrow checker is unnecessary to implement if your goal is to merely compile Rust code that has already been typechecked. All that together means that you could have a legitimately useful alternative backend with so, so much less work on behalf of the implementor!


This is why I wish that there was a higher-level, maintained API for the compiler internals, to make it easier to do these kinds of experiments.


This is kind of what mrustc does? It's a Rust-to-C compiler in C++; one which assumes the Rust program is correct (i.e. passes type/borrow check).

But you still need to resolve types to do dispatch, so that's a nontrivial amount of work.

mrustc can currently compile rustc, but the produced rustc doesn't pass the entire rustc testsuite (yet).

A colleague of mine was considering writing a Rust-to-C++ compiler that was similar, but offloading most of the dispatch/resolution work onto C++. This is actually possible, you can turn method dispatch and autoderef into template resolution. You can do stuff to fake type inference too if you know the program is correct already. This is much harder however and I'm not yet sure if it's 100% possible without doing some typechecking in the Rust-to-C++ compiler itself.

You could however take rustc --unpretty=typed (or whatever that option is these days) output and transform that really easily.

This requires you to be able to independently verify that the two ASTs are equal if you resugar, because you can't trust rustc's output for this. In fact, the poc trusting trust attack I wrote[1] would still go under the radar here.

Verifying ASTs as semantically equal is much less work. But there's a loophole here: it is possible to add type annotations to type-inferred Rust and get different behavior. This is because (among other things) integers are inferred more loosely (an integer type that can't otherwise be inferred defaults to i32). It's possible that a trusting trust attack would be able to propagate itself merely by flipping around the results of inference.
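A small sketch of that loophole, assuming nothing beyond the standard library: an integer literal with no other constraints falls back to i32, so a "desugaring" step that writes in a different annotation changes what the program observes. The function names here are made up for illustration.

```rust
use std::mem::size_of_val;

// No annotation and no other constraints: the literal falls back to i32.
fn inferred_width() -> usize {
    let x = 0;
    size_of_val(&x)
}

// The same code after a (possibly malicious) pass adds an annotation
// that differs from what inference would have chosen.
fn annotated_width() -> usize {
    let x: u64 = 0;
    size_of_val(&x)
}

fn main() {
    println!("inferred:  {} bytes", inferred_width());
    println!("annotated: {} bytes", annotated_width());
}
```

So a checker comparing the sugared and annotated sources has to know the fallback rules itself, rather than accept whatever annotation it is handed.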

Even if we looked out for that, you still have the problem of dispatch, where a backdoored rustc could change the method being dispatched to be a different trait. Now this isn't something that would work if you assume that the original code was code which compiled fine with rustc (because rustc complains when there's unqualified ambiguity). But we can't actually trust the original compiler to have handled this correctly either!
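As a rough illustration of the dispatch problem, with hypothetical trait and type names: when two traits in scope provide a method of the same name for one type, the plain method call is rejected as ambiguous, and the fully qualified form is where a specific trait gets picked. A desugaring pass that picked the wrong one would still produce perfectly valid-looking explicit Rust.

```rust
// Hypothetical traits that happen to share a method name.
trait Audit {
    fn check(&self) -> &'static str;
}

trait Bypass {
    fn check(&self) -> &'static str;
}

struct Request;

impl Audit for Request {
    fn check(&self) -> &'static str {
        "audited"
    }
}

impl Bypass for Request {
    fn check(&self) -> &'static str {
        "bypassed"
    }
}

// With both traits in scope, `r.check()` does not compile: rustc reports
// multiple applicable items. The explicit forms below each name one trait.
fn via_audit(r: &Request) -> &'static str {
    <Request as Audit>::check(r)
}

fn via_bypass(r: &Request) -> &'static str {
    <Request as Bypass>::check(r)
}

fn main() {
    let r = Request;
    println!("{}", via_audit(&r));
    println!("{}", via_bypass(&r));
}
```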

In both cases there would need to be traces of weird code in rustc for this to work, but it might be possible to hide this.

(This is also somewhat of a problem for mrustc, but to a much lesser degree.)

[1]: http://manishearth.github.io/blog/2016/12/02/reflections-on-...



