Formal Mechanised Semantics of CHERI C: Capabilities, Undefined Behaviour

snvzz · 2024-04-18T22:47:02

>CHERI was first prototyped as extensions of MIPS and RISC-V; it is currently being evaluated by Arm...

Note there's also a RISC-V effort[0][1] to write an extension specification proper.

0. https://jira.riscv.org/browse/RVS-2141?src=confmacro

1. https://wiki.riscv.org/display/HOME/RISC-V+Specification+Sta...

reply

taffronaut · 2024-04-19T08:53:49

Here's a link to the Arm program: https://www.morello-project.org/

nickpsecurity · 2024-04-18T23:25:35

Yeah, it was built on BERI:

https://www.cl.cam.ac.uk/research/security/ctsrd/beri/

reply

teleforce · 2024-04-19T01:38:23

Does this CHERI platform enable memory safety for programming languages and all the facilities provided by borrow checking language like Rust is not necessary?

CJefferson · 2024-04-19T06:30:11

No, there are actually quite a few things CHERI does that aren't the borrow checker's job, and some things the borrow checker does that CHERI doesn't help with.

In C and C++, if you write x[4] for an array with only 2 members, you get undefined behaviour. In Rust this causes a 'panic', but that isn't a 'borrow checker' thing, that's because in Rust all array accesses are bounds checked by default.

You can do un-bounds checked array accesses in Rust, you can do bounds-checked accesses in C++ (for std::vector at least), it's just that the languages chose different defaults.

One thing the borrow checker does is help write parallel code by stopping multiple threads writing to the same memory at the same time. CHERI doesn't stop that at all.

reply

nextaccountic · 2024-04-19T02:18:55

No. CHERI doesn't help with data races, for example. Also it doesn't help with compiler optimizations that exploit UB to make code faster.

Unfortunately low level languages are kind of addicted to optimizations that only work for languages with UB. There's too much performance being left at the table if you create a programming language entirely without UB.

There's no law of nature, however, that says this must be the case forever. You could have a theorem prover prove useful properties about programs (and them feed them to the optimizer) and refrain from "optimizing" if you can't 100% prove that this will not break.

reply

touisteur · 2024-04-19T11:15:02

This is one actual use of SPARK for me. Prove the absence of runtime errors (and if necessary add preconditions) then let the compile optimize to hell.

Another example on tweetnacl: https://blog.adacore.com/sparknacl-two-years-of-optimizing-c...

reply

Peter_Sewell · 2024-04-19T05:39:09

CHERI enables a rather high level of memory safety. It uses dynamic hardware checks rather than the safe-Rust static checks, which means that existing C/C++ code can often be ported to CHERI C/C++ with minor changes. Of course, as others note, "memory safety" is not a simple single thing, and there are certainly some cases that CHERI C/C++ don't depend on (as noted in this paper by Vadim Zaliva and others in our group). But in examples like this one from another comment:

    struct buffer {
          char *data;
          size_t capacity;
          size_t length;
        }

the pointer 'data' will in CHERI have to be a valid capability, not just a virtual address, to permit access. It should normally have been instantiated with the correct bounds from the appropriate allocation, separately from the 'length' field, so the hardware will do the right check.

PhilipRoman · 2024-04-19T05:13:19

EDIT: Apparently CHERI can handle this scenario, please ignore the rest of the comment

Memory safety is not a binary choice. Consider this typical structure:

    struct buffer {
      char *data;
      size_t capacity;
      size_t length;
    }

In fact it doesn't even have to be C, you could write it in Java, Python, etc. Lets say you decide to reuse such a buffer multiple times. From the C semantics point of view, there is nothing wrong with accessing elements past length. But it can easily lead to a heartbleed style vulnerability. There is probably some escape hatch for CHERI to mark memory as uninitialized, but I'm not very familiar with it. In Rust you can probably give out slices which will ensure bounds checking (or eliminate it if possible). The only correct way towards safety is languages which enable the programmer to add additional restrictions, like Rust or Frama-C.

Anyway, this is mostly just theoretical musing, in practice CHERI will catch lots of issues. But I don't think it is correct to call it "memory safety".

reply

vzaliva · 2024-04-19T06:00:35

In CHERI, the `data` pointer in your example is not merely a 64-bit pointer (assuming ARM Morello architecture) but a special 128-bit data structure called a `capability`, which encodes within it the bounds of the memory block allocated to it. The hardware will prevent any out-of-bounds access, even if you manage to bypass any C language type checks, for example, by using inline assembly. So, this offers a pretty strong level of memory protection.

PhilipRoman · 2024-04-19T06:11:58

Yes, but the capability will be enforcing the bound w.r.t "capacity", not "length" as that is the allocated size. From CHERI's point of view "length" is just some random user defined variable with no effect on bounds checks.

Peter_Sewell · 2024-04-19T06:24:28

If you want to narrow the bounds of `data` on reuse, you can do easily do that in CHERI C (you'd need to also keep the larger capability somewhere to rederive the `data` capability from, either in this struct or elsewhere, depending where the allocation come from).

PhilipRoman · 2024-04-19T06:57:03

Oh that is great news, I was not aware that CHERI was this powerful. Interesting if this could be abused by JIT compilers to provide free runtime bounds checks, kind of like the free null pointer checks done by catching SEGV.

Vecr · 2024-04-19T02:06:14

Not in the near future, because compilers will still emit weird code in the presence of UB. The hardware enforces more stuff, and it makes it more likely you'll get a crash instead of the program continuing in a bad state, but it absolutely does not eliminate the general UB or "implementation defined" problems.

kryptiskt · 2024-04-19T11:56:12

It's different. Borrow checking is statically enforced at compile time. CHERI applies at runtime, so it can't provide peace of mind that the code is safe, just that it will fail if it tries to do unsafe stuff.

saagarjha · 2024-04-19T07:22:55

No. CHERI does not have a straightforward way of protecting against UAF.

kwitaszczyk · 2024-04-19T10:41:32

The CHERI ISA does not provide temporal safety features but it allows building temporal safety mechanisms in software. CheriBSD 23.11 includes user-space heap temporal memory safety that is enabled by default [0]. The paper "Cornucopia Reloaded: Load Barriers for CHERI Heap Temporal Safety" at ASPLOS 2024 will describe the current implementation [1].

For embedded systems, the CHERIoT [2] platform implements hardware-assisted temporal memory safety without MMU.

[0] https://ctsrd-cheri.github.io/cheribsd-getting-started/featu...

[1] https://www.asplos-conference.org/asplos2024/main-program/ab...

[2] https://cheriot.org/papers/2023-micro-cheriot-uarch.pdf

reply

Peter_Sewell · 2024-04-19T09:48:09

There's increasingly mature work on CHERI temporal memory safety, eg https://www.cl.cam.ac.uk/research/security/ctsrd/pdfs/2020oa... and more recent versions.