What Is Rust's Unsafe? (2019) (nora.codes)
140 points by luu on April 10, 2022 | 96 comments



Crucially, unsafe is also about the social contract. Rust's compiler can't tell whether you wrote a safety rationale adequately explaining why this use of unsafe was appropriate, only other members of the community can decide that. Rust's compiler doesn't prefer an equally fast safe way to do a thing over the unsafe way, but the community does. You could imagine a language with exactly the same technical features but a different community, where unsafe use is rife and fewer of the benefits accrue.

One use of "unsafe" that was not mentioned by Nora but is important for the embedded community in particular is the use of "unsafe" to flag things which from the point of view of Rust itself are fine, but are dangerous enough to be worth having a human programmer directed away from them unless they know what they're doing. From Rust's point of view, "HellPortal::unchecked_open()" is safe, it's thread-safe, it's memory safe... but it will summon demons that could destroy mankind if the warding field isn't up, so, that might need an "unsafe" marker and we can write a checked version which verifies that the warding is up and the stand-by wizard is available to close the portal before we actually open it, the checked one will be safe.


The social contract is manageable precisely because the unsafe subset of typical Rust codebases is a tiny fraction of the code. This is also why I'm wary about expanding the use of `unsafe` beyond things that are actually UB. Something that's "merely" security sensitive or tricky should use Rust's existing features (modules, custom types etc.) to ensure that any use of the raw feature is flagged by the compiler, while sticking to actual Safe Rust and avoiding the `unsafe` keyword altogether.


I agree with this mindset a lot. One example I like about how to deal with exposing a "memory and type safe but still potentially dangerous" API is how rustls supports allowing custom verification for certificates. By default, no such API exists, and server certificates will be verified by the client when connecting, with an error being returned if the validation fails. However, they expose an optional feature for the crate called "dangerous_configuration" which allows writing custom code that inspects a certificate and determines for itself whether or not the certificate is valid. This is useful because often you might want to test something locally or in a trusted environment with a self-signed certificate but not want to actually deploy code that would potentially allow an untrusted certificate to be accepted.
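The general shape of that opt-in (a rough sketch of the feature-gating pattern, not rustls's actual types or function names) looks something like this: the risky hook only exists when the downstream crate explicitly enables the feature in its Cargo.toml:

    // Cargo.toml of the consumer (hypothetical crate name):
    //     my_tls = { version = "0.1", features = ["dangerous_configuration"] }

    /// Only compiled when the consumer opts in to the "dangerous_configuration" feature.
    #[cfg(feature = "dangerous_configuration")]
    pub trait DangerousCertVerifier {
        /// Return true to accept the DER-encoded certificate, bypassing the
        /// default validation entirely.
        fn verify(&self, der_encoded_cert: &[u8]) -> bool;
    }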


When I read posts like yours, I wish unsafe had a different name like "human invariant" or whatever.

Something that would make it harder to water down the meaning of the clearly defined unsafe keyword to suddenly mean something else.

Using the unsafe keyword to mark a function as "potentially dangerous" is just wrong.

Just prefix your functions with something like "dangerous_call", but don't misuse unsafe!


This is a complete misuse of the unsafe as language concept in high integrity computing.


This person isn't wrong. A lot of serious Rust users don't agree with what the GP is suggesting. `unsafe` has an explicit meaning: the user must uphold some invariant or check something about the environment, otherwise it is memory unsafe.

I have several times in code review prevented people from marking safe interfaces as "unsafe" because they are "special and concerning"; overloading the usage of unsafe is itself dangerous.


True, but I think allowing potentially UB-invoking code to not use "unsafe" (e.g. because the use is in the context of FFI, so the unsafety is thought to be "obvious" and not worth marking as such) might be even less advisable. This makes it harder to ensure the "social rule" mentioned by GP, that every potential UB should be endowed with a "Safety" annotation describing the conditions for it to be safe.


Your comment gave me an idea for a lint that might help prevent those mistakes. Right now rustc flags `unsafe {}` with an "unused_unsafe" warning. However it doesn't warn for `unsafe fn foo() {}`. Maybe it should.
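For reference, this is the asymmetry being described (today's rustc warns on the first but not the second):

    fn block_example() {
        // warning: unnecessary `unsafe` block (the `unused_unsafe` lint)
        unsafe {
            let _x = 1 + 1;
        }
    }

    // No warning today, even though the body performs no unsafe operations.
    unsafe fn foo() {}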


I think as described you would get a reasonably high number of false positives, because `unsafe fn foo() { body that performs no unsafe operations }` can still be unsafe to call if it interacts with private fields on data structures that are also used by safe-to-call functions that do perform unsafe operations.

For an example, consider Vec::set_len in the standard library, which only contains safe code but lets you access uninitialized memory, or memory beyond your allocation, by modifying the length field of the vector: https://doc.rust-lang.org/src/alloc/vec/mod.rs.html#1264
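A short sketch of how that plays out: the `set_len` call itself is just a field write, but it lets later, entirely safe code read uninitialized memory:

    let mut v: Vec<u8> = Vec::with_capacity(16);
    unsafe {
        // Compiles down to a plain field write (`self.len = new_len`), but it
        // asserts that the first 16 elements are initialized; they are not.
        v.set_len(16);
    }
    // Safe-looking code now reads uninitialized memory: undefined behavior.
    let _byte = v[0];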

You might be able to fix this with a lint that looked at a bit more context though, `unsafe fn foo()` in a module (or even crate) with no actually unsafe operations is very likely wrong. Likewise `unsafe fn foo()` which performs no unsafe operations and only accesses fields, statics, functions, and methods that are public.


The Rust Vec type has an unsafe function called `set_len` that changes the length of the vector without checking whether the new length is in bounds, or whether the memory containing any new values is initialized. The body of the function is the following:

    self.len = new_len;
No unsafe operations in sight. Should the compiler emit a warning here?


This made me think about what you could potentially do to get a similar sort of thing using the type system instead of the built-in effect of unsafe.

Seems like you could create a sort of userland-unsafe by using a closed trait[1] and requiring its use on a 'dangerous' method or struct or whatever:

   mod my_unsafe {
      pub trait MyUnsafe {}
   }
   pub struct MyUnsafe;
   impl my_unsafe::MyUnsafe for MyUnsafe {}

   pub fn dangerous_function<U: my_unsafe::MyUnsafe>() {}

   fn main() {
     dangerous_function::<MyUnsafe>()
   }
Obviously this doesn't let you cover a whole block the way `unsafe {}` does, so you have to repeat the marker, but it also doesn't leave you much room to do the dangerous things without the shrinkwrap agreement (and the turbofish is ugly enough, at least for me, to be a deterrent on its own).

That said I find the named use-case kind of weird. The whole point of the library is to do these unsafe things, so it's kind of silly to be like "don't forget it's dangerous!"


Oh that's just the suggestion I mentioned here https://news.ycombinator.com/item?id=31009572 (I saw it on some internals.rust-lang.org thread), but with no new syntax. It also mirrors the usage of existing beyond-UB safety markers like UnwindSafe https://doc.rust-lang.org/std/panic/trait.UnwindSafe.html but with inverted polarity (instead of asserting something is safe, it asserts that the programmer acknowledges the new kind of unsafety)

I think an interesting part of unsafe Rust is the interplay between positive and negative polarities (unsafe/safe fn vs unsafe blocks, basically), and adding new syntax is the way to leverage this kind of idiom for the new kind of unsafe.


Real formal verification is clearly a step up from Rust's meaning of "safe", but I don't think it's wrong to try to add another rung to the verification ladder at a different height. Verification technologies have a lot of space to improve in the UX department; here Rust trades off some verification guarantees for a practical system that is still meaningful.


Real formal verification in Rust is basically waiting for a proper formal model of Rust semantics (including semantics of "unsafe"). This is one of the most unappreciated things about Rust's future potential, AIUI; the only language in common use that supports both systems-level programming and formal proof is ATS.


Eh. You can write, and people have written, formal proofs about C code too. In fact, the difference between Rust and C in terms of safety isn't very large if you have an infinite budget to put toward proving all the code that you write correct.

The safety benefits of Rust appear when you aren't willing to formally prove all the code you write correct. That's because you only have to prove the unsafe code, plus the memory model, correct in order to guarantee memory safety for the entire program. This is less burdensome than in C, where to do the same you have to prove correctness properties about the specific code that makes up the whole program. Rust makes it easier to prove certain properties about the program (far easier in the case of memory safety), but it was always possible.


I’m doing my PhD on the formal verification of Rust, and while you’re right that safe code provides a lot of informal advantages, it also dramatically simplifies formal reasoning.

In particular, the dirty secret of C verifiers is that they don’t handle pointers all that well. Either you find yourself doing a lot of manual proof work or you have to dramatically simplify the memory model.

In contrast, when verifying safe Rust, the rules of the borrow checker allow us to dramatically simplify the verification work. All of a sudden verifying a manual memory program with pointers (borrows) becomes as simple as verifying a basic imperative language. I’ve been working on a tool: https://github.com/xldenis/creusot to put this into practice

On the other hand, the moment you dive into unsafe, all bets are off and you find yourself wading through the marshes of (weak) memory models with your favorite CSL as your only friend.


> I’ve been working on a tool: https://github.com/xldenis/creusot to put this into practice

Note that there are other tools trying to deal with formal statements about Rust programs. AIUI, Rust developers are working on forming a proper team or working group for pursuing these issues. We might get a RFC-standardized way of expressing formal/logical conditions about Rust code, which would be a meaningful first step towards supporting proof-carrying code directly within Rust.


> Eh. You can write, and people have written, formal proofs about C code too.

Not without a formal model of C, and the C standard is only an informal, natural-language text. Rust having memory safety as its express goal (which is basically table stakes for any sort of workable language semantics) means that it's at least realistic to think about a formal semantics for Safe Rust. Then you "just" need to deal with the uncomfortable reality that lots of Safe Rust facilities actually bottom out into Unsafe Rust, which is why it turns out you must care about its semantics too. But the hope is ultimately that the small portion of real-world Rust codebases that's Unsafe Rust might not "infect" the Safe Rust to the point of making verification as practically unworkable as in C.


There are formalizations of C, e.g. CompCert, which also compiles to machine code in a way that should guarantee proofs all the way through execution (assuming the machine modeling is correct).

It seems like being able to do that for Rust is still a long way off.


Nope, Ada/SPARK and Frama-C, both currently with far more experience in high integrity systems deployed into production than Rust.


That isn't really true. "unsafe" specifically means "memory-unsafe". The checking is left to the humans, and so is the justification, but the meaning is not.

See https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html

Check out also the "Rust considers it safe to" section of https://doc.rust-lang.org/nomicon/what-unsafe-does.html


I feel like there are plenty of instances where unsafe is used in Rust to indicate library-level UB without actually being memory-unsafe. (Like, where using a method in a certain way may violate a runtime assumption of a library, causing application-level UB.)

For example, String::from_utf8_unchecked[1]. There's a comment in the documentation trying to justify why invalid strings could cause memory unsafety, but it's pretty weak:

> This function is unsafe because it does not check that the bytes passed to it are valid UTF-8. If this constraint is violated, it may cause memory unsafety issues with future users of the String, as the rest of the standard library assumes that Strings are valid UTF-8.

Like, sure, we could make that argument for any runtime invariant. "Look, this is totally memory safe but if you call this method we assume this constraint is valid. We're marking this unsafe anyway because we want people to take care in using it. So uh, it's memory-unsafe because violating this constraint could cause memory unsafety issues in the future or something? Yeah that'll do."

It seems like lots of folks quietly want unsafe to mean "this method skips runtime checks", with no specific grounding in memory-unsafety. And the line between those two ideas is super blurry in practice, even in the standard library.

[1] https://doc.rust-lang.org/std/string/struct.String.html#meth...


It is perfectly reasonable that `from_utf8_unchecked` is unsafe because the sequence of operations `from_utf8_unchecked(array).chars().next()` will call [1] and can trigger memory unsafety with the `unwrap_unchecked` call if the array is not valid utf-8. This means that at least one of the three functions must be marked unsafe.

[1] https://github.com/rust-lang/rust/blob/027a232755fa9728e9699...
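Concretely (a sketch of the failure mode, not something you should run): handing invalid bytes to `from_utf8_unchecked` produces a `String` whose safe methods are allowed to assume valid UTF-8:

    // These bytes are not valid UTF-8.
    let bytes = vec![0xC0, 0x20];
    // The SAFETY contract is violated here: the bytes are *not* valid UTF-8.
    let s = unsafe { String::from_utf8_unchecked(bytes) };
    // chars() decodes assuming valid UTF-8 (internally relying on unchecked
    // operations), so this safe-looking call can be undefined behavior.
    let _first = s.chars().next();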


Ensuring that values actually are a specific type isn't directly memory safety, but if you don't do it then you can't even have an exhaustive match without the danger of your program going wild.

I see enforcing utf8 as about on par with enforcing that a boolean be 0 or 1, and not 5.

How do you feel about this list of what's unsafe and undefined? https://doc.rust-lang.org/nomicon/what-unsafe-does.html


> I see enforcing utf8 as about on par with enforcing that a boolean be 0 or 1, and not 5.

UTF8 correctness is an invariant of the code. It's not something the compiler understands or cares about.

But really, I get it! My point is that rust is rife with people marking functions "unsafe" when they mean a method will cause bugs / "runtime UB" if used improperly. Even the standard library does this, so it feels a bit rich to cry foul when 3rd party libraries (ab)use "unsafe" in the same way.

I certainly use unsafe like that, in order to mark methods which, if misused, will violate internal invariants of my code.

How does that list of unsafe features justify the unsafe marker on String::from_utf8_unchecked?


Well how do you feel about the comparison to booleans?

If I had a way to set some booleans "unchecked", in a way that could make the numerical value be 5, would you label it unsafe?

I really think it should be unsafe, because it's crazy to make ifs or switches on booleans be non-exhaustive, and it's also crazy to have an "invalid boolean" path anywhere one is used.

When you have an encapsulated value, with special accessors to make sure it's always in range, then you don't have to have an unchecked setter function at all. But if you do add one, I think it makes sense to make it unsafe. That function is basically allowing a blind memcpy over a piece of data. It ties into safety pretty strongly. That kind of access is a lot like dereferencing a raw pointer, and not making it unsafe means you allow a lot of those "invalid value" cases in the link to be possible in safe code.


I get that this was just an example, but a boolean 5 would actually be language-level UB that could have very real correctness implications!

The Rust compiler is allowed to (and sometimes required to) use the invalid values of a child value to represent other cases of an enum. For example, this is how it optimizes `Option<&T>` into "pointer where null is None and any other value is Some".

This would technically also allow it (I don't know whether it currently does this..) to optimize the following enum:

    enum Foo {
        A(bool),
        B(bool),
        C(bool),
    }
into the following single-byte representation:

    A(false) => 0,
    A(true) => 1,
    B(false) => 2,
    B(true) => 3,
    C(false) => 4,
    C(true) => 5,
With this representation, `A(5 as bool)` would be impossible to distinguish from `C(true)`, which would be complete and utter nonsense.


> I get that this was just an example, but a boolean 5 would actually be language-level UB that could have very real correctness implications!

Yes, that's why I used it as the example.

And you don't need anything that complicated, either. A simple if or switch statement might be optimized by the compiler for 0 and 1 and jump into random code for 5.


I agree. I think "unchecked" booleans which could contain 5 should be unsafe too.

I'd be happy to extend the definition of "unsafe" to mean "If you misuse this, you may violate some internal invariants of the library. And that may cause unexpected bugs."

That's a broader, but in my opinion much clearer, definition of unsafe, which covers how the keyword is actually used in real code. (In std::String and elsewhere.)

I've written high performance b-tree and skiplist implementations in Rust. There are plenty of functions in both libraries which I've marked as unsafe because if you use them carelessly, you'll violate some internal invariants. Will the library break as a result? Yes. Will the resulting bugs include memory corruption? I really have no idea, and I don't really care enough to go in and test that. So I've marked the methods unsafe, and provided safe APIs for consumers to use instead.

Do you think this is an appropriate use for unsafe? If not, I'm curious to hear why.


I think it depends on the kind of invariant. If it leads to undefined behavior that wouldn't already be possible, then mark it unsafe or change the unsafe code to eliminate it. If it just makes it act badly, like dropping a node, that's likely not a good place for unsafe.

Or more simply: Worry about all kinds of undefined behavior. That makes many issues easier to find, and you don't need to chain on additional logic to figure out how it might corrupt memory.


I would argue it’s language-level UB to convert an invalid slice into a String; it’s bubbling up a requirement of str, which relies on the contained UTF-8 being well-formed for memory safety.


The key thing here is that if you know no safe code contains invalid UTF-8, it is safe to do unsafe indexing into tables that are only sized to deal with valid UTF-8. As such, this could in practice lead to UB in otherwise correct code, so it should be marked unsafe.


> Like, sure, we could make that argument for any runtime invariant.

One difference is that Rust has a separate type for String-like objects that do not respect the invariant, namely Vec<u8>. So those who wish to write memory-safe code acting on arbitrary bytes can simply use that. Also, it's quite normal for an invariant defined entirely in Safe Rust to impact memory safety in a very real sense; consider Send and Sync. These are seemingly arbitrary labels, but their semantics is nonetheless quite well defined.


> One use of "unsafe" that was not mentioned by Nora but is important for the embedded community in particular is the use of "unsafe" to flag things which from the point of view of Rust itself are fine, but are dangerous enough to be worth having a human programmer directed away from them unless they know what they're doing.

Rust gave up this notion when they decided that mem::forget could, in fact, be safe, since even though it was initially marked unsafe it could be trivially implemented in safe code using only safe constructs from the stdlib.

We need another keyword or concept to refer to unsafeness that isn't related to UB. Some people suggested tagging unsafe with something like

    // declare SummonsDemons as some kind of safety marker that goes beyond preventing UB

    unsafe(SummonsDemons) fn f() {
        // summons demons
    }

    fn safecode() {
        // SAFETY: the demons are cool this time
        unsafe(SummonsDemons) { f() }
    }


> One use of "unsafe" that was not mentioned by Nora but is important for the embedded community in particular is the use of "unsafe" to flag things which from the point of view of Rust itself are fine, but are dangerous enough to be worth having a human programmer directed away from them

I thought there were subtle language differences such that if you took ordinary code that is perfectly valid as safe Rust and marked it as unsafe, the result could be incorrect? Am I mistaken -- maybe that was some pre-1.0 property of the language and it's actually okay to go peppering unsafe around for typechecking-like usage?

Separately, if people are creating unsafe interfaces to protect against uses that fail to uphold their required invariants, it would probably be really useful if they could be tagged with specific required unsafety-capabilities, so that if a function requires both foo-stability and bar-stability you don't accidentally call it from code that only guarantees foo-stability, under the mistaken impression that foo-stability was all the function's unsafety was about.


Would you be able to provide a real example of that HellPortal thing? I'm not really following


I'm assuming "real example" means of such unsafe-means-actually-unsafe behaviour in embedded Rust, as opposed to a real example of summoning demons?

For example volatile_register is a crate for representing some sort of MMIO hardware registers. It will do the actual MMIO for you, just tell it where your registers are in "memory" and say whether they're read-write, read-only, or write-only just once, and it provides the nice Rust interface to the registers.

https://docs.rs/volatile-register/0.2.1/volatile_register/st...

The low-level stuff it's doing is inherently unsafe, but it is wrapping that. So when you call register.read() that's safe, and it will... read the register. However even though it's a wrapper it chooses to label the register.write() call as unsafe, reminding you that this is a hardware register and that's on you.

In many cases you'd add a further wrapper, e.g. maybe there's a register for controlling clock frequency of another part, you know the part malfunctions below 5kHz and is not warrantied above 60kHz, so, your wrapper can take a value, check it's between 5 and 60 inclusive and then do the arithmetic and set the frequency register using that unsafe register.write() function. You would probably decide that your wrapper is now actually safe.
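A rough sketch of that kind of wrapper, assuming volatile_register's `RW<u32>` type (the register layout, names and the 5-60 kHz limits are invented for the example):

    use volatile_register::RW;

    #[repr(C)]
    pub struct ClockRegs {
        pub freq_khz: RW<u32>, // hypothetical clock-frequency register
    }

    pub struct ClockDriver {
        regs: &'static mut ClockRegs,
    }

    impl ClockDriver {
        /// Safe wrapper: only accepts frequencies the part is known to tolerate.
        pub fn set_frequency_khz(&mut self, khz: u32) -> Result<(), ()> {
            if !(5..=60).contains(&khz) {
                return Err(());
            }
            // SAFETY: the value has been range-checked against the datasheet limits.
            unsafe { self.regs.freq_khz.write(khz) };
            Ok(())
        }
    }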


So, I peeked into the documentation of volatile_register. The unsafe is there for a clear reason: the compiler can't verify that you aren't creating a mapping to some memory location that is used by the program itself. If you are allowed to safely create a mapping to, for example, a local variable and safely modify it using `register.write()`, you have UB, and that's all in safe code!

So, this isn't a case that the `unsafe` is there just as a warning lint, it has an actual meaning, protecting the memory safety invariants.


> I'm assuming "real example" means of such unsafe-means-actually-unsafe behaviour in embedded Rust, as opposed to a real example of summoning demons?

That was what I meant, thanks for the answer! Though if you have an example of the other thing I'd be open to that too


Similar to the sibling, stuff where you're dealing with parallel state in hardware, like talking to a device over i2c or something, where you know certain things are supposed to happen but you don't, like, know know.


I'm not convinced that it's a great use of Rust's `unsafe`, but since you want an example... dealing with voltage regulators maybe? Where an invalid value put into some register could fry your hardware? There's a ton of such cases in embedded.


You could probably make the case that any function that might physically destroy your memory is memory unsafe :)


Imagine the code screaming

"Rip and tear, until it's done!"


"From Rust's point of view, "HellPortal::unchecked_open()" is safe, it's thread-safe, it's memory safe... but it will summon demons that could destroy mankind if the warding field isn't up, so, that might need an "unsafe" marker and we can write a checked version which verifies that the warding is up and the stand-by wizard is available to close the portal before we actually open it, the checked one will be safe."

"The only thing they fear is you"

playing in the background.


It is a very cool abstraction that makes lots of things possible, which are (to my understanding) impossible to do in other languages without the explicit safety contract.

Like this: https://www.infoq.com/news/2021/11/rudra-rust-safety/, and I quote: "In C/C++, getting a confirmation from the maintainers whether certain behavior is a bug or an intended behavior is necessary in bug reporting, because there are no clear distinctions between an API misuse and a bug in the API itself. In contrast, Rust's safety rule provides an objective standard to determine whose fault a bug is."

People bring up `unsafe` Rust as an argument against the language, but to me it appears to be an argument `for` it.


> People bring up `unsafe` Rust as an argument against the language

Those people usually don't understand what `unsafe` is.


I especially like the following quote in that context:

> The biggest failure in Rust's communication strategy has been the inability to explain to non-experts that unsafe abstractions are the point, not a sign of failure.

Appeared on TWiR: https://this-week-in-rust.org/blog/2021/10/20/this-week-in-r....


My only complaint is I wish they didn't name it `unsafe` and instead named it something like `compiler_unverifiable` so that people could more properly understand that we can make safe abstractions around what the compiler can't verify.


I like the word `unsafe`. It's a nice red flag to newbies that "you almost certainly shouldn't use this" and once you have enough experience to use `unsafe` well you'll know its subtleties well enough.


I just saw a tweet from Esteban Küber[0] the other day that made me rethink a silly thing I did:

"I started learning Rust in 2015. I've been using it since 2016. I've been a compiler team member since 2017. I've been paid to write Rust code since 2018. I have needed to use `unsafe` <5 times. That's what why Rust's safety guarantees matter despite the existence of `unsafe`."

For me, I was being a bit lazy with loading some config on program startup that would never change, so I used a `static mut` which requires `unsafe` to access. Turns out I was able to figure out a way to pass my data around with an `Arc<T>`. I think either way would have worked, but I figured I should avoid the unsafe approach anyway.

[0] https://twitter.com/ekuber/status/1511762429226590215


Yeah if you don't need mutation after initialization is done, Arc<T> is a good option for sharing. Lazy<T> (https://docs.rs/once_cell/latest/once_cell/sync/struct.Lazy....) is also nice, especially if you want to keep treating it like a global. I believe something like Lazy<T> is eventually going to be included in the standard library.
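A small sketch of the `Lazy<T>` approach for a load-once config (the `Config` type and loader are placeholders):

    use once_cell::sync::Lazy;

    struct Config { verbose: bool }

    fn load_config() -> Config {
        // read a file / environment variables here
        Config { verbose: true }
    }

    // Initialized on first access, readable from anywhere, no `unsafe` or
    // `static mut` required.
    static CONFIG: Lazy<Config> = Lazy::new(|| load_config());

    fn main() {
        if CONFIG.verbose {
            println!("verbose logging enabled");
        }
    }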



Don't ever use `static mut`. It's dangerous[1]. In nightly you now have `SyncUnsafeCell`[2] to avoid it.

[1] https://github.com/rust-lang/rust/issues/53639

[2] https://github.com/rust-lang/rust/pull/95438


I’ll be even lazier (heh) and just straight up leak memory for true once-set read-only config values.

https://github.com/shepmaster/stack-overflow-relay/blob/273a...
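That boils down to something like this (a sketch using `Box::leak`; the config type is a placeholder):

    struct Config { verbose: bool }

    fn main() {
        // Leak the one-time allocation to get a `&'static Config` that can be
        // handed out freely for the rest of the program's life.
        let config: &'static Config = Box::leak(Box::new(Config { verbose: true }));
        assert!(config.verbose);
    }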


I don't think newbies using it would be a problem. There is an issue with people/companies evaluating Rust and stopping when they see `unsafe` without looking much further into it.


I believe the term "unsafe" in this sense predates Rust. Haskell's GHC has unsafePerformIO [1], which is very similar conceptually. Generally used to implement an algorithm in a way where it can be pure/safe in practice, but where this cannot be proved by the type system.

[1] https://stackoverflow.com/questions/10529284/is-there-ever-a...


If I were to go back and do it again I'd propose to rename unsafe blocks to something like `promise`, to indicate that the programmer is promising to uphold some invariants (the keyword on function declarations would still be `unsafe`). (Of course the idea of a "promise" means something different in JavaScript land, but I'm not worried about confusion there.)


While we're at it, I also would have preferred `&uniq` over `&mut`.


And from the great Steve Klabnik himself: StrBuf instead of "String"


I still like `trust_me`, maybe a bit unprofessional in some contexts though.


Jon Goodwin's Cone [0] language had a nice idea of calling it "trust", as in, "Compiler, trust me, I know what I'm doing"

[0] https://cone.jondgoodwin.com/coneref/reftrust.html


    hold_my_beer {
    }


yeet {

}


Or just `unchecked`. It is similarly short but holds a similar meaning. Of course `unchecked_by_the_compiler` would be more accurate.


That's inaccurate, because Rust still performs all the usual checks inside of unsafe blocks. The idea that unsafe blocks are "turning off" some aspect of the language is a persistent misconception, so it's best to pick a keyword that doesn't reinforce that.


That’s right, the keyword enables additional operations, namely those that can’t be checked — which one might even call “unchecked operations” to justify the keyword.

My problem with the word “unsafe” is that many people seem to think unsafe code blocks necessarily have security issues. Hence the bullying of library authors even in cases where the authors did their homework.

Anything about the mechanics of the implementation is a second-order detail compared to that, as far as I’m concerned.


Shrug. I really don't think unchecked is "wrong" in spirit, since adding an unsafe block makes it trivial to write code which ignores the usual boundary and pointer lifetime checks, which is not very different in practice from turning off checks. Also unsafe code is usually wrong, and deeply difficult to write correctly, much more so than C.

Also, I dislike the word "unsafe" since "unsafe code" is easily (mis)interpreted to mean "invalid/UB code", but "invalid/UB code" is officially called "unsound" rather than "unsafe". Unsafe blocks are used to call unsafe functions etc. (they pass information into unchecked operations via arguments), and unsafe trait impls are used by generic/etc. unsafe code like threading primitives (unsafe trait impls pass information into unchecked operations via return values or side effects).

unchecked would make a good keyword name for blocks calling unchecked operations, but I'm not so sure about using it for functions or traits.


Technically, code that contains UB is just UB, and unsound code is code that allows you to trigger UB using safe code only.

For example, this is UB:

  unsafe { std::hint::unreachable_unchecked() };
And this is unsound code, which may not trigger any UB if I call `unsound()` with `false`:

  fn unsound(trigger: bool) {
      if trigger { unsafe { std::hint::unreachable_unchecked() }; }
  }


I like the sound of “trusted”.


I think the benefit of the name "unsafe" is that it immediately tells newer users that the code inside has deeper magic than they may be comfortable using. Where "trusted" is what the writer attests to the compiler, "unsafe" is what the writer warns a future reader about.


The unsafe concept has existed since the early 60s in systems programming languages.

The real issue here is Rust fans trying to make Rust into a type dependent language while abusing unsafe's original purpose.


As mentioned in the article, it's entirely possible to create a raw pointer to an object in safe rust, you just can't do much with it. One thing you can do with it though is convert it to an integer and reveal a randomized base address, which isn't possible in some other languages without their "unsafe" features. Of course, this follows naturally from Rust's definition of what is safe, but I remember being kind of surprised by it when I first learned Rust and didn't understand those definitions yet.
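For example, all of this is safe Rust; only dereferencing the raw pointer would require `unsafe`:

    fn main() {
        let x = 42u32;
        // Creating the raw pointer and casting it to an integer is safe...
        let addr = &x as *const u32 as usize;
        // ...and printing it leaks a (randomized) stack address.
        println!("{:#x}", addr);
    }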

It would be pretty interesting to me if someone wrote a survey of what different languages consider to be "unsafe", including specific operations like this. For example, it looks like "sizeof" is relegated to the "unsafe" package in Go, which strikes me as strange.[1] I'd love to read a big comparison.

[1]: https://pkg.go.dev/unsafe#Sizeof


I am by no means a Go expert, but I think the `unsafe` package is the only standard library package in Go that does not conform to the stability & portability guarantees and `Sizeof` is neither portable nor stable.


At the top of the page it says "Package unsafe contains operations that step around the type safety of Go programs." though


You can also compare & hash it. I once used it to store visited objects in a hashset without the borrow checker making me sad.


This article fails to mention that the true semantics of Rust's `unsafe` subset has been very much up-in-the-air for a long time. Nowadays the `miri` interpreter is supposed to be giving us a workable model for what `unsafe` code is or is not going to trigger UB but many things are still uncertain, sometimes intentionally so as some properties may depend on idiosyncratic workings of the underlying OS, platform and/or hardware. These factors all make it harder to turn what's currently `unsafe` into future features of safe Rust, perhaps guarded by some sort of formalized proof-carrying code.


I could be wrong, but my impression from posts like this one https://www.ralfj.de/blog/2020/12/14/provenance.html is that even C doesn't have a fully coherent model for what happens when we start doing weird things with pointers. Apparently there are LLVM bugs open that ultimately need to be addressed by the standards committee?


Progress towards this is always being made, though it's a long road with much still to determine. I encourage anyone interested in this to join the Zulip channel for Rust's unsafe code guidelines working group: https://rust-lang.zulipchat.com/#narrow/stream/t-lang.2Fwg-u...

Some recent developments: https://gankra.github.io/blah/tower-of-weakenings/


ESPOL, NEWP, Mesa, Cedar, Modula-2, Modula-2+, Modula-3, Oberon, Oberon-2, Component Pascal, Active Oberon, D, C#, F#, VB, Ada, Go, Swift, just a few examples.


Examples of what though?

Is it your expert opinion that Wirth foresaw the problems we have writing this bit-banging code on modern hardware and solved them correctly despite never formalising what they actually are?

Lists of languages that don't even try to do what Rust is doing here suggest you haven't thought very hard about even what the problem is. Of course if you aren't sure what the problem is it may be trivial to convince yourself that you've "solved" it.


Examples of having unsafe code blocks to express memory corruption hazards.

Rust is following Ada's footsteps for a newer generation, while still not having something comparable to SPARK.

So what is it that Rust is trying here?


Go has Undefined Behaviour for data races under concurrency. So, while the unsafe package is one obvious way to do dangerous stuff and experience "memory corruption hazards" the rest of the language isn't actually safe.

C# is subject to the whims of the host CLR for both type safety and memory model, and as a result is inherently unsafe on CLRs that actually exist (primarily, Microsoft's) because dangerous was much faster.

They're both dramatically better than C++ and so give us an opportunity to assess Herb's claim that if C++ was 90% safer nobody would bother him about how ludicrously dangerous it is. It does seem like people are comfortable in C# and Go as a result. But, once in a while something very strange happens and it gives these developers no comfort to learn that the Undefined Behaviour was foreseen and they get a WONTFIX back from the language's implementers (e.g. C# can't conceive of how two booleans can both be true yet not equal, however the CLR does not guarantee what C# wants here so C# loses).

So that's what Rust is for. Mostly though, Rust is just very nice to use. I don't actually need Aria's Tower of Weakenings, most of the time I don't even need unsafe, but I'd still rather be in Rust.


Except those are all examples Rust's unsafe isn't designed for, other than by those that misuse its original purpose; even The Rustonomicon advises against it.

As for data races, great that it fixes data races among threads, yet it does nothing against shared resources in distributed computing, which are what really matter in the age of microservices and protection against Spectre-like attacks.


> even The Rustonomicon advises against it.

What is "it" here which you believe has been advised against? Please try to be specific.

> As for data races, great that it fixes data races among threads

Yes, it is pretty good. It was pretty great the last few times you "forgot" about this too.


Taking from the Rust book,

"In addition, unsafe does not mean the code inside the block is necessarily dangerous or that it will definitely have memory safety problems: the intent is that as the programmer, you’ll ensure the code inside an unsafe block will access memory in a valid way."

Although it appears I stand corrected in regards to the Rustonomicon, as it seems to adopt the position that unsafe should apply to more than memory corruption bugs, and I failed to find the sentence I was looking for.

> Yes, it is pretty good. It was pretty great the last few times you "forgot" about this too.

I never forget about it; from my point of view they are oversold, and I will keep arguing they aren't enough for distributed computing deployment scenarios with heavy use of access to process-external shared resources.

Fearless concurrency isn't something I care about when accessing database rows from multiple threads across database connections, writing to NUMA regions across processes, shared storage, and so forth.

Races are still bound to happen in such scenarios, if care is not taken, and fearless concurrency is of little help in such cases.


> Safety, in Rust, is very well-defined; we think about it a lot. In essence, safe Rust programs cannot:

> - Dereference a pointer that does not point to the type the compiler thinks it points to. [...]

> - Cause there to be either multiple mutable references or both mutable and immutable references to the same data at the same time. [...]

Sometimes, I wish there was a systems language that let you do exactly that. Dereference any memory location and see what memory is there. Use a cast to reinterpret the bits. If the memory address is unmapped, then maybe just throw an exception. Allow multiple threads to fill in different spots in a huge array at the same time. Basically write C as a high-level assembler, like you did in the 90s.

I know -fno-strict-aliasing and some other flags can get you mostly there, but you are still violating the spec and it is at best tolerated. I also know that you lose some optimization potential, but it also reduces UB and that is a trade-off you should be able to choose.


You're effectively making every variable volatile if you do that, and you lose nearly all optimization potential.


Sometimes unsafe is used differently, for example when setting up a signal handler or a post-fork handler. Process::pre_exec is unsafe to indicate that some safe code may crash or produce UB within this function. Only async-signal safe functions should be used and Rust does not model that in its type system.
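A minimal sketch of that (using `CommandExt::pre_exec` from std on Unix; the closure body is a placeholder):

    use std::io;
    use std::os::unix::process::CommandExt;
    use std::process::Command;

    fn main() -> io::Result<()> {
        let mut cmd = Command::new("true");
        // SAFETY: the closure runs after fork() and before exec(), so it must
        // only use async-signal-safe operations; Rust cannot check that for us.
        unsafe {
            cmd.pre_exec(|| {
                // e.g. reset signal handlers, change the process group, ...
                Ok(())
            });
        }
        cmd.status()?;
        Ok(())
    }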


Programming is inherently unsafe. Rust only solves a fraction of memory safety, and there are more unsafe things beyond memory concerns. It’s a complete misnomer.

“Safety, in Rust, is very well-defined; we think about it a lot.”

You can’t just construct some elaborate static analysis complex and call your language “safe” as a result, that’s a surefire way to make your language impenetrable to anyone outside of your field of expertise.


Interestingly enough, unsafe is the root reason Rust couldn't add Vale's Seamless Concurrency feature [0] which is basically a way to add a "parallel" loop that can access any existing data, without refactoring it or any existing code.

If Rust didn't have unsafe, it could have that feature. Instead, the Rust compiler assumes a lot of data is !Sync, such as anything that might indirectly contain a trait object (unless explicitly given + Sync) which might contain a Cell or RefCell, both of which are only possible because of unsafe.

Without those, without unsafe, shared borrow references would be true immutable borrow references, and Sync would go away, and we could have Seamless Concurrency in Rust.
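The `Cell` point fits in a few lines: a `&Cell<i32>` is a shared borrow, yet it permits mutation, which is exactly why shared borrows can't be treated as truly immutable today:

    use std::cell::Cell;

    // Mutation through a *shared* reference, made possible by Cell
    // (which is built on UnsafeCell, i.e. on `unsafe`).
    fn bump(counter: &Cell<i32>) {
        counter.set(counter.get() + 1);
    }

    fn main() {
        let c = Cell::new(0);
        bump(&c);
        bump(&c);
        assert_eq!(c.get(), 2);
    }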

I often wonder what else would emerge in an unsafe-less Rust!

Still, given Rust's priorities (low level development), the borrow checker's occasional need for workarounds, and the sheer usefulness of shared mutability, it was a wise decision for Rust to include unsafe.

[0] https://verdagon.dev/blog/seamless-fearless-structured-concu...


> Without those, without unsafe, shared borrow references would be true immutable borrow

Rust devs have thought about implementing "true immutable" before and found it to be problematic. It would come in quite handy for mostly anything related to FP or referential transparency/purity, but these things turn out to be very hard to reconcile with the "systems" orientation of Rust. Perhaps the answer will reside in some expanded notion of "equality" of values and objects, which might allow for trivial variation while verifying that the code you write respects the same notion of "equality".


I'm not sure I follow this. If Rust wanted the feature in that blog post, they could restrict it to only accessing data that is Sync. They wouldn't have to throw out the concept of Sync entirely, in fact cases like this are the reason it exists. Rust just chooses to leave this kind of feature up to libraries instead of building it into the language.

And even without unsafe, you still couldn't assume all data is Sync. Counter-examples include references to data in thread-local storage, and most data used with ffi.


Rust without unsafe would just be another obscure academic language like Haskell. The ability to bypass the type system when the programmer needs to is what makes Rust work.



Haskell also allows you to bypass the type system plenty


> In a certain sense, this means that everything written in Python is memory unsafe.

CPython is also less aggressive about adopting mitigation strategies compared to Rust, although it has improved a lot in the last few years.




