I would argue that the phrase "One of Rust's main differentiator[s] is that it provides memory safety" should really be "One of Rust's main differentiators is that it provides memory safety without using a garbage collector". Which is the actual reason for needing (amongst others) ownership rules...
It also protects against data races, which is fairly novel. A lot of functional languages offer this by avoiding mutable data, but that comes at a high cost and isn't really equivalent to what Rust does.
I find rust quite elegant but also quite complex. There's a scale for computer languages between a compiler that doesn't do enough checking (JavaScript, Python) and ones that do so much that you spend a lot of energy trying to satisfy the compiler. I think rust falls a bit too far to the latter side for my taste. I like the language, but I feel more productive with Go, even though Go isn't as elegant in my mind and has some flaws (nil, no generics).
I think rust is nicer for where you would previously use C or C++ but I don't work in that domain.
I agree with everything you posted (and I also really like Rust!), but I especially took note of this:
> A lot of functional languages offer this by avoiding mutable data, but that's high cost
This is true, but the inverse is that Rust's ownership model itself poses its own high cost in terms of a significant cognitive burden (and while I have no doubt that the burden decreases with time, I'm very skeptical that it ever gets to the point of a GC language). This is roughly what you're saying in the second paragraph, but I wanted to be a bit more explicit that the ownership model itself comes with a cost.
> significant cognitive burden (...) I'm very skeptical that it ever gets to the point of a GC language
My experience in this regard: Once you stop thinking in OOP, embrace Data Oriented Design (which really helps structure your program the way Rust wants it to be, avoiding lifetime issues), and get proficient with the type system and popular libraries, Rust is far more productive and effortless than mainstream GC languages. I routinely make non-trivial changes to my open source projects late in the evening, after a long day of work, after I've finally put my kids to sleep, and despite sloppiness and lethargy it's very much effortless.
I don't have to debug weird stuff. The compiler just straight up tells me what I did wrong and how to fix it. The language's expressiveness is first class, so I don't have to think too much about the best way to express this or that idea. Then - not only do I not have to worry about `free`, but I also get deterministic destruction: files, sockets and any other "resources" get closed/freed when I'm done with them. And then the libraries... they have beautiful, well-documented, and hard-to-misuse APIs.
I'm already on-board with data-oriented design, and my pedigree includes lots of such projects and languages (C, C++, Go, etc). I'm also familiar with Rust; I've been dabbling in it since ~2014. Still, I come to opposite conclusions.
Rust is a magnificent language for many things, but I'm manifold more productive in Go or Python, even in concurrent contexts. Notably, the ownership rules guarantee correctness, but they also prohibit all sorts of correct programs (e.g., functions that only run in a single threaded context or whose shared data doesn't actually mutate). I would say I have the hang of these rules, but I still struggle to make sense of the lifetime error messages (and I find this nearly impossible when I'm tired), contrary to your experience about the clarity of the compiler messaging. I don't mean to slight the compiler--communicating about lifetimes is fundamentally hard; however, the point is that these error messages must be dealt with in all code even though the overwhelming majority of my code isn't subject to race conditions in the first place.
I really want to like Rust (I love it in theory); however, frankly I don't see how Rust can approach a GC language with respect to productivity for anyone, unless the domain prioritizes correctness and/or performance above all else.
> I don't see how Rust can approach a GC language with respect to productivity for anyone
My domain is games, and while this usually prioritizes performance, that's not really what I care about (so I could be using Python for instance and it would be 'ok'). My take is this: When it comes to productivity, even if writing the same code is easier in Python than Rust, working over time in a Python project (or any dynamic language really) is just a nightmare for me. No matter how many tests I write or how I structure my code, when I have to add a new feature that touches large parts of the existing system I just have 0 confidence it is correct. I have to run my code through all the branches to make sure I didn't forget to, idk, add the new parameter to all function calls or something.
Rust on the other hand is just reliable. I can start updating my code with a half-baked idea, start adding types to things, removing parameters from functions, changing variable names, and I KNOW nothing will sneak up on me at runtime. This is more a feature of strong type systems than of Rust itself, but still, this is what productivity means to me.
(Also, to me lifetimes were the hardest concept to grasp, but I think once you really get what they mean that's when the language clicks and you get the same productivity you would elsewhere)
Fully agree that (all else equal) static type systems are more productive than dynamic type systems; however, that's mostly orthogonal to GC vs Rust's ownership model. I agree that Rust's ownership model provides for more correctness than GC, but at the expense of a whole lot of productivity (which was my original point). I also think the returns on static type systems (and static analysis in general) diminish quickly after (roughly) Go's position on the axis. To be clear, I would like to see Go have a more expressive type system (generics and sum types), but I think Go recoups upwards of ~90% of the quality difference between untyped Python and Rust (for example) while still allowing you to be at least as productive as with Python (which is still about 90% more productive than with Rust IMO). Disclaimers: this is all super subjective, error margins are huge, and YMMV.
I would take Go over Python, any day. It is productive, especially if your program is a relatively typical network service: get a request, poke here and there, transform this into that, return a result. It loses steam when the code is abstract logic or heavy on generics/macros. Also, I personally lose quite a bit of time debugging those couple of pesky issues that always get introduced by `nil`, the lack of sum types, some Go gotchas, etc.
In practice the most important things for general productivity are always: what is your team familiar with, what are you familiar with, and are the types of libraries that you need available. Quite often these are skewed in Go's favor.
Go, Python, and C++ are all notably languages that encourage a traditional imperative model of programming with things like for-loops and lots of mutation. I've noticed that developers with a background in these kind of languages struggle with Rust, which works much better with a functional approach.
Coming from JavaScript, my Rust code is often almost identical to my JS code. With perhaps a few extra `.clone()`s, a bit of extra error handling code (which is of course the benefit!), and type annotations.
I’m perfectly comfortable with a “functional” style as it pertains to Rust. In particular, Python (and to a lesser and more recent extent, C++) has had a lot of functional features (iterators, first-class functions, etc.) for quite a while. The lack of GC actually makes it harder to use closures in Rust than in Python or even Go.
It really is downright shocking sometimes how much gets caught by the rust compiler. It's the first time in my life when I feel like code is likely to actually work once the compiler comes back clean.
I think how low it gets really depends on the person. For me, it's basically zero burden. I don't have to think about these things, the compiler thinks about them! That's kind of the point. I've also internalized the rules, so that I tend to program in a style that works. I am of course, extremely biased, but this story seems to play out with many people who stick with it.
> This is true, but the inverse is that Rust's ownership model itself poses its own high cost in terms of a significant cognitive burden
In "burdensome" situations, you can always defer ownership checking to runtime with the Rc<RefCell<T>> and Arc<Mutex<T>> patterns. It's less performant than using pure compile-time checks, but not nearly as problematic as general GC, or immutable-only data structures.
> I'm very skeptical that it ever gets to the point of a GC language
For me, it does impose a (small) cost over a GC language. But other aspects of the language (enums, traits, type inference, etc) are so much better than most mainstream languages that it more than makes up for this. I write JavaScript quicker than I write Rust, but I write Rust quicker than I write Java.
Yes, I agree. Although one must be careful to distinguish a high performance cost, like that of immutable data structures, from a high cognitive cost, like that of the borrow rules in rust.
I think I prefer a GC most of the time, and when that's not true I'm fine if the language offers me an escape hatch (unsafe in Go, paired with a native allocator like jemalloc.)
It really does depend on the problem domain. Go’s type system is probably easier to use for a web app backend, while Rust makes lower-level code easier and stuff involving linear algebra or anything else where a sophisticated type system is useful far less wordy and more safe.
I would never write a ray tracer in Go, although it could be done.
It is interesting that rust goes for a demographic of people coming from scripting languages or functional languages, but it seems like its niche might be much more suited to the lowest level of larger programs or embedded work where people are using straight C.
Straight C seems to be used for transparency and compiler compatibility though, which could be at odds with rust's current workflow.
I think that pragmatically, using rust, modern C++ or even C, any concurrency needs to be handled at a generic, separate architectural level that is not part of the day-to-day refinements and features of a program. I don't think constantly thinking about data races, low-level synchronization, or even lifetimes more complex than being returned from the creation scope is ever going to be a recipe for long-term success.
This is all to say that I think rust's safety strengths are greatest in areas of a program that should actually be as small as possible and changed rarely.
This is partially a thing that happened organically. People who don’t come from a low-level background hear “compiler checks your work!” and think “I need a lot of help, that sounds great!”. People who come from a C or C++ background hear it and think “great, something that will get in my way”, and also sometimes think that they know better when they actually don’t. It used to be pretty common for people to jump on IRC and say “ugh I know I’m right how can I get the borrow checker to shut up” and then have something they were missing pointed out to them. The compiler was right, in the end. (This is of course not true all the time but it is a lot of the time...)
Honestly, coming to Rust from C/C++ is/was a breath of fresh air. I work mostly in C++ for my day job, and with Rust I spend so much less time thinking about the implementation because the compiler will check for me. With C++ I'm always scratching my head about whether I've written all the constructors properly, whether I'm initializing this variable properly, whether I'm invoking the STL correctly to get a shared_ptr or something. Whether this piece of STL is compatible with some other piece of C++. The list goes on and on. With Rust, the compiler will tell me when I forget things, through the strong type system and borrow checker. It is insane the number of times the code works once it compiles. With Rust, after getting over the hump with lifetimes, I spend so much less time thinking about syntax and language specifics, and so much more time thinking about the business logic of the problem. The language acts like a great belay system to keep me much safer.
I never run into this myself in modern C++ and I think it is about what I don't do. This might be similar on some level to the fact that rust constrains programmers purposely. I think both cases are indicative of better programming through avoiding things that are hard to get right and have little to no upside.
I almost never use shared_ptr, since I know the lifetimes of the memory I use. I know the lifetimes because I keep almost all actual memory management within data structures made of mostly STL data structures. The custom data structures don't need destructors because vector<> will take care of it etc. I never use inheritance and never use polymorphism. The core logic then just uses these compound data structures made from STL containers and the lifetimes are frequently trivial.
> It is interesting that rust goes for a demographic of people coming from scripting languages or functional languages, but it seems like its niche might be much more suited to the lowest level of larger programs or embedded work where people are using straight C.
Yes. Originally Rust seemed to be the answer to the three big questions in C:
"How big is it", "Who owns it", and "Who locks it". Those are the cause of most of the vulnerabilities you see in CERT advisories. Rust finally had a working, efficient solution to all those problems. A huge breakthrough.
Then the type-theory people, the "metaprogramming" people, and the functional people got in there and mucked up a perfectly good imperative language.
They turned writing code into either puzzle-solving or copying forms you don't really understand.
That already happened to C++. The "metaprogramming" people complicated templates to the point that "you are not supposed to understand this" is now normal for some standard templates. Now the C++ crowd is trying to emulate Rust features like move semantics without a borrow checker, which doesn't really work.
Go was a reaction to that. Go is a mediocre language, but it's good enough for most web back-end stuff, which is why Google developed it.
I agree, but they are better than nothing, and since millions of lines of C and C++ code aren't going to be converted into something else, any improvement is welcome.
For example, I was quite surprised that with all the security sales story, Azure Sphere SDK is C only, which basically torpedoes their sales pitch.
At least MSR seems to keep having a go at Checked C.
That would also be a bit confusing, because it can be (incorrectly) interpreted as meaning that all languages with garbage collectors offer all the safety guarantees of Rust.
A trivial counterexample is the protection against data races.
Again, that is provided in higher level languages by abstractions over mutable state, such as STM (example: Clojure's Refs and Atoms [0]). And/or by making immutability the default.
Neither that, memory safety, nor forcing you to deal with optionality is unique to Rust.
The big difference is that Rust manages to do these things while also having similar performance characteristics as C/C++. The trade-off is that the programmer has to explicitly deal with these issues.
I personally haven't yet seen a safety language feature of Rust that is not somehow available in other languages with similar claims. I would appreciate learning otherwise if I'm missing something.
I think the only feature that would push Rust to being at the forefront would be "const generics" [1] or true dependent types down the line.
Does Clojure actually force you to use those abstractions though? Many languages provide concurrency primitives, but Rust will fail to compile if you try to use non-concurrency safe objects from multiple threads.
In short: it is inconvenient to use mutable state and unidiomatic to not use the STM.
The long version is that you could use 'set!' with Vars. But this is very much an avoided practice. Another caveat is that Clojure is hosted (primarily JVM and JS), so any code you call via interop has the characteristics of the host target.
But idiomatic Clojure uses at least Atoms or Refs and even those are discouraged for anything else than storing state, all the algorithmic (transformation, branching and so on) code follows the FP paradigm.
A Rust analogy is the 'unsafe' escape hatch. You can use it but it would be unidiomatic, inconvenient and in most cases unnecessary.
That comment is somewhat disingenuous. It is like saying that Rust is an unsafe language because of 'unsafe', or the mere fact that you can just use stringly typed code for everything.
Both using 'unsafe' and using String instead of an Enum would compile without runtime guarantees. But such code would stick out.
Right. The main advantage of Rust is that its thread safety is opt-out (unsafe) rather than opt-in (remember to use the right data structures / locks).
This is sort of a question, but doesn’t Rust refuse to compile even when you use non-concurrency safe objects in a single thread?
I think that is the case
Rust types have markers (Send, Sync) that tell the compiler whether they can be moved across threads, and whether they can be simultaneously accessed across threads.
These markers are irrelevant in a single-threaded scenario.
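A minimal sketch of what that looks like in practice (my example): Rc's reference count isn't atomic, so Rc<T> is not Send; Arc is the atomically counted equivalent.

    use std::sync::Arc;
    use std::thread;

    fn main() {
        // An std::rc::Rc's count is not atomic, so Rc<T> is not Send:
        //     let rc = std::rc::Rc::new(5);
        //     thread::spawn(move || println!("{}", rc));
        //     ^ error[E0277]: `Rc<i32>` cannot be sent between threads safely
        //
        // Arc is Send + Sync, so this compiles:
        let arc = Arc::new(5);
        let handle = thread::spawn(move || println!("{}", arc));
        handle.join().unwrap();
    }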
It's not purely about those markers. Many people also believe that the only issue with aliasing is when threads come into play, but that's not the case. Rust also prevents mutable aliasing even in non-threaded cases for this reason.
Is there any explanation of this? Specifically, I would like to understand how a single mutable reference is required for correctness in a single-threaded scenario.
I’ve googled everywhere but I can’t find a single source of information that completely explains this, possibly showing realistic source code and elaborating the way it could generate memory errors or be miscompiled in case of multiple mutable references.
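For illustration, one classic single-threaded sketch (my own contrived example, not from any one source): replacing an enum variant through one path while a reference still points into the old variant's payload.

    #[allow(dead_code)]
    enum Data {
        Text(String),
        Number(i64),
    }

    fn main() {
        let mut d = Data::Text(String::from("hello"));
        if let Data::Text(s) = &d {
            // With `s` alive, re-assigning `d` is rejected:
            //     d = Data::Number(42);
            //     ^ error[E0506]: cannot assign to `d` because it is borrowed
            // If it were allowed, the String that `s` points into would be
            // freed, and the println below would read freed memory.
            println!("{}", s);
        }
        d = Data::Number(42); // fine here: the borrow has ended
        let _ = &d;
    }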
Agreed, the novelty is doing it in the core language (unsafety is opt-in, not the other way around) and without sacrificing performance.
Any language that relies on GC and avoids mutation, e.g. via extremely inefficient data structures (persistent data structures and other immutable data), obviously achieves mostly the same level of safety, but at a huge cost.
> Agreed, the novelty is doing it in the core language (unsafety is opt-in, not the other way around) and without sacrificing performance.
Unsafe code blocks as opt-in in systems languages appeared for the first time in 1961, and have been followed by multiple variations since then; they are far from being a novelty by now.
No, Rust is not “just as unsafe”: it keeps all its unsafety inside specific blocks, where the burden of avoiding (or handling bad behavior from) e.g. races, type unsafety or memory mismanagement is placed on the developer. In my C# programs I have to handle the risk of races in my head, across the entire codebase.
One might argue that “well those small parts of a rust program are as unsafe as the entire C# program” but that would be a completely nonsensical argument so I’m going to give you the benefit of the doubt that’s not what you meant.
Only because apparently you don't make proper use of Dataflow and TPL libraries.
Besides, the recent security exploits have proven that it is time to go back into multi-processes, and here Rust doesn't have anything to offer.
So one should tune a bit down the tone of labelling all the other languages as unsafe, even managed ones, industry standards like SPARK, or system research languages like ATS.
What would you use where correctness is the top priority? Haskell? For me, the best thing about Rust isn't the performance (although that's great). It's just how reliable my Rust projects have been in production. If you wanted this level of reliability from another language, you'd have to write tests to cover all the possible failure cases, which is much more time consuming than having the compiler point out all the possible failure points for you.
If correctness is the top priority, then formal verification methods and/or SPARK and/or specialized subsets of C with the tooling that comes with them for safety projects.
Rust and Haskell and standard C and C++ are all non-starters.
If by "correctness" you mean what I call "small-c correctness" ("this reference won't be null", "this function call doesn't write to the database") then Haskell is currently best-in-class. If by correctness you mean what I call "big-V Verification" then the sibling answer[1] is better for you.
One example that comes to mind is type state [0]. I'm not aware of any other language (including those with support for dependent types, like Agda) where these properties can be checked without external static analysis tools and the problems they bring.
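A minimal, hypothetical sketch of the pattern (names made up): states are zero-sized marker types, and each transition consumes `self`, so stale handles can't be reused.

    use std::marker::PhantomData;

    // Marker types for the states; they exist only at the type level.
    struct Open;
    struct Closed;

    struct Door<State> {
        _state: PhantomData<State>,
    }

    impl Door<Closed> {
        fn new() -> Self {
            Door { _state: PhantomData }
        }
        // Consuming `self` means the old Door<Closed> can't be reused.
        fn open(self) -> Door<Open> {
            Door { _state: PhantomData }
        }
    }

    impl Door<Open> {
        fn close(self) -> Door<Closed> {
            Door { _state: PhantomData }
        }
    }

    fn main() {
        let door = Door::<Closed>::new().open();
        // door.open(); // error[E0599]: no method `open` found for `Door<Open>`
        let _closed = door.close();
    }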
False, because Rust does not offer protection against race conditions.
Do not confuse data races (a subset of race conditions that many languages solve) with race conditions (a way harder problem that there is no general solution to).
Yes, as I do more rust, I love the fact I can safely parallelise more than I love the memory protection.
Before Rust, I was close to saying that thread-based parallelization was almost impossible to do safely, and was moving to only doing process-level parallelization.
This was actually the reason why Graydon Hoare created Rust BTW. He wasn't looking for a way to get rid of garbage collection or an alternative to C++ (those two characteristics came later from Mozilla, the first versions of Rust had a GC and a heavy runtime with green threads and all), but he wanted a language that helped using all cores on the machine without the traditional issues of shared-memory multithreading.
Yeah, I guess they mean differentiators against similar languages (mostly C/C++). There are few languages with "no runtime" that provide memory safety.
As someone who hasn't done much c++ in over a decade, but who loves rust, I often find the extreme focus on comparing to c++ to be a bit jarring and unhelpful.
As someone who's worked more with JS/ts/java, I find the borrowing system's guarantees far more interesting wrt concurrency. Yet that's often little more than a footnote in long articles about how rust is safer than c++.
Which guarantees? Rust guarantees there are no data races (like other higher level languages). It does not guarantee anything regarding general race conditions (no language can, at least in a non-trivial way).
Rust gets compared with C++ because that is the target audience. If you don’t need the properties of a systems programming language, you are better off avoiding both C++ and Rust for increased productivity.
It prevents you from sharing non-threadsafe structures between threads. A language like Java provides a concurrent hashmap, but nothing stops you from (incorrectly) using the standard hashmap. Rust will fail to compile if you share a std::HashMap across threads without wrapping it in a mutex or similar synchronisation primitive.
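Roughly like this (a toy illustration I wrote, not from the thread): the unsynchronised version is rejected at compile time, while the Mutex version compiles.

    use std::collections::HashMap;
    use std::sync::{Arc, Mutex};
    use std::thread;

    fn main() {
        // Mutating a plain HashMap from several spawned threads won't
        // compile (the closures would each need a long-lived mutable
        // borrow of it). Wrapping it makes the synchronisation explicit:
        let map = Arc::new(Mutex::new(HashMap::new()));

        let handles: Vec<_> = (0..4)
            .map(|i| {
                let map = Arc::clone(&map);
                thread::spawn(move || {
                    map.lock().unwrap().insert(i, i * i);
                })
            })
            .collect();

        for handle in handles {
            handle.join().unwrap();
        }
        println!("{:?}", map.lock().unwrap());
    }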
One thing that still confuses me about Rust, and something I'm not able to brush aside, is that as far as I can tell I can't ever be certain when Rust will decide that it needs to copy the underlying data or merely pass a pointer. I don't know the language beyond reading a few chapters of the Rust book, but this is something I'm stuck on, and I won't invest time in learning a language that takes away my control over when I pass data by reference versus by value. The cost of the latter is simply too great in certain scenarios.
Can someone share a good, authoritative source that explains the variable passing semantics of Rust in terms that a C/C++/Java programmer can understand?
On one level, Rust has no defined ABI, so the answer to this question is "no way to tell."
On another level, Rust doesn't have pass by reference. Like C, everything is pass by value, and pointers are themselves values, sometimes called "pass by value by reference." So semantically, if you pass a value, a value will be passed, and if you pass a reference, a reference will be passed. You have control here.
On a third level, like C and C++, Rust may optimize this in whatever way it wants. Your function may get inlined, and there's no passing whatsoever. Large values may be allocated on the parent's stack frame and a pointer may be passed instead.
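Spelled out in a trivial sketch (my example), the semantics are visible at the call site:

    fn by_value(v: Vec<u8>) -> usize {
        v.len() // takes ownership; the Vec is moved in and dropped here
    }

    fn by_ref(v: &Vec<u8>) -> usize {
        v.len() // borrows; semantically only a reference is passed
    }

    fn main() {
        let v = vec![1, 2, 3];
        by_ref(&v);  // pass a reference: `v` is still usable afterwards
        by_value(v); // move the value: `v` is unusable from here on
        // println!("{:?}", v); // error[E0382]: borrow of moved value: `v`
    }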
I don't understand where your confusion is coming from. If an argument is a borrow, it is passed by reference. If its ownership is passed, the type is passed by value, and a copy might happen if needed, or the same portion of memory will be used if the compiler can avoid the copy.
You seem to think you have less control over your allocations than in other languages, but this is not accurate, particularly contrasted to GCd languages.
The idea is to detect large moves and do a pass by reference, much like Swift does. On my personal roadmap I have detecting these cases and linting against them, suggesting passing the argument as & or &mut as appropriate. That way the performance implications and tradeoffs are visible in the code you write/read.
Rust doesn't provide memory safety in the GC sense — it can and will leak memory in ways Java wouldn't. Rather, it provides safety from all data races, which is something that conventional GCs don't give you.
Java doesn’t prevent memory leaks either. Java will happily allocate up until an OOM. If an object is live, it will not be garbage collected.
In fact Java will leak memory over FFI boundaries as well. Rust is memory safe, but like many languages it is not leak safe. It is possible in Rust to tell it to leak memory, but that’s for special cases.
To state this another way, memory leaks are different from memory safety.
"Memory Safety" doesn't refer to leakage, it refers to read-after-free, buffer overflows, and other memory-access related bugs. Garbage collection runtimes keep track of memory access to avoid these issues, rust does it through lifetimes/ownership.
How does Java prevent memory leaks that Rust couldn't? Do you have examples of that? I was not aware that Java does this and am very interested.
One of the advantages of a tracing garbage collector is that it can recognize cycles. This means that if you have a few objects which point to each other, but the overall object graph is dead, a tracing GC should be able to collect these objects.
If you use reference counting in Rust, it will not be able to detect cycles. That said, it's not super easy to get a cycle accidentally.
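For the curious, a deliberately contrived sketch of such a cycle (my example; `Rc::downgrade`/`Weak` is the usual way out):

    use std::cell::RefCell;
    use std::rc::Rc;

    struct Node {
        next: RefCell<Option<Rc<Node>>>,
    }

    fn main() {
        let a = Rc::new(Node { next: RefCell::new(None) });
        let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
        // Close the cycle: a -> b -> a. Each Rc keeps the other's count
        // above zero, so neither node is ever freed. Safe, but leaked;
        // a tracing GC would reclaim this dead object graph.
        *a.next.borrow_mut() = Some(Rc::clone(&b));
    }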
True, but it's safe with memory in ways that non-GC'd languages usually aren't (buffer overflows, use after free, etc.) so it's worth mentioning the GC bit.
I am a long time C++ user and have looked into Rust a bit but not used it in anger.
One thing not mentioned in the section about ownership move: unlike C++ where you can write arbitrary code in the move constructor, in Rust the footprint of the object is copied bitwise and there is no scope to customise this at all. If you have a class where this doesn't work (e.g. because it contains pointers into its own footprint) then you either need to refactor your class for Rust (e.g. change those pointers to offsets) or disable the move trait entirely and provide a utility function for move-like creation.
This is a trade off: as a class writer you get less flexibility, but as a class user you get much more predictable behaviour. And it's only possible because of the way that Rust just forgets about the original object (i.e. won't call its destructor at the point it would have done otherwise) at the language level. If it didn't, as in C++, you need some way to stop the original object freeing resources needed by the new object.
IMHO, move semantics and rvalue references in C++ are amongst the most confusing and worst designed parts of the language, so this is one of the most important benefits of Rust even before you get to the reference lifetime stuff.
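A toy example of the "just forgets about the original object" behaviour (mine, not from the parent):

    struct Resource(String);

    impl Drop for Resource {
        fn drop(&mut self) {
            println!("dropping {}", self.0);
        }
    }

    fn main() {
        let a = Resource(String::from("a"));
        let b = a; // bitwise move; `a` is forgotten, no destructor runs for it
        // println!("{}", a.0); // error[E0382]: borrow of moved value: `a`
        println!("still own {}", b.0);
    } // prints "dropping a" exactly once, here, for `b`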
Yeah the lack of self-referential structs can be very annoying at times and there’s no great solution to it at this point (disclaimer: haven’t tried rental properly yet as it looked complex at first glance).
On the flip side I am very happy to trade that papercut for the simplicity offered by Rust’s move semantics compared to the minefield of c++.
> disable the move trait entirely and provide a utility function for move-like creation.
There's no way to disable moving in general. But there is the Pin type, which prevents you from moving certain types in certain cases, mostly to do with async/await.
The Pin type is a mess IMHO. It's the most C++-like corner of Rust. Its guarantees are shallow, so non-trivial uses are still unsafe. It ended up having complex interactions with other language features, creating a soundness hole. I'd rather bury it as a necessary evil that was required to ship async/await syntax, and nothing more.
I don't want to be defensive but this is completely off base, and very disheartening to see you post. Allowing users to write safe self-referential code was never a design goal of Pin - only to make it safe to manipulate self-referential objects generated by the compiler, which it has succeeded at. The incredibly overblown soundness hole had far more to do with the quirks of #[fundamental] than Pin (and the soundness hole is being resolved without changing the Pin APIs at all or breaking any user code).
The guarantees of Pin are now even being used beyond self-referential types, for instance for intrusive data structures in tokio's concurrency primitives. Pin is not for everyday users, but it is one of the greatest success stories of Rust's unsafe system. That you would call it a mess to a forum of non-users is quite dispiriting.
Sorry, I should have phrased that in a kinder way.
The fact that Pin is expected to enable more than it does is part of the problem. It's not for usual self-referential structs. It's not for usual unmovable types. It has 2000 words of documentation, and I'm still not sure what to do with it. I need to understand pin/unpin, structural/non-structural projections, and drop guarantees. It has a lot of "you can do this, but then you can't do that" rules that the compiler can't help with.
For me most Rust features are either easy to understand (like UnsafeCell or MaybeUninit), or I can rely on the compiler to make me use them correctly (like borrowing or generics). But with Pin my impression is that it pushes the limits of what can be done without the compiler's safety net.
It's certainly tricky to think about, when I start reading all the docs. (Someone just recently explained what "Pin projection" is to me, and I realized I hadn't really understood Pin before that.) But at the same time, it seems to mostly accomplish its goal of "you don't have to think about this in safe code." Is there a better way we could've solved the same problems?
I kind of feel the same way about Send and Sync. Like, why are there two thread safety traits? What could it possibly mean to be Sync but not Send? But at the end of the day, that's what's necessary to model the problem, and I've gotten used to it.
Really? I have almost the exact opposite opinion of it. If anything Pin validated the construction of Rust the language and it's approach to structuring types in the standard library. We got to express something at the type level without implementing it as some kind of custom language feature. It's complicated, but it's describing a complicated problem so the complexity is inherent. And luckily, most users will never have to interact with Pin anyway.
Thanks for the clarification! I suppose there are two interpretations of "move" in general: (1) the object is now owned by a different variable, potentially with a different lifetime; (2) the footprint of the object is now in a different location in memory. It seems that std::pin refers to the second one:
> A Pin<P> ensures that the pointee of any pointer type P has a stable location in memory, meaning it cannot be moved elsewhere and its memory cannot be deallocated until it gets dropped. We say that the pointee is "pinned".
That is actually the definition of "move" that I had meant, after all that is what is bad for an object that points into its own footprint. But I realise that "move" in Rust normally means the first one (but note that even Rust's own documentation uses "move" the other way in that snippet above!).
Your post suggests a bit of confusion about the meaning of "lifetime" in Rust - a confusion which is common and why we are somewhat unhappy with how that terminology has played out.
Variables don't have a "lifetime," references do. When we talk about lifetimes we talk about the lifetime of references to variables, during which time the variable cannot be moved, dropped, etc. The "lifetime of the variable must be greater than the lifetime of the reference" is a common mental model but this lifetime of the variable doesn't really come into play in reality.
Your 1) and 2) always coincide in the abstract model, though the compiler may optimize out memcpys that have no impact. When you move a pointer around, you don't move the object it points to.
This is sort of the problem: its very natural to refer to that as a lifetime! In fact, more natural than the lifetime of a reference, because we intuitively think of variables as being "alive" and references as being short-term "views" of the variables.
As in the sibling comment, "scope" could be used, but indeed maybe we should have just called lifetimes "scopes" or something (though they are not lexical, whereas scopes are usually thought of as lexical pyramids).
I probably would just say lifetime usually, but I would be being imprecise and potentially unclear!
This touches on some interesting history: lifetimes were originally (in other PL literature) called "regions" (MS Research's Verona also uses this terminology). Lifetime was chosen because an analogy to time seemed more intuitive for this concept than an analogy to space - space analogies being usually used for sections of program memory, rather than periods of program execution.
> If you have a class where this doesn't work (e.g. because it contains pointers into its own footprint) then you either need to refactor your class for Rust (e.g. change those pointers to offsets) or disable the move trait entirely and provide a utility function for move-like creation.
There's an unofficial 'transfer' crate that does provide this behavior if needed, fwiw.
This can happen if you have one class that contains a pointer or reference to another class, and then you create a helper class that instantiates both of them as direct member variables.
Edit: Another example: Imagine you have a matrix class that contains a bit of memory as a char[] member to avoid heap allocation for small matrices. Then you maybe you want your data pointer to point directly at that data (rather than looking at a separate flag, which might cause an extra branch or just be more code complexity). I'm not saying it's a good idea, just it's obvious that it might come up.
I read somewhere that immutable and mutable borrows are perhaps better understood as shared and exclusive borrows.
This means, if you have a &mut then you have an exclusive borrow. No other borrows are possible as long as your &mut lives (i.e. has not yet gone out of scope). Usually this exclusive borrow is used for mutating values.
This dichotomy in terminology shows up when you learn about interior mutability. Here you can mutate values without having an exclusive borrow; an example is RefCell. The price you pay for this convenience is run-time checking of access: RefCell disallows mutation if some other location in your code already holds a borrow.
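A small sketch of the run-time check (my example):

    use std::cell::RefCell;

    fn main() {
        let cell = RefCell::new(5);

        {
            let shared = cell.borrow(); // run-time tracked shared borrow
            // cell.borrow_mut();       // would panic: already borrowed
            println!("{}", *shared);
        } // `shared` dropped: the borrow is released

        *cell.borrow_mut() += 1;        // an exclusive borrow is fine now
        println!("{}", cell.borrow());
    }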
> I read somewhere that immutable and mutable borrows are perhaps better understood as shared and exclusive borrows.
Yes, when it comes to exterior mutability, it's basically single mutable xor multiple immutable.
> The price you pay for this convenience is run-time checking of access.
The nice thing is that RefCell is not magic, it's Rust all the way down. E.g. the status of the borrow is updated by the destructors (Drop) of the reference types. All administration is done using a signed integer to do reference counting. The value 0 means 'no borrows', any positive number indicates the number of immutable borrows, -1 means one mutable borrow.
It's well worth reading the implementation of RefCell some time!
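As a heavily simplified sketch of the counting scheme described above (emphatically not the real std implementation; the Drop-guard bookkeeping that releases borrows is omitted):

    use std::cell::{Cell, UnsafeCell};

    struct ToyRefCell<T> {
        // 0 = no borrows, n > 0 = n shared borrows, -1 = one exclusive borrow
        flag: Cell<isize>,
        value: UnsafeCell<T>,
    }

    impl<T> ToyRefCell<T> {
        fn try_borrow(&self) -> Option<&T> {
            let f = self.flag.get();
            if f >= 0 {
                self.flag.set(f + 1);
                // The real RefCell returns a Ref<T> guard whose Drop
                // decrements the flag again; omitted here.
                Some(unsafe { &*self.value.get() })
            } else {
                None
            }
        }

        fn try_borrow_mut(&self) -> Option<&mut T> {
            if self.flag.get() == 0 {
                self.flag.set(-1);
                Some(unsafe { &mut *self.value.get() })
            } else {
                None
            }
        }
    }

    fn main() {
        let c = ToyRefCell { flag: Cell::new(0), value: UnsafeCell::new(7) };
        assert!(c.try_borrow().is_some());     // shared borrow ok
        assert!(c.try_borrow_mut().is_none()); // exclusive denied while shared
    }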
I'd like to point out that `RefCell` does contain a bit of magic, since it is based on `UnsafeCell`, which is _the_ core "primitive" of Rust that enables interior mutability:
So basically if Rust's stdlib didn't provide it, you could reimplement it yourself from scratch.
EDIT: actually, reading the comments, it looks like I'm wrong about that:
> If you have a reference `&SomeStruct`, then normally in Rust all fields of `SomeStruct` are immutable. The compiler makes optimizations based on the knowledge that `&T` is not mutably aliased or mutated, and that `&mut T` is unique. `UnsafeCell<T>` is the only core language feature to work around the restriction that `&T` may not be mutated.
This annotation lets the compiler treat these items in special ways, but it is usually either to provide an easier way to refer to them (I want to desugar this type to a cell of what I already have) or for diagnostics. I think Box is one of the few "magical" things in the language.
Sorry, I should have been clearer about that. You are completely right that at some point things have to rely on some primitives that are 'magic' (in this case UnsafeCell). But I meant that RefCell itself uses standard constructs to guarantee the 'single mutable xor multiple immutable' invariant.
This line in the Rust book is what made it click for me: "Rust can figure out definitively whether the method is reading (&self), mutating (&mut self), or consuming (self)."
This was from the methods chapter but I think it applies globally. You either borrow to read, borrow to mutate, or consume (and possibly return a different value back)
I'm sure there will be more nuance but I think this is a good conceptual foundation to begin with.
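In method-signature form, that's (toy example):

    struct Counter {
        n: u32,
    }

    impl Counter {
        fn peek(&self) -> u32 { self.n }      // borrow to read
        fn bump(&mut self) { self.n += 1; }   // borrow to mutate
        fn into_inner(self) -> u32 { self.n } // consume
    }

    fn main() {
        let mut c = Counter { n: 0 };
        c.bump();
        println!("{}", c.peek());
        let n = c.into_inner(); // `c` is consumed and can't be used again
        println!("{}", n);
    }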
In Rust, this is how “ownership” works. Once a variable is moved to a new scope, ownership moves with it, which means that it cannot be used in the current scope anymore.
If I consume your apple then you no longer have your apple and the apple will be gone once I finish with it. But because I eat apples whole, I can spit it out before I swallow and return it to you.
The analogy may not be perfect but the point is complete control of the object has been transferred to the function and so you no longer have access to it and it will be destroyed once the function ends, unless it's returned.
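The apple analogy in code (toy sketch):

    fn eat(apple: String) -> String {
        // take complete ownership... then spit it back out
        apple
    }

    fn main() {
        let apple = String::from("apple");
        let apple = eat(apple); // moved in, handed back via the return value
        println!("{} returned", apple);
    }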
I always liked to think of it in terms of car ownership. You can immutably borrow my car (&car), but if you modify it, you're a jerk (you obviously can't modify it in Rust). Or if you're a mechanic, you can mutably borrow it (&mut car) and upgrade it or something. Or, if I sell my car to you, you've taken ownership of it.
Rust's terminology is kind of reversed: It is stated that a mutable reference cannot be shared, while in fact a reference is called "mutable" when it cannot be shared.
The terminology is understandable and great to get started, but once you dive deeper into atomics and interior mutability, it's rather crucial to understand this reversal.
I suppose you can think about it as having a shared borrow to the ref cell, but using it to gain an exclusive borrow to the data underneath, to fit the intuition.
What I like most about Rust is that I can pass references around without worrying whether the value they reference will continue to exist. This has been a constant mental burden for me in C++ and has led to some unnecessary defensive copying.
I'm not so sure whether Rust's strong immutability/exclusivity guarantees are worth the trouble though. Unexpected mutation hasn't been a major source of bugs or mental burden for me in other languages, at least not in a single threaded context.
> I'm not so sure whether Rust's strong immutability/exclusivity guarantees are worth the trouble though. Unexpected mutation hasn't been a major source of bugs or mental burden for me in other languages, at least not in a single threaded context.
For the multi threaded context it's amazing. Compile-time data-race checking is such a unique and useful feature. That and the great ergonomics of the locking primitives makes multi-threaded programming in rust really enjoyable.
I think this is a really underrated benefit of the ownership model. You cannot forget to acquire a lock, and you cannot keep a reference to a lock's contents past the point where you've released it.
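For example (a sketch of mine): the data lives inside the Mutex, so the guard is the only path to it.

    use std::sync::Mutex;

    fn main() {
        let data = Mutex::new(vec![1, 2, 3]);

        {
            let mut guard = data.lock().unwrap(); // the only way in
            guard.push(4);
        } // guard dropped here: lock released, no reference can outlive it

        // Letting a reference escape the guard is a compile error:
        //     let r = { let g = data.lock().unwrap(); &g[0] };
        //     ^ error[E0597]: `g` does not live long enough
        println!("{:?}", data.lock().unwrap());
    }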
> Unexpected mutation hasn't been a major source of bugs or mental burden for me in other languages, at least not in a single threaded context.
Oh, it has for me. In fact, unexpected mutation has been one of the most difficult bugs I've encountered.
I had a method, foo, which, when given a NavigableMap (Java), would throw an exception. However, if you called the same method with the same map, no exception would be thrown (because the map had been mutated deep down).
Turns out, the culprit was in the 84th circle of hell, after a sub-sub-sub map view was created. A cute little line said something like 'if key doesn't exist, associate key with 0'.
Because NavigableMap views are mutable in Java, this eventually made its way all the way back to the parent map, which, consequently, caused the map to go down a different path when the same method was re-invoked.
It took a very long time to ultimately find that mutation due to the complexity of the code.
That's the beauty of Rust: it assumes single-threaded programs don't exist.
All types — including those in 3rd party libraries — are forced either to be thread-safe or to fail to compile if they'd cross thread boundaries. Nobody has the "but I don't support multi threaded programs" excuse, so you may as well throw multithreading at everything.
Aren't the strong immutability/exclusivity guarantees specifically the reason you can pass around references without worrying if the referenced value will continue to exist?
These things are all related, but in my mind this part is more focused on the reference lifetime rules than the reference aliasing rules. As in, even if there's only one reference in the world, we still can't drop() the thing it refers to while the reference exists.
Without the exclusivity, it would be easy for a reference to be invalidated; the classic example is
    let mut v: Vec<T> = ...;
    let r: &T = &v[0];
    v.push(...); // A
    foo(r); // B
The reference passed at B might be invalidated by A, if the vector is reallocated. It’s the shared vs exclusive borrowing that defends against this, by flagging the above program as an error.
The code posted by dbaupp shows how violating Rust's immutability/exclusivity guarantee can also violate the lifetime guarantee. The implication being that the lifetime guarantee depends on the immutability/exclusivity guarantee (as supermatt claimed earlier).
It's a good example, but I'm still not entirely convinced that the implication is necessarily correct.
I wonder if a sufficiently smart compiler couldn't detect this particular special case and relax immutability/exclusivity in other cases where no references to the internal structure of a value exist, or if it could even keep whatever r is referencing alive to uphold the lifetime guarantee.
I realise the latter could have some undesirable side-effects wrt memory usage and it raises ownership questions if there are multiple such references. It's essentially what persistent data structures in functional languages do and it may not be a good fit for Rust.
The reason why I'm even interested in this is that Rust can sometimes feel restrictive in situations where having both mutable and immutable references or more than one mutable reference to the same thing is reasonably safe, such as in the local scope. Obviously people are working around mutability restrictions by using indexes/handles instead of references, but that doesn't seem any safer.
1. In order for a sufficiently smart compiler to understand whether this is okay, it would have to know how the Vec<T> type actually works. But Vec<T> is a library type, and it uses unsafe code, which is kind of definitionally code that the compiler cannot understand. So even for this specific case, it is too hard to do today. A human cannot even tell if this is okay or not right now, because you'd need to see the code that's in the `...`.
2. Even if it could, it's unclear if it's a good idea. The more complex analysis a compiler can do, the harder it is to explain what it's doing to end-users. Imagine that, for example, we wrote this code, with the `...` being a case where it's safe, but then we modified it to a case where `...` was not safe anymore. What would that diagnostic look like? How could it be communicated to people in a useful way?
3. There's a tension here. If you make it work for very specific reasons, any small changes would likely cause it to break, whereas it breaking up front may lead you toward a design that is overall more robust.
> It's essentially what persistent data structures in functional languages do and it may not be a good fit for Rust.
> Rust can sometimes feel restrictive in situations where having both mutable and immutable references or more than one mutable reference to the same thing is reasonably safe, such as in the local scope.
You're making a lot of good points. I see that what I had in mind is probably not worth pursuing. At least the current rules are consistent and not too hard to understand.
And this is another point in favor of compacting GCs: non-GC languages move things around almost as much (growable arrays are ubiquitous, and hashmaps may be based on them too, etc.). Without a GC, the simple task of moving references becomes dangerous and requires the pain demonstrated to us by Rust (or leads to the pervasively crashy software culture demonstrated to us by C++). With a GC, this type of thing is not something the developer needs to care about.
This still happens in GC'd languages, for example when iterating over a collection while modifying it. In Java you may get a ConcurrentModificationException. In Ruby and JS you get strange behavior. In Rust you get a compile error.
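A minimal sketch of the Rust case:

    fn main() {
        let mut v = vec![1, 2, 3];
        for x in &v {       // shared borrow of `v` lasts the whole loop
            // v.push(*x);  // error[E0502]: cannot borrow `v` as mutable
            //              // because it is also borrowed as immutable
            println!("{}", x);
        }
        v.push(4);          // fine: the loop's borrow has ended
        println!("{:?}", v);
    }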
GCs don't magically allow you to not move things around. You still need to grow arrays and hashmaps. GCs also don't magically provide data race guarantees, which is where a good portion of the complexity of the borrow checker comes into play.
My point was the opposite: if data is to be copied about anyway, why not use a GC, which makes this copying safe (i.e. it handles reference updates) and extracts additional profit (compaction) while it's at it? As opposed to manual languages, where objects are expected to be pinned (and memory-fragmenting) but often aren't, leading to headaches or runtime errors.
Do any languages somehow use their GC to update references when a growable array is reallocated? I'm not aware of any language that does.
Furthermore, it's hard to see how it could. In the case of a simple copying/compacting collector, the whole world is stopped and the pointer graph traced. But you obviously can't do that every time a growable array needs to be reallocated. I understand that there are more complex solutions that avoid that pause, but still none I'm aware of that would be exploitable for reallocating an array.
The op is talking about compacting GCs and Go's isn't a compacting one.
Go's GC is a variation of a regular mark-and-sweep GC: the GC explores the heap to find all reachable memory and then clears the unreachable memory, whereas compacting GCs move the reachable memory from one place (the “from space”) to another (the “to space”), which becomes the new heap. These two garbage collector families are completely different.
And I was talking about GCs overall. The point was that Go's GC is pretty much the best GC out there used in major projects, designed to try to compete with systems programming languages and come as closely as possible to it. A lot of money has been put to make it as fast and deterministic as possible. It does not mean it is the latest in research, of course.
This is a real misunderstanding. Go's GC is not “pretty much the best GC out there” by most GC standards (it's already quite good and keeps getting better, though). And it hasn't been “designed to try to compete with systems programming languages”.
Go as a language was designed like that, and special attention was paid to allowing as many value types as possible, so you can allocate as much as possible on the stack, thus reducing GC pressure. For the first five years of Go or so, Go's GC was actually pretty bad (it was probably the most basic GC you could find in any somewhat popular language), but it wasn't too much of a deal because you can avoid it most of the time when it gets in your way (much more easily than in Java, for instance).
After some time, they decided to enhance it, but they were on a budget (not "lots of money spent on it", actually), so because in Go you can avoid allocating memory on the heap, they decided to focus on GC latency instead of throughput (if the GC's throughput isn't good enough for you, you'd better reduce your allocations).
Overall Go is a pretty fast language, and it's an engineering success, but that's in spite of its GC and thanks to other parts of the language's design, not because Go's GC is exceptionally good (it's not, and if you read my link you'd understand how).
Thanks for the answer. I guess I have been misled, since Go proponents (including on HN) always argue that the latest iterations of their GC (which were discussed a few times here) have some of the lowest latencies of any production language, and that this is what made it suitable for many tasks.
Since I don’t have a sense on the timescales, I will take your word for it that it was the reverse.
Oh please. I've used Windows since 2000, and have seen lots of commercial C++ software (games, editors, Windows itself) give me that characteristic sound and dialog box. I stopped using Plasma desktop and KDE because of the crashes. "Error. Memory could not be written", segmentation fault, DLL error - I naively thought these were an inevitable consequence of using a computer, until I realized that the problem was that all this software was written in C++.
2. Windows was (is?) not coded in modern C++ but in a bastard C or very old C++, from what I understand. Bugs were also mainly caused by third-party drivers and software. The industry for home user software had very bad quality, that is true, but that had nothing to do with C++. C, Fortran, Delphi and others were used too.
3. Videogames are in an industry where deadlines are more important than bugs and companies really don’t care about fixing stuff before release.
Do not mix up vulnerabilities (security) with normal usage (generally buggy SW), which is what I talked about. I explicitly called out vulnerabilities as the actual problem that calls for safe languages.
For instance, you can have a library that works perfectly for any non-malicious/valid PNG image, yet have vulnerabilities for invalid, crafted files.
In fact, this is almost always the case for any product. Only really bad quality software has issues with the happy expected path. But many more will have vulnerabilities in the unexpected paths.
Why the "commercial"? Are other software types not worthy to be considered in this? Are open source non-commercial projects inherently worse code and more crashy? Should we not look at all kinds of software, when talking about programming languages in general?
It has been my experience that even well-known software written in C (not sure about C++) is indeed kind of crashy. Example: OBS Studio (Open Broadcaster Software), used by many people to stream on various platforms. It is possible to go into the settings there and change them so that OBS crashes and cannot be restarted, unless you guess that the configuration is faulty and that OBS is unable to deal with a faulty config. You can only get it to run again by removing the config or reinstalling. Great quality. VLC is another example of such crashy software. Not only has it crashed many times in the past, but it also cannot properly deal with unplayable files in a looping playlist and will simply bring all your cores to 100% usage.
So far some anecdotes from the well-known C software world. I am not sure whether the people in the respective communities consider this software to be good quality.
Apart from maybe seL4 (which I only know by reputation, and was translated from a formally verified spec) I can't name any software that I would call reliable. If we ever want computers to actually work, the whole stack needs to be flensed down to the metal and replaced, and I hope Rust is the language to do it with.
I am just getting into rust myself - following from that 30 min article a few days ago.
From my understanding, the comments in a few of the examples are misleading. The author is stating things like "// mutable borrow occurred" where there is no mutable borrow occurring. My understanding is that there is an implicit mutable borrow where the methods are being called (e.g. `original_owner.push('.');`), which is what raises the errors.
Can anyone with more experience confirm/deny that this is the case, as I want to be sure I am not misunderstanding this
Many thanks. I think most of the confusion is to do with the comments, some of which the author has resolved so now my comment doesn't make much sense. But in the context of the original statement, thanks for your confirmation :)
Thanks. Just had a quick look through, and it makes more sense now. There's also a few places where a comment says a borrow occurred before the borrow actually occurs (in others the comment appears afterwards, as I would expect) - i.e. the comments are in the wrong place / inconsistently placed.
Obviously it doesn't affect the output of the code, but as a tutorial it's probably better to correct that.
EDIT: further nitpicks: I would advise against calling the borrower the "borrowing owner", and other such terms - "owner" has a very specific meaning when it comes to memory management, etc.
Thanks for the feedback. I just quickly went through the code examples and tried to sanitise them as much as I can now. As soon as I have more free time on my hands, I will revisit and update the naming based on your feedback. Thanks!
That solves ownership in terms of making sure the memory isn't deleted while something is owning it but, at least from my quick read, doesn't solve ownership issues related to concurrent access that Rust's ownership model does solve. Is that reading correct?
You are right. But it's not needed in Lobster, because the threading model is different. It doesn't have concurrent access problems:
> Lobster built-in multi-threading functionality ... is different from multi-threading in most other languages, in that it does not allow threads to share memory or any other VM state.
> Ownership-based memory management is statically elided reference counting.
One difference I'd want to highlight here, which sometimes trips people up: Taking a reference to a Rust object has no effect on where that object's destructor runs. It can make the program fail to compile, if the reference lasts too long. But if the program keeps compiling, then the object's destructor is always running at the same place it was before.
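A quick sketch of what I mean (toy example):

    struct Noisy(&'static str);

    impl Drop for Noisy {
        fn drop(&mut self) {
            println!("dropping {}", self.0);
        }
    }

    fn main() {
        let a = Noisy("a");
        let r = &a;  // borrowing `a` doesn't extend or shorten its life
        println!("using {}", r.0);
    } // `a` is dropped here, exactly where it would be with no borrow at all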
Some commenters mention that Rust has compile-time data-race checking. Is this correct? From what I understand, it will only enforce that a mutex is locked before you can access specific data. However, if there is no mutex, data races won't be detected at compile time?
It is correct. Rust prevents all data races at compile time, unless you write incorrect unsafe code. This is why people try to write as little unsafe as possible. This has nothing to do with Mutexes; mutexes are provided by the standard library and the language knows nothing specific about them.