Hacker News new | past | comments | ask | show | jobs | submit login
Escape analysis hates copy elision (2021) (quuxplusone.github.io)
122 points by pornel on Oct 14, 2023 | hide | past | favorite | 53 comments



One of the escape analysis examples in the article got me thinking. Turns out this is one of those cases where marking a local variable const _does_ result in different code.

See this godbolt example https://godbolt.org/z/nbzd59E33

As I understand it, performing a const_cast itself isn't undefined behavior, which is why the compiler cannot assume function f doesn't modify the local. However, if we mark the local variable as const, the compiler is able to assume no modification will occur (because that would be undefined) and optimize the return.


It's unfortunately easy to defeat this, e.g.

  int y = 42;
  const int x = y;
is enough, even though `x` is still a `const` local variable.

https://godbolt.org/z/7vre49xb1


Fascinating. Does anyone know why x escapes here? Is it an optimization miss? Because to my untrained eye modifying x should still be undefined behavior.

Does the way x is initialized make it not "const-y" enough?


I tried gcc, clang, msvc, and icc. They all generated the load and either an add or an inc. https://godbolt.org/z/han78n9zM If we define `f` to const_cast away the const and add 10: https://godbolt.org/z/1dso8ojWG all the compilers agree that we should get 53. However, make `y` const, so we have

  const int y = 42;
  const int x = y;
and now modifying x suddenly becomes undefined behavior, and x + 1 returns 43 https://godbolt.org/z/cqbGasrMM Similar to how the original assembly showed making `y` const removed the reload and increment/+1 of x.

So, it seems that all these compilers agree that modifying `x` is undefined behavior if `y` is const, but is defined if `y` is mutable.

I am not sure why this is.

I may have misunderstood, but I didn't get the impression this should be the case from reading: https://en.cppreference.com/w/cpp/language/cv

My best guess is that for some reason, `x` is a constant object only if initialized itself via a constant object. But I couldn't find anything saying that is the case. I asked on the #learn channel of cppslack.


I don't understand what is defeated in your example.


I edited to add a godbolt link, but repeating here: https://godbolt.org/z/7vre49xb1

In my example, even though `x` is a `const` local, the assembly still shows that it reloads `x` and adds `1` if we assign to it from a non-const local `y`.


Isn't that expected since 'y' is still non-const though? In your example, I'd expect 'x' to not escape, since a const-cast of it is undefined.


Unfortunately, `x` does escape, merely because we initialized it via assigning to it from a non-const.


Interesting point! That's my understanding as well.


> Suppose S::make and take_ownership were secretly colluding, like this:

     std::unique_ptr<S> *global;
     
     std::unique_ptr<S> S::make() {
         std::unique_ptr<S> origp = std::make_unique<S>();
         global = &origp;
         return origp;
     }
     
     void take_ownership(S *rawp) {
         delete rawp;
         *global = std::make_unique<S>();
     }

>The compiler cannot rule out this possibility

Why can't the compiler rule out that possibility? That looks like undefined behavior code to me, so thus the compiler could rule it out. S::make() is storing a pointer to a local variable (origp), and then that pointer is being accessed after S::make() has returned. I'm pretty sure that's undefined behavior.


The usual justification for having a concept of "undefined behavior" at all is specifically to allow compilers to "rule out this possibility" so they can make this sort of optimization.


At least that's the modern day retro-active justification. The historical origins of undefined behaviour in eg the C standard are murkier.


Related reading: this year the author submitted a GCC bug-report on this topic, linking back to the 2021 blog post. The bug appears to be still unresolved.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109945


Most escape analysis these days is interprocedural, and as long as it can see all the things that happen to the variable, it is fine.

At the very least, most compilers support doing interprocedural escape analysis, whether it's on by default at basic optimization levels or not.

In that sense, this is not a very hard problem - it was solvable in the 70's.

If you give the compiler all the source, or have libraries compiled as fat lto objects, etc, it should work fine.


Interesting. Thin lto wouldn't be enough?


It should be, actually, as long as you give it all the source code or give it object files with the summary data in them.

The summary mod/ref info should be enough to get it right, because it should say the function does not modify the argument, and that should get propagated through.

The tricky case is where it sometimes modifies the argument - the summary data is not flow sensitive, IIRC

Flow sensitive algorithms are still really expensive worst case (N^3 minimum, sometimes exponential), but could be made practical here using BDD's. Most compilers still do not use BDD's to represent this sort of large binary decision data, even though they are very good at it. It's an area where research always felt more advanced than production (IE even if you don't go whole-hog on datalog, the data structures are still very useful for this sort of otherwise-memory/time intensive data)

In any, fat LTO should defintely be enough.


Does anyone know what guarantees Rust functions have regarding escape analysis? In particular, is the `x` reference in `fn foo(x: &T)` guaranteed to not outlive the function, or would it be legal for `foo` to `transmute()` it into a `&'static T` and have to escape into a global?


It is unsound to transmute `&'a T` into `&'static T`, but it is not UB - as long as all of the subsequent uses of the transmuted reference obey the "real" lifetime of the original reference:

    fn example<'a>(r: &'a mut i32) -> &'static mut i32 {
        unsafe { std::mem::transmute(r) }
    }
    
    
    fn main() {
        let mut x: i32 = 5;
        let ptr: &'static mut i32 = example(&mut x);
        *ptr = 6;
        println!("{x}");
    }
(because it's unsound, it's considered wrong to do this - you should not intentionally write functions whose types are lies, and this one definitely lies, so it should be marked `unsafe` - but this is not automatic UB)

https://play.rust-lang.org/?version=stable&mode=debug&editio...

You can run through Miri and confirm there's no UB even though we're modifying `ptr`, whose lifetime has been extended beyond the length of the function.

However, Rust does have extra guarantees here which make this irrelevant to the pessimization problem in the linked article - you cannot ever legally convert a `&T` into a `&mut T` - this is always UB. This means that Rust guarantees that `example` does not modify `x` (unless e.g. it contains an `UnsafeCell`, like a `Mutex`'s contents), and so it does not need to defensively reload its value.

That is to say: Rust, just like C++, makes it legal (but frowned upon) to "leak" a reference beyond the stated lifetime it's provided as. But unlike C++, it is (always!) illegal to "upgrade" a `&T` into a `&mut T`, and thus the fact that it escapes does not hinder other optimizations.


Could you please expand on this last point? Would it not be the same in case of C++’s `const`s?


In C and C++, `const` on pointers/references is basically just a comment to programmers - it is part of the type, but doesn't "mean" anything to the abstract machine; the rules don't treat const / regular references/pointers differently, they just say that the types only let you mutate through a mutable pointer.

Obviously, good code should treat it as more than just a comment - using `const` correctly clarifies intent and makes it possible to stay sane as a C++ developer, but the abstract machine doesn't care.

In C++, you can basically always `const_cast` a `const T&` into a `T&` and then modify it without causing UB. A function that accepts a `const T&` is just pinky promising that it will be polite and probably not do that.

It is only UB if the underlying object is "actually const", and even then, it doesn't cause UB until you actually perform the mutation; creating the mutable reference itself is perfectly fine.

For example, the following is perfectly legal:

    int& upgrade_to_mut(const int& x) {
      return *const_cast<int*>(&x);
    }

    int x = 5;
    const int& x_ref = x;
    int& x_ref_mut = upgrade_to_mut(x_ref);
    x_ref_mut = 6;

it's only invalid if the object that is pointed at is const, as in:

    int& upgrade_to_mut(const int& x) {
      return *const_cast<int*>(&x);
    }

    const int y = 5;
    const int& y_ref = y;

    int& y_ref_mut = upgrade_to_mut(y_ref); // it is actually legal to produce y_ref_mut, but we cannot modify it


    y_ref_mut = 6; // this is UB: cannot modify a const object 'y'
The difference is that in Rust, "mutation capabilities" are part of references, and so you cannot create them out of nowhere, that would be UB. But in C++, mutation capabilities are part of the object being pointed at, so as long as they happen to be there when you perform the mutation (e.g. you're not modifying a string literal or a variable declared `const`) then there's no problem.


It's not entirely true that "const" is "just a comment," depending on the use case. In machines with super-limited RAM you can use const on globals to tell the compiler "put this in .rodata"

In other words, "const" (in a global context) can tell the compiler "you don't have to copy this to RAM, just read it directly from non-volatile storage." Obviously, that would be undesirable on a desktop computer, but if you're dealing with a wee little microcontroller, it's very helpful.


`const`-as-comment is specifically limited to pointers and references - `const` on objects definitely does change semantics (it is always UB to attempt to modify a `const` object).

Another good example is string literals (except when initializing a non-const `char[]` variable), which are often allocated in read-only data in the same way, since they are const objects too.


The comment specifically referred to references and pointers whereas your example does not.


I don't agree it is just a comment to programmer. CppReference says,

> Modifying a const object through a non-const access path and referring to a volatile object through a non-volatile glvalue results in undefined behavior.

and both compiler have tried to take advantage of this in the past.


That is saying exactly the same thing that I said; the qualification of the pointer is irrelevant; you get UB when modifying a const object. In the code snippets above,

   int x; // not a const object, so can be freely modified
   const int y; // is a const object, so cannot be modified
C/C++ pointer provenance doesn't include information about constness, so it doesn't matter "how" you got a mutable pointer to `x`: you're always allowed to modify `x` through that pointer, even if the pointer "came from" a const reference.

The reason for mentioning a "non-const access path" is that the type system forbids you from modifying through a const access path in the first place, so the program would already be rejected if you tried that.

I'm not saying it's a good idea to go around dropping const qualifications; `const_cast` is mostly evil and should be avoided. But at the level of the abstract machine, it's a no-op, even when going from const -> non-const, other than changing the type of the provided pointer.

The benefit of `const` is that if you don't use C-style casts that discard constness, and you don't use `const_cast`, and you don't use `mutable` or other type-unsafe or const-unsafe features, it's not possible to accidentally obtain a non-const pointer to a const object. Thus C++ actually helps you avoid this UB pretty well. But the fact that general conversion from const to non-const is permitted reduces the kinds of optimizations that can be performed.


Thanks! Do you know if the Rust compiler is able to pass this information to LLVM somehow?



It does pass this information to LLVM, in the form of the `readonly` attribute. This seems to be a bug in LLVM that does not optimize the function propely, I don't know why.


Regardless of escape analysis, this problem doesn't exist in Rust, because the language features that lead to this problem don't exist either.

In Rust, moves are built in into the language. Types with destructors can't have implicit copying, and there are no copy constructors. There is no moved-from state of objects, and destructors never run redundantly. Box (Rust's unique_ptr) is statically guaranteed to be always non-null and have a single owner at all times. So the language has no equivalent of omitting std::move, and has no semantically-observable copy to elide.

See https://rust.godbolt.org/z/PnWv5d6n9 and https://rust.godbolt.org/z/Efosx3Wea

Rust does emit memcpy for its moves, and for that it has got some specific LLVM improvements: https://khei4.github.io/gsoc2023/


No, there is no guarantee that the reference will not be valid after the function returns. The only guarantee is that the reference will not be valid after the caller of the function modifies the referent (outside of an UnsafeCell), deallocates it, or creates a mutable reference to it. So there still is some escape analysis possible, in that any escape is scoped to the next modification.

Lifetimes do not affect whether or not something is UB on the language level; they only dictate which operations are safe and which are unsafe, and thus they act as a "social contract" for library-level UB.


Rust borrows must not exceed the lifetime associated with the borrow. By default that indeed means the borrow/reference cannot be moved into a context that outlives the function's execution, so it indeed can't be moved into a static global (unless `unsafe` is used explicitly)


But unsafe could be used inside the (rust equivalent) body of make and take_ownership without the compiler knowing. So would the rust compiler need to take this possibility into account when optimizing or not?


No, that would be undefined behavior, as would other transmutations you could do with unsafe like converting it to a mutable reference.


Transmuting lifetimes is one of the operations explicitly allowed for the std::mem::transmute() function [0]; lifetimes are only there to help safe code, and they have no effect on language-level UB. In fact, the 'static reference will be valid until the caller otherwise invalidates it.

[0] https://doc.rust-lang.org/1.73.0/std/mem/fn.transmute.html#e...


You're allowed to extend lifetimes via transmute, but unsafe code must adhere to the invariants of references as laid out by the Rustonomicon:

1. A reference cannot outlive its referent

2. A mutable reference cannot be aliased

https://doc.rust-lang.org/nomicon/references.html

Therefore, any use of transmute that extends a lifetime is UB if it causes the reference to outlive the referent. And because it's UB, the backend is going to assume the invariant holds universally and will optimize accordingly.

The Nomicon also has a section on generating lifetimes via transmute, which it calls "unbounded lifetimes": https://doc.rust-lang.org/nomicon/unbounded-lifetimes.html


> Therefore, any use of transmute that extends a lifetime is UB if it causes the reference to outlive the referent.

By itself, a transmute of a currently-live reference cannot make it outlive its referent, in the sense used by the Rustonomicon. As it says later [0], a reference is alive until the point when it is last used, and no further.

Thus, with unsafe code, the lifetime of a reference can extend past the point when it stops being alive. As a consequence, extending the lifetime can never cause UB by itself: only using the reference after it has been invalidated can cause language-level UB.

Since language-level UB is based on liveness rather than lifetime, the compiler must assume that a reference passed to a function is valid even after it returns, up until the caller invalidates the reference by performing an operation incompatible with it remaining alive.

[0] https://doc.rust-lang.org/nomicon/lifetimes.html#the-area-co...


> only using the reference after it has been invalidated can cause language-level UB.

When a reference is 'used' is much broader than you might expect; https://github.com/rust-lang/rust/issues/52898 is a SIGILL from merely creating a null reference, never 'using' it.


That's not about 'using' a reference, it's about constructing an invalid value – no different than doing transmute::<u32,char>(0x110000). See https://www.ralfj.de/blog/2020/07/15/unused-data.html for an explanation why it's UB.


That's true, but the claim was "it's ok to extend the lifetime of a reference beyond its referent so long as you don't use it after its referent has ceased being live"; what definition of 'used' the compiler might rely on is relevant.


What is UB in unsafe rust? You would be taking the addresses and then using the address of a variable that is still in scope. This assuming that rust has the same guaranteed elision as C++, extending the lifetime of an automatic object into the caller.


I believe the parent is saying "that would be undefined behavior" in answer to your question "would the rust compiler need to take this possibility into account". The compiler would not take this possibility into account, because the fact that references cannot outlive their referents is an invariant of the language that can only be broken by incorrect unsafe code. (To put it another way, in any context where the compiler would take this into account, it would do so regardless of the existence of the unsafe code, because unsafe code still ultimately has to uphold all the same invariants as safe code.)


Sure, but the reference doesn't outlive the referent in c++: the language guarantees that the local object in make is exactly the same object in the caller's caller.

The question is whether rust gives this object identity guarantee as well and, if it does, whether there are other rules (like uniqueness of mutable refences) that still allows the optimisation.


But there could be an object with the same lifetime as the function passed to both f and h, in the article’s example — could they not collaborate still in safe Rust? Or does the compiler know about that?


I'm afraid I don't know about Rust, but D has the scope parameter storage class for this. It enables the following compiler-enforced guarantee:

> The parameter must not escape the function call (e.g. by being assigned to a global variable). Ignored for any parameter that is not a reference type.

https://dlang.org/spec/function.html#scope-parameters

See also: https://dlang.org/blog/2023/10/02/crafting-self-evident-code... , https://news.ycombinator.com/item?id=37748543



Good call, have updated my links.



You can't do that. Storing the reference as a &'static T will be ok for the lifetime of your function, there is no guarantee that the thing the &'static T points to will be alive after that.

https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaP...


No, the following would be valid (if a little pointless):

fn foo(x: &'a T) -> &'a T { x }


Your example changed the type of the function, though. I'm assuming there is no `&T` in the function's output type.


Oh sorry; I had misunderstood the question.


[Deleted - responded to the wrong comment. I meant to respond to IAmLiteralyAB.]


Sorry, can you elaborate what or who you are replying to?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: