Understanding lvalues and rvalues in C and C++

bjoernbu · on Dec 15, 2011

Really awesome writeup. Easy to understand, very clear and full of valuable information. However, I really wonder about one thing (not just because of this article, but it reminded me):

When exactly are move semantics and rvalue references useful, APART from using functions that return complex types? I am currently working on a codebase and I am really unsure if move semantics are something I really want to use. Sure, I'd love them for new projects, but if there are naming conventions and even conventions of passing result objects by pointer, not by reference (not my favourite rule), I don't think I'd like mixing styles. I think differently about starting to use "auto" and lambdas, but this is not about C++11 in general.

So actually I really wonder if there is a use case for rvalue references other than move constructors and returning by value. Any pointers?

lcapaldo · on Dec 15, 2011

They're used to implement perfect forwarding: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2002/n138...
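For readers who have not seen the idiom, here is a minimal sketch of what perfect forwarding looks like in C++11. The names category, forwarded and not_forwarded are hypothetical and exist only for this illustration; the only library facility used is std::forward from <utility>.

```cpp
#include <string>
#include <utility>

// Hypothetical overload set that reports which kind of argument it received.
inline std::string category(const std::string&) { return "lvalue"; }
inline std::string category(std::string&&)      { return "rvalue"; }

// T&& on a deduced template parameter is a forwarding reference.
// std::forward<T> passes the argument on with the value category the
// caller used, so the matching overload of category() is selected.
template <typename T>
std::string forwarded(T&& arg) {
    return category(std::forward<T>(arg));
}

// The same wrapper without std::forward: a named parameter is always an
// lvalue inside the function, so the rvalue overload is never chosen.
template <typename T>
std::string not_forwarded(T&& arg) {
    return category(arg);
}
```

Calling forwarded with a named std::string should pick the lvalue overload and calling it with a temporary should pick the rvalue overload, while not_forwarded picks the lvalue overload in both cases.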

Someone · on Dec 15, 2011

A good attempt, but I spot an error:

"They’re not lvalues because both are temporary results of expressions, which don’t have an identifiable memory location (i.e. they can just reside in some temporary register for the duration of the computation)."

The problem is that lvalues _can_ "just reside in some temporary register for the duration of the computation". Any decent optimizing compiler will treat simple loop counters that way.

I am not even sure the C++ standard even mentions registers.

qdog · on Dec 15, 2011

Maybe 'register' wasn't abstract enough; he means something like (in pseudo-assembly):

  ldr r1,4     ;Load constant 4 into register 1
  add r2,r2,r1 ;Add contents of r2 and r1, store in r2
  ldr r1,5     ;Load constant 5 into register 1

where the value is only temporarily available for the single calculation. Even without thinking of registers, you can refer to the loop variable, but you can't refer to numbers in your operations.

Maybe he could update this line for clarity, though.

davidtgoldblatt · on Dec 15, 2011

Rvalue references allow you to "steal" the resources from an object when it's safe to do so. One use for this is to optimize functions returning large complicated data structures, but that's just because memory is a particular type of resource.

With rvalue references, you can move around objects that should be moveable but not copyable. Consider an object representing a database connection - copying it isn't meaningful, but moving it ought to be.
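A rough sketch of such a movable-but-not-copyable type might look like the following. The Connection class and its int handle (with -1 meaning "no connection") are assumptions made up for this example, not a real database API; a real class would also release the handle in its destructor and before overwriting it in move assignment.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a database connection.
class Connection {
public:
    explicit Connection(int handle) : handle_(handle) {}

    // Copying a live connection is not meaningful, so forbid it.
    Connection(const Connection&) = delete;
    Connection& operator=(const Connection&) = delete;

    // Moving steals the handle and leaves the source empty.
    Connection(Connection&& other) noexcept : handle_(other.handle_) {
        other.handle_ = -1;
    }
    Connection& operator=(Connection&& other) noexcept {
        if (this != &other) {
            handle_ = other.handle_;
            other.handle_ = -1;
        }
        return *this;
    }

    bool is_open() const { return handle_ != -1; }
    int handle() const { return handle_; }

private:
    int handle_;  // -1 means "moved from" or "never opened"
};

// The type can be moved but not copied.
static_assert(!std::is_copy_constructible<Connection>::value, "not copyable");
static_assert(std::is_move_constructible<Connection>::value, "movable");
```

With this in place, Connection b(std::move(a)); transfers the handle to b and leaves a closed, whereas Connection b(a); does not compile.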

simias · on Dec 15, 2011

It's a good article, however I've always thought lvalue meant "left-value" (i.e. a value that can appear as the left part of an assignment) but this article says it stands for "locator value". Who's right? And then what does rvalue stand for if not "right-value"? Register value? I find no definite answer on Google.

Sharlin · on Dec 15, 2011

As the article points out, the original meaning of the "l" in "lvalue" indeed was "left". However, in C and C++ there are lvalues that cannot appear as the left-hand-side of assignment (const variables, arrays, functions) and rvalues that can (in C++ you can say, for example, C() = C(); for some class C because the assignment operator is just a member function, and member functions can be called for rvalues.)

What more accurately grasps the definition of lvalues and rvalues in C and C++ standards is that the former have an individual identity (concretely, they are permitted as the operand of the address-of operator &) and the latter don't. (Except don't get me started on the fact that in C++ one can overload operator&...)
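To make the two counterexamples concrete, here is a small sketch using std::is_assignable from <type_traits>. The struct C and the function assign_to_temporary are hypothetical names chosen for this illustration.

```cpp
#include <type_traits>

// Hypothetical class with the implicitly generated assignment operator.
struct C {
    int value = 0;
};

// A const int is an lvalue, yet it cannot appear on the left of '='.
static_assert(!std::is_assignable<const int&, int>::value,
              "const lvalue is not assignable");

// An rvalue of built-in type cannot be assigned to either.
static_assert(!std::is_assignable<int, int>::value,
              "rvalue int is not assignable");

// An rvalue of class type can, because operator= is just a member
// function and member functions may be called on rvalues.
static_assert(std::is_assignable<C, C>::value,
              "rvalue C is assignable");

// Assigns to a temporary C and reads the result before the temporary
// is destroyed at the end of the full expression.
inline int assign_to_temporary() {
    C source;
    source.value = 42;
    return (C() = source).value;
}
```

The static_asserts are checked at compile time, so the file compiling at all is the demonstration; assign_to_temporary just shows that C() = ... is an ordinary expression.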

jonsen · on Dec 15, 2011

The word lvalue was originally coined to mean "something that can be on the left side of an assignment."

Bjarne Stroustrup, The C++ Programming Language, 2nd ed., p 47.

eliben · on Dec 17, 2011

Here's a telling footnote from the C99 standard:

"The name ‘‘lvalue’’ comes originally from the assignment expression E1 = E2, in which the left operand E1 is required to be a (modifiable) lvalue. It is perhaps better considered as representing an object ‘‘locator value’’. What is sometimes called ‘‘rvalue’’ is in this International Standard described as the ‘‘value of an expression’’."

mrcharles · on Dec 15, 2011

This was my understanding as well. My mind abstracted "right-value" to "anything that can be assigned." If you look at the relations of lvalue and rvalue to the assignment operator (=), you can see that it's a consistent definition too.

It also follows the same rules in that 'rvalues are defined by exclusion.'

divtxt · on Dec 15, 2011

Everyone here agrees that it's left/right. (even if that's not correct)

Left/right makes the terms so much easier to both remember and understand.

16s · on Dec 15, 2011

That's how I've always thought of it as well.

apaprocki · on Dec 15, 2011

Move semantics opens up a new age where higher level STL-like containers (and types built from them) can perform as well as raw C pointer based implementations. The last real roadblock is the elephant in the room: compiler support. I've argued with John Lakos that all the guys at work on the C++ committee should take a break from the standards process now and devote themselves for a time to getting C++11 fully supported into all main compilers. Only having move semantics in gcc is not good enough :/

shin_lao · on Dec 15, 2011

It's there in clang and Visual Studio 2010.

apaprocki · on Dec 15, 2011

Only leaving CC (Oracle), xlC (IBM), aCC (HP) to name a few. I got 99 platforms and x86 is only one. :)

Alaric · on Dec 16, 2011

It might be more beneficial to try to get them to improve llvm support for their respective os+archs. It'd be less diversity of compilers but is probably a better investment for them than shoe-horning a poor/incomplete C++11 implementation into their aging compilers.

apaprocki · on Dec 16, 2011

Yeah I'm pushing this route as well, but unfortunately the "hard to reproduce" part is all of the chip specific optimization. You could imagine that if their optimization phase(s) look nothing like llvm it would be really hard to get them to buy into that. I've had more success getting vendors to contribute gdb support (IBM has). I'll keep trying, though.

nitrogen · on Dec 15, 2011

On your large projects that must support Oracle, IBM, and HP's compilers, what prevents you from compiling your C++11 code with gcc and linking it in? Is it too difficult to maintain a consistent external ABI? Does gcc not support the other ABIs and/or mangling conventions?

apaprocki · on Dec 15, 2011

The generated code is not compatible and there are other issues. We have some stubs vs dwarf debug information issues with gcc. The code generated by gcc is not as optimized as the vendor supplied compilers. This is if you are telling the compiler that you only want to support sparcv8+ or pwr5+ so it can take advantage of the newer CPUs. And even if gcc was the same or better, there is the issue of support. We only use the vendor compilers because we meet with them regularly and they will fix any bugs immediately (at least start working on them) and are open to enhancement requests. I'm not aware of any company willing to provide awesome service for gcc on these platforms.

Edit: Not to say we don't use gcc. The entire codebase does of course build using gcc on all of the platforms.. it is mainly the support issue and tweaks needed to build the binaries the way we want on each architecture.

lysium · on Dec 15, 2011

Mangling is not standardized and every compiler does it differently. In general, you can't link files compiled by different C++ compilers.

apaprocki · on Dec 16, 2011

And to make things even more fun, each individual compiler usually comes with baggage it has collected over the years. In xlC alone the -qnamemangling option can be set to: ansi (don't let it fool you), v11, v10, v9, v8, v7, v6, v5, v4, v3, or compat, so even code built with the same compiler isn't always compatible.

qdog · on Dec 15, 2011

Yes, and ARM is everywhere.

jpdoctor · on Dec 15, 2011

Terrible start point:

An lvalue (locator value) represents an object that occupies some identifiable location in memory (i.e. has an address).

rvalues are defined by exclusion, by saying that every expression is either an lvalue or an rvalue.

So a pointer represents "an object that occupies some identifiable location in memory", meeting his definition of lvalue, and yet is a perfectly valid rvalue.

I sentence this person to be a teaching assistant for one term.

Edit: I love the first two replies to this comment:

1. No, lvalues are never rvalues by definition.

2. all lvalues are rvalues,

Sharlin · on Dec 15, 2011

No, lvalues are never rvalues by definition. In the C and C++ language standards, rvalues and lvalues are not defined based on which side of the assignment operator they may appear on. The assignment operator takes an rvalue as its right operand, however, lvalues are also accepted there because there's an implicit lvalue-to-rvalue conversion taking place, as pointed out later in the article.

tedunangst · on Dec 15, 2011

all lvalues are rvalues, so yes, an lvalue is a perfectly valid rvalue.

oh, maybe you meant an rvalue that happens to be a pointer? still makes sense, the pointer is not the same as what it points to.

[In light of Sharlin's comment, I should edit this to say that "lvalues are rvalues" is shorthand for "convertible to", the significance of the difference being what you make of it.]

pillbug88 · on Dec 15, 2011

I find the standard to be very clear and concise regarding lvalue/xvalue/prvalue [2011 3.10.1].

I wonder if teaching from the standard is a better way of approaching this, ie: start with the standard, then just explain what it means.

zokier · on Dec 15, 2011

Yes, many (if not all?) lvalues are (or actually could be) perfectly valid rvalues. But no value is both at the same time. There is nothing odd in having something that can be either A or B, but not both at the same time. So a pointer can be an lvalue or an rvalue. If the pointer is at an "identifiable location in memory" then it's an lvalue, and if it's not, then it's an rvalue.
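One way to check this in C++11 is decltype with an extra pair of parentheses, which yields T& for an lvalue expression and plain T for a prvalue. The sketch below is only an illustration; the function name pointer_categories_hold is made up for it.

```cpp
#include <type_traits>

// Returns true if it compiles; all the checks happen at compile time.
inline bool pointer_categories_hold() {
    int x = 0;
    int* p = &x;

    // A named pointer variable is an lvalue: it has a location of its own.
    static_assert(std::is_same<decltype((p)), int*&>::value, "p is an lvalue");

    // The result of &x is a pointer too, but it is a temporary value.
    static_assert(std::is_same<decltype((&x)), int*>::value, "&x is an rvalue");

    // What the pointer points to is a different expression: *p is an lvalue.
    static_assert(std::is_same<decltype((*p)), int&>::value, "*p is an lvalue");

    // The same split applies to plain ints.
    static_assert(std::is_same<decltype((x)), int&>::value, "x is an lvalue");
    static_assert(std::is_same<decltype((x + 1)), int>::value, "x + 1 is an rvalue");

    return true;
}
```

So "pointer" describes the type of the expression, while lvalue or rvalue describes the expression itself; the two are independent.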

msarnoff · on Dec 15, 2011

Interesting C++ tidbit: the ternary operator can yield an lvalue, like so:

  int foo = 0, bar = 0;
  (cond ? foo : bar) = 8;

This doesn't work in plain C.

koenigdavidmj · on Dec 15, 2011

But it does with pointers in C:

  int foo = 0, bar = 0;
  *(cond ? &foo : &bar) = 8;