If anyone is curious, as of C++14 you can `move` variables into a lambda scope (like Rust's move ||) as follows
Foo foo; // Foo implements a move constructor
auto l = [cap = std::move(foo)]() { do_stuff(cap); }; // Make the lambda
do_stuff(foo); // ILLEGAL. Cannot use moved value
l(); // Run the lambda
Note that `do_stuff` is not necessarily illegal, since a moved-from value is left in a valid but unspecified state. So the result of `do_stuff` may be unspecified and unknowable (unless it relies on specific assumptions about the data), but operations on `foo` that have no preconditions will not give rise to obvious errors.
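A minimal sketch of that point, using std::string (whose moved-from state is valid but unspecified); the names `take` and `demo` are just for illustration:

```cpp
#include <string>
#include <utility>

// Moving out of a string leaves it valid but unspecified: reading it is
// not an error, but only operations with no preconditions (assignment,
// clear(), ...) have guaranteed behavior.
std::string take(std::string&& s) {
    return std::move(s);
}

std::string demo() {
    std::string foo = "hello";
    std::string cap = take(std::move(foo));
    // foo is now valid but unspecified; its contents are not guaranteed.
    foo = "reassigned";   // assignment re-establishes a known state
    return cap + " " + foo;
}
```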
In C++, if you pass something by reference, there is nothing preventing the receiving function from keeping a pointer to the referenced object even after returning.
Right now the only way to enforce this is by convention: 1. move/unique_ptr/shared_ptr = receiver is allowed to keep a pointer; 2. raw pointer = totally banned; 3. reference = the caller must outlive the callee, and the called function is not allowed to keep a pointer after returning. The last scenario is not enforceable by the compiler.
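To illustrate scenario 3: here's a sketch of a function that, legally but against convention, stashes a pointer to its reference parameter (`sink`, `stashed`, and `get_stashed` are made-up names):

```cpp
#include <string>

// Nothing in the type system stops this: 'sink' takes a reference,
// but keeps a pointer to it that outlives the call.
static const std::string* stashed = nullptr;

void sink(const std::string& s) {
    stashed = &s;   // legal C++, but 'stashed' may dangle later
}

const std::string* get_stashed() { return stashed; }
```

If the argument passed to `sink` is destroyed before `stashed` is used, the program has undefined behavior, and the compiler emits no diagnostic.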
Rust's moves are implicit, making it easier to write code that doesn't copy heap-allocated memory, and you can't get use-after-move bugs like in C++ - the type system will catch them.
Oh, having only read about and not used recent C++ practices, I assumed use-after-move would be caught at compile time. Looking around a bit, it seems there's not even a warning for this; is there a reason? It seems like it would be very useful.
In C++, "moving" and "copying" are a matter of class design. While you are expected to use move (resp. copy) constructors and assignments to, well, move (resp. copy); the language doesn't rule out using them for other purposes.
In Rust, assignment and argument passing always move, "moving" always does the right thing (make a shallow copy and invalidate the original object), and you can't override this behavior. Furthermore, Rust splits what C++ calls "copying" into two concepts: shallow copying (which is the same as moving, except the original object isn't invalidated, and the type checker guarantees you can only do it when it makes sense) and deep cloning (which may be expensive and needs to be explicitly requested by the programmer).
This is why Rust can track which objects are no longer usable.
Move is a template level concept, not a language concept. The compiler has no idea that something was moved, rather than assigned. Some implementations null out the "moved from" reference, so that a use after move will dereference null and crash the program.
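std::unique_ptr is one standard-library case where the moved-from object is guaranteed to be nulled out, so a use-after-move dereference fails loudly rather than silently:

```cpp
#include <memory>

// After a move, a unique_ptr is guaranteed to hold nullptr, so a
// use-after-move dereference is a null dereference rather than a
// silent read of stale data.
bool moved_from_is_null() {
    auto p = std::make_unique<int>(42);
    auto q = std::move(p);   // p now holds nullptr
    // *p here would be undefined behavior (null dereference)
    return p == nullptr && *q == 42;
}
```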
I've referred to the attempts to fix C++ ownership semantics with templates as "wallpapering over the mold".
>Lambdas are also awesome when it comes to performance. Because they are objects rather than pointers they can be inlined very easily by the compiler, much like functors. This means that calling a lambda many times (such as with std::sort or std::copy_if) is much better than using a global function. This is one example of where C++ is actually faster than C.
It's also wrong. If the compiler inlines e.g. std::sort, it doesn't matter whether the comparator is a lambda or a pointer to a function: The compiler knows what it is and can inline it as well. The only exception is when the function is defined in a different compilation unit. Then the compiler has no access to its definition and thus cannot inline it into std::sort.
The "C++ with lambda is faster than C with function" comes from C's qsort function that takes a function pointer to a comparator and is not inlined by the compiler. This can result in it being slower than std::sort.
BTW since lambdas without capture decay to function pointers, you can pass such a lambda to qsort.
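For example (a sketch; `sort_ints` is a made-up helper):

```cpp
#include <cstdlib>

// A capture-less lambda converts implicitly to a plain function pointer,
// so it can be handed straight to C's qsort.
void sort_ints(int* data, std::size_t n) {
    std::qsort(data, n, sizeof(int),
               [](const void* a, const void* b) {
                   int x = *static_cast<const int*>(a);
                   int y = *static_cast<const int*>(b);
                   return (x > y) - (x < y);   // C-style three-way compare
               });
}
```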
In principle there is no difference, in practice until relatively recently some popular compilers would fail to inline calls via function pointers. The reason is that the inliner would need to run after constant propagation and after the HOF function itself, i.e. sort, has been inlined. With function objects and templates you get a distinct instance of sort for each callback type so the target is known statically even before any optimisation starts.
I think there's truth in it, but I'd still take it with a grain of salt; it's more that it can be faster under the right circumstances. Inlining can be faster not only because there's no function-call prologue/epilogue etc., but, probably more importantly, because the compiler sees all of the code, which opens the door to more optimizations. That's really important to realize; I recommend looking at the assembly the compiler generates for loops etc. from time to time. It's simply amazing what can be done. So if C++ can take advantage of this in similar code while C can't, because it has to do an actual function call, then the C++ version has an advantage and might very well be faster (whereas otherwise both would produce roughly the same result and speed, I'd guess, unless C++-only features like virtual functions are used).
> So if in similar code C++ can take advantage of it while C can't because it has to do an actual function call then the C++ version has advantages and might very well be faster
This is a matter of compilation units, not C vs. C++. If you put qsort() in a header file, and call it with a function pointer in the same compilation unit, it will get inlined and work exactly the same way as std::sort (which is in a header file since it's a template).
You can pretty easily verify this yourself by doing a C++ template and a C function with function pointers (in a header file!), compile and inspect the resulting assembly code.
Link-time optimization may relax the compilation unit requirement, but it's not very widely used yet.
His logic is wrong. It's not because they are objects, it's because std::sort and friends are function templates. They know what they are calling at compile time. Templates allow for compile-time polymorphism like this.
Because of compile-time polymorphism, you can pass in prvalue function pointers and objects alike and the compiler will be able to inline.
C doesn't have templates, therefore it doesn't have compile-time polymorphism, therefore it's much harder for the compiler to inline.
Compilers like GCC and Clang can inline in some runtime-polymorphism circumstances. See -flto and -fdevirtualize.
> C doesn't have templates, therefore it doesn't have compile-time polymorphism, therefore it's much harder for the compiler to inline.
A C compiler can do the same inlining and optimizing as long as the called function (e.g. qsort) and the function that's pointed-to (e.g. the comparison function) are in the same compilation unit.
C++ templates do not add any magic here, std::sort() is simply faster because it's in a header file and qsort() is not. If you pass a function pointer to a function in a different object file to std::sort, it won't have a lot of advantages over qsort in terms of optimizations/performance (it still does know sizeof(T), though).
Link-time optimization should make standard qsort-type functions perform similarly to std::sort.
Well, much of C++ involves virtual function calls, and those are indeed slower than non-virtual calls. C doesn't have virtual calls, so that is perhaps one reason why C is generally assumed to be faster. That doesn't mean you must use virtual functions in C++; in fact, I don't use inheritance in my own code at all, so I avoid virtual calls and, more importantly, the restrictions that can creep into a tight OO hierarchy.
And std::function, from what I understand, is generally not as fast as a function pointer, a stored lambda, or a function object, since it must do some run-time checks to see how to call what it is storing.
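A sketch of the difference (`call_erased` and `call_direct` are made-up names): the std::function parameter erases the callee's type behind run-time dispatch, while the template keeps it statically known and thus inlinable:

```cpp
#include <functional>

// Type-erased: which target is stored is only recovered at call time,
// through std::function's internal dispatch.
int call_erased(const std::function<int(int)>& f, int x) {
    return f(x);
}

// Statically typed: F is the lambda's exact type, so the compiler can
// inline the call with no indirection.
template <typename F>
int call_direct(F&& f, int x) {
    return f(x);
}
```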
Are you suggesting that C++ virtual functions have the same overhead as a C function pointer? I've never heard that; with the virtual call, there is the lookup + the invocation; perhaps a C app that stores function pointers in a hash map also has this sort of lookup, is that what you mean?
Vtable lookup in a typical C++ implementation is just loading a fixed offset in an array. In C terms, it's basically doing object->vtable[42]. In C code with function pointers, you're typically performing that exact same operation. You can go slightly faster if you store the function pointer in the object directly, at the cost of increased memory consumption per object if you have more than one function pointer.
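Spelled out in C-style code, what the compiler generates for a virtual call looks roughly like this (a sketch with made-up names):

```cpp
// Hand-rolled equivalent of a compiler-generated vtable: a virtual call
// is one load of the vtable slot plus an indirect call.
struct Shape;

struct ShapeVTable {
    double (*area)(const Shape*);
};

struct Shape {
    const ShapeVTable* vtable;   // one pointer per object
    double w, h;
};

double rect_area(const Shape* s) { return s->w * s->h; }

// One static vtable shared by all "rectangle" objects.
const ShapeVTable rect_vtable = { rect_area };

double call_area(const Shape* s) {
    return s->vtable->area(s);   // load slot, then indirect call
}
```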
Setting aside the cringe I experience whenever supposed principles of "OOP" are mentioned by the author, that page painfully reminds me of that old joke about bolting four extra legs onto a dog... Furthermore, both a broader comparison (for more complex data and processing) and the technique I mentioned seem to be absent. However, I suppose it shows how to reclaim at least some benefits of better data arrangement even if the language isn't a good fit.
For a number of reasons, I'm not particularly fond of this specific implementation but I'm seriously considering porting its principles into something else (where they might work even better).
The only reason C++ has increased opportunities for inlining is because C++ often requires code to be placed in header files. If you wrote C in this style, you'd have the same opportunities for inlining, it's just that nobody writes it that way, because we generally prefer separately compiled libraries.
I'd even say it's generally faster because you can use integers as template arguments for enforced constant folding. Typically in video work: make a specialized code path for when the block width is 8, 16, 32, etc. Something that a C compiler can never do unless you use huge macros.
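A sketch of that pattern (`sum_row` and `sum_block_row` are made-up names): the width is a template argument, so each specialization's loop bound is a compile-time constant the optimizer can fully unroll and vectorize:

```cpp
#include <cstddef>

// Width is a compile-time constant here, so the loop bound is known to
// the optimizer in each instantiation.
template <std::size_t Width>
int sum_row(const int* row) {
    int total = 0;
    for (std::size_t i = 0; i < Width; ++i)
        total += row[i];
    return total;
}

// Dispatch on the runtime width once, then run the specialized path.
int sum_block_row(const int* row, std::size_t width) {
    switch (width) {
        case 8:  return sum_row<8>(row);
        case 16: return sum_row<16>(row);
        case 32: return sum_row<32>(row);
        default: {
            int total = 0;   // generic fallback, bound not constant-folded
            for (std::size_t i = 0; i < width; ++i)
                total += row[i];
            return total;
        }
    }
}
```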
I kind of wish this article had gotten into some details of how various compilers implement std::function; while it was nice to see some details of clang++'s 32-byte size of all std::function to remove the need for dynamic memory for many function objects, I'm still wishing I had more details on efficiency/cost of using std::function vs. plain lambdas or function pointers or other solutions.
If you'll excuse me while I golf line noise, you can just write it as []{}(), too; the parameter list may be omitted for a lambda that takes no arguments.
In JavaScript it's quite common to see self-executing lambdas like this (not empty, though), because variables there have function scope rather than block scope, so it's a way to limit scope and capture closures. Does it have any practical use in C++?
void foo() {
    // initialized once, when foo is first called
    static auto lazy_data = ...;
    // use lazy_data...
}
As of C++11, the compiler is required to make this thread-safe (the initializer runs only once, even if foo() is called concurrently), typically by inserting locks.
Since this only applies to the initializer, for complex initialization a self-executing lambda can be used:
void foo() {
    static auto lazy_data = []{
        auto data = new Whatever();
        // initialize data...
        return data;
    }();
    // use lazy_data...
}