Out-of-line Lambdas

Published June 5, 2020

Lambdas are a great tool to make code more expressive. Except when they aren’t.

With C++11 bringing them to the language, we were given the liberating power to create, anywhere we like, little functions that carry bits of context with them. Sometimes they make our code terse and to the point. But sometimes they sit in the middle of their call site, exposing their internals for all to see.

To illustrate, consider this piece of code that takes a collection of boxes and keeps those that have the physical characteristics to bear the pressure of a given product:

auto const product = getProduct();

std::vector<Box> goodBoxes;
std::copy_if(boxes.begin(), boxes.end(), std::back_inserter(goodBoxes),
    [product](Box const& box)
    {
        const double volume = box.getVolume();
        const double weight = volume * product.getDensity();
        const double sidesSurface = box.getSidesSurface();
        const double pressure = weight / sidesSurface;
        const double maxPressure = box.getMaterial().getMaxPressure();
        return pressure <= maxPressure;
    });

We don’t want to see this sort of detail in the middle of the calling code.

This brings up the question: when should we use an on-the-fly temporary lambda (like the one above), and when should we prefer creating an out-of-line function to relieve the call site, like in this other version of the code:

auto const product = getProduct();

std::vector<Box> goodBoxes;
std::copy_if(boxes.begin(), boxes.end(), std::back_inserter(goodBoxes), resists(product));

In this example, the second solution looks better because the body of the lambda is at a lower level of abstraction than the surrounding code. For more about this, check out the article on expressive lambdas.

This doesn’t mean we should refrain from using a lambda though. The out-of-line function resists can be implemented with a lambda:

auto resists(Product const& product)
{
    return [product](Box const& box)
    {
        const double volume = box.getVolume();
        const double weight = volume * product.getDensity();
        const double sidesSurface = box.getSidesSurface();
        const double pressure = weight / sidesSurface;
        const double maxPressure = box.getMaterial().getMaxPressure();
        return pressure <= maxPressure;
    };
}

If you haven’t seen this technique before, take a moment to read the above code: it is a function (resists) that takes a context (product) and returns a function (an unnamed lambda) that captures that product.

The return type is the type of the lambda, and since it is determined by the compiler and unknown to us programmers, we use a convenient auto as the return type of the function (return type deduction for functions is available from C++14).
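To try this technique in isolation, here is a minimal self-contained sketch. The definitions of Product, Material and Box, as well as the filterGoodBoxes and example functions, are assumptions made up for illustration: only the member functions used by resists come from the code above.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical stand-ins for the article's types, reduced to what resists needs.
struct Product
{
    double density;
    double getDensity() const { return density; }
};

struct Material
{
    double maxPressure;
    double getMaxPressure() const { return maxPressure; }
};

struct Box
{
    double volume;
    double sidesSurface;
    Material material;
    double getVolume() const { return volume; }
    double getSidesSurface() const { return sidesSurface; }
    Material const& getMaterial() const { return material; }
};

// Returns a predicate on boxes. The lambda keeps its own copy of product.
auto resists(Product const& product)
{
    return [product](Box const& box)
    {
        const double weight = box.getVolume() * product.getDensity();
        const double pressure = weight / box.getSidesSurface();
        return pressure <= box.getMaterial().getMaxPressure();
    };
}

// The call site only says what it wants, not how the check is computed.
std::vector<Box> filterGoodBoxes(std::vector<Box> const& boxes, Product const& product)
{
    std::vector<Box> goodBoxes;
    std::copy_if(boxes.begin(), boxes.end(), std::back_inserter(goodBoxes), resists(product));
    return goodBoxes;
}

// Made-up numbers: the pressure is 2 * 2 / 4 = 1 for both boxes.
void example()
{
    Product const product{2.0};
    std::vector<Box> const boxes{Box{2.0, 4.0, Material{1.5}}, Box{2.0, 4.0, Material{0.5}}};

    auto const goodBoxes = filterGoodBoxes(boxes, product);
    assert(goodBoxes.size() == 1); // only the first box bears a pressure of 1
}
```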

But the above code has (at least) one issue. Can you see what it is?

The capture of the lambda

One issue in the above code is that the lambda captures by copy:

auto resists(Product const& product)
{
    return [product](Box const& box)
    {
        const double volume = box.getVolume();
        ...

But there is no reason to make a copy here. This lambda gets destroyed at the end of the statement with the std::copy_if, and product stays alive during this time. The lambda could just as well take the product by reference:

auto resists(Product const& product)
{
    return [&product](Box const& box)
    {
        const double volume = box.getVolume();
        const double weight = volume * product.getDensity();
        const double sidesSurface = box.getSidesSurface();
        const double pressure = weight / sidesSurface;
        const double maxPressure = box.getMaterial().getMaxPressure();
        return pressure <= maxPressure;
    };
}

This is equivalent to the previous version that captured by copy, except this code doesn’t make a copy.

This is all good, except this code breaks if we change the call site a little. As a reminder the call site looked like this:

auto const product = getProduct();

std::vector<Box> goodBoxes;
std::copy_if(boxes.begin(), boxes.end(), std::back_inserter(goodBoxes), resists(product));

What if we decide to give a name to our lambda and also get rid of the product intermediary object?

std::vector<Box> goodBoxes;
auto const isAGoodBox = resists(getProduct());
std::copy_if(boxes.begin(), boxes.end(), std::back_inserter(goodBoxes), isAGoodBox);

Then this becomes undefined behaviour. Indeed, the Product returned by getProduct is now a temporary object that gets destroyed at the end of its statement. When isAGoodBox is called by std::copy_if, it accesses this product, which is already destroyed.

Capturing by reference in resists has made our code brittle.
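The lifetime problem can be observed without actually triggering the undefined behaviour. In the sketch below, TrackedProduct and holdByReference are hypothetical names: the type raises a flag in its destructor, and the function has the same shape as resists. The lambda is deliberately never called, we only look at when the temporary dies.

```cpp
#include <cassert>

// Hypothetical product that reports its own destruction through a flag.
struct TrackedProduct
{
    explicit TrackedProduct(bool* destroyedFlag) : destroyed(destroyedFlag) {}
    ~TrackedProduct() { *destroyed = true; }
    bool* destroyed;
};

// Same shape as resists: the returned lambda only holds a reference.
auto holdByReference(TrackedProduct const& product)
{
    return [&product]() -> TrackedProduct const& { return product; };
}

// Tells whether the temporary is already gone once the statement
// that created the lambda has finished.
bool temporaryIsDestroyedBeforeUse()
{
    bool destroyed = false;
    auto const dangling = holdByReference(TrackedProduct{&destroyed});
    (void)dangling; // never called: calling it would be undefined behaviour
    return destroyed;
}

void example()
{
    // The lambda outlives the object it refers to.
    assert(temporaryIsDestroyedBeforeUse());
}
```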

A warning, sometimes

In most of the cases I tested, this code compiled without any warning. The only case where the compiler emitted a warning was:

with gcc,
with the optimisation level -O1,
and when the temporary was built with a direct call to the constructor (Product{1.2}):

auto const isAGoodBox = resists(Product{1.2});
std::copy_if(boxes.begin(), boxes.end(), std::back_inserter(goodBoxes), isAGoodBox);

In this specific case, the warning was this:

warning: '<anonymous>.Product::density_' is used uninitialized in this function [-Wuninitialized]
     double getDensity() const { return density_; }

This is nice. But none of the other configurations I tested (-O0, -O2, -O3, using an intermediary function getProduct(), or compiling with clang) produced a warning. Here is the compilable code if you’d like to play around with it.

Generalized lambda capture of lambdas

We can use generalized lambda capture to move the temporary Product into our lambda.

Indeed, C++14 brought in a new feature for lambdas: the generalized lambda capture. It allows us to execute some custom code within the capture of the lambda, and to initialize a captured variable with the result:

[context = f()](MyType const& myParameter){ /* body of the lambda */ }
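As a small illustration of this syntax outside of the boxes example, here is a sketch with hypothetical names (computeThreshold, makeReader). It shows a capture initialized from a function call, and a capture that moves a move-only object into the lambda, which a plain capture by copy cannot do.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical helper standing in for f() in the snippet above.
int computeThreshold() { return 10; }

// The lambda takes ownership of a move-only object through its capture.
auto makeReader(std::unique_ptr<int> value)
{
    return [owned = std::move(value)]() { return *owned; };
}

void example()
{
    // The initializer runs once, when the lambda is created.
    auto const isAbove = [threshold = computeThreshold()](int x) { return x > threshold; };
    assert(isAbove(11));
    assert(!isAbove(10));

    auto source = std::make_unique<int>(42);
    auto const read = makeReader(std::move(source));
    assert(source == nullptr); // ownership has left the original pointer
    assert(read() == 42);      // the lambda carries the value on its own
}
```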

Let’s take advantage of generalized lambda capture to move the temporary:

auto resists(Product&& product)
{
    return [product = std::move(product)](const Box& box)
    {
        const double volume = box.getVolume();
        const double weight = volume * product.getDensity();
        const double sidesSurface = box.getSidesSurface();
        const double pressure = weight / sidesSurface;
        const double maxPressure = box.getMaterial().getMaxPressure();
        return pressure <= maxPressure;
    };
}

With this modification of the code, after the temporary product (that was moved from) gets destroyed, the lambda carries on its life with its own product. There is no longer undefined behaviour.

But now, we can no longer use the first version of our call site:

auto const product = getProduct();

std::vector<Box> goodBoxes;
std::copy_if(boxes.begin(), boxes.end(), std::back_inserter(goodBoxes), resists(product));

Indeed, product is an lvalue here, and therefore cannot bind to an rvalue reference. To underline this, the compiler unceremoniously rejects this code:

error: cannot bind rvalue reference of type 'Product&&' to lvalue of type 'const Product'
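The binding rules behind this error can be checked at compile time with std::is_constructible, which tells whether a reference of one type can be initialized from an expression of another. This is only a sketch: the one-member Product below is a made-up stand-in, since only its type matters here.

```cpp
#include <type_traits>

// Hypothetical minimal Product: only its type matters for these checks.
struct Product
{
    double density;
};

// A const lvalue cannot bind to an rvalue reference: this is the rejected call.
static_assert(!std::is_constructible<Product&&, Product const&>::value,
              "a const lvalue must not bind to Product&&");

// A temporary can bind to an rvalue reference.
static_assert(std::is_constructible<Product&&, Product>::value,
              "an rvalue binds to Product&&");

// A reference to const accepts both, which is why the earlier versions of
// resists compiled with either call site.
static_assert(std::is_constructible<Product const&, Product const&>::value,
              "an lvalue binds to Product const&");
static_assert(std::is_constructible<Product const&, Product>::value,
              "an rvalue binds to Product const&");
```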

We need to make resists compatible with both call sites. Note that this is an analogous idea to the one in Miguel Raggi’s guest post on how to construct C++ objects without making copies.

An overload for each case

One solution is to make two overloads of resists: one that takes an lvalue reference and one that takes an rvalue reference:

auto resists(Product const& product)
{
    return [&product](const Box& box)
    {
        const double volume = box.getVolume();
        const double weight = volume * product.getDensity();
        const double sidesSurface = box.getSidesSurface();
        const double pressure = weight / sidesSurface;
        const double maxPressure = box.getMaterial().getMaxPressure();
        return pressure <= maxPressure;
    };
}

auto resists(Product&& product)
{
    return [product = std::move(product)](const Box& box)
    {
        const double volume = box.getVolume();
        const double weight = volume * product.getDensity();
        const double sidesSurface = box.getSidesSurface();
        const double pressure = weight / sidesSurface;
        const double maxPressure = box.getMaterial().getMaxPressure();
        return pressure <= maxPressure;
    };
}

This creates code duplication, and this is one of the cases of technical code duplication that we should avoid. One way to solve this is to factor the business code into a third function called by the other two:

bool resists(Box const& box, Product const& product)
{
    const double volume = box.getVolume();
    const double weight = volume * product.getDensity();
    const double sidesSurface = box.getSidesSurface();
    const double pressure = weight / sidesSurface;
    const double maxPressure = box.getMaterial().getMaxPressure();
    return pressure <= maxPressure;
}

auto resists(Product const& product)
{
    return [&product](const Box& box)
    {
        return resists(box, product);
    };
}

auto resists(Product&& product)
{
    return [product = std::move(product)](const Box& box)
    {
        return resists(box, product);
    };
}
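To check that both call sites work with these overloads, here is a self-contained sketch. Several things in it are assumptions made for illustration: Product is instrumented with a global productCopies counter, Box is flattened into three public fields (the article goes through getMaterial()), and filter and example are hypothetical helpers.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <utility>
#include <vector>

int productCopies = 0; // hypothetical instrumentation: counts Product copies

struct Product
{
    explicit Product(double density) : density_(density) {}
    Product(Product const& other) : density_(other.density_) { ++productCopies; }
    Product(Product&& other) noexcept : density_(other.density_) {}
    double getDensity() const { return density_; }
private:
    double density_;
};

struct Box
{
    double volume;
    double sidesSurface;
    double maxPressure; // simplification of getMaterial().getMaxPressure()
};

// Business code, written once.
bool resists(Box const& box, Product const& product)
{
    const double weight = box.volume * product.getDensity();
    return weight / box.sidesSurface <= box.maxPressure;
}

// Lvalues: the caller keeps the product alive, so a reference is enough.
auto resists(Product const& product)
{
    return [&product](Box const& box) { return resists(box, product); };
}

// Rvalues: the lambda takes over the temporary by moving it in.
auto resists(Product&& product)
{
    return [product = std::move(product)](Box const& box) { return resists(box, product); };
}

template<typename Predicate>
std::vector<Box> filter(std::vector<Box> const& boxes, Predicate isGood)
{
    std::vector<Box> goodBoxes;
    std::copy_if(boxes.begin(), boxes.end(), std::back_inserter(goodBoxes), isGood);
    return goodBoxes;
}

void example()
{
    std::vector<Box> const boxes{Box{2.0, 4.0, 1.5}, Box{2.0, 4.0, 0.5}};

    auto const product = Product{2.0};
    auto const fromLvalue = resists(product);      // first call site
    auto const fromRvalue = resists(Product{2.0}); // second call site

    // Creating either lambda copied no Product. Passing a lambda by value
    // to an algorithm afterwards copies whatever it captured by value.
    assert(productCopies == 0);
    assert(filter(boxes, fromLvalue).size() == 1);
    assert(filter(boxes, fromRvalue).size() == 1);
}
```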

A generic solution

The advantages of this solution are that it allows for expressive code at the call site by hiding lower-level details, and that it works correctly for both lvalues and rvalues.

One drawback is that it creates boilerplate, with the multiple overloads of the function that returns the lambda.

What is your opinion on this? Mine is that the advantages outweigh the drawback, but it would be interesting to mitigate the drawback. One way would be to create a generic component to encapsulate the mechanism of the multiple overloads. We would use this generic component instead of writing the boilerplate every time. This is what we will discuss in a future post.


About Jonathan Boccara