Multiple error handling with the optional monad in C++

Published July 4, 2017

Error handling is a vast topic in programming, and in C++ in particular. The aspect I would like to cover with you today is how to deal with multiple errors.

Let’s consider the following 4 functions:

int f1(int a);
int f2(int b, int c);
int f3(int d);
int f4(int e);

These functions should be called in turn: the result of f1 (called twice) is passed to f2, then the result of f2 is passed to f3, and so on. So far, so good.

Now let’s say that each of them may fail. That is to say that they normally return ints, but in some cases they just cannot build a value to return. This actually makes sense for real-life functions. sqrt won’t know what to do if you pass a negative number to it. std::stoi won’t be able to return an int if the string passed to it does not represent one. These two examples are taken from the standard library, but this happens in user code too. Sometimes, a function is just not able to return a result.
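To make these two standard-library examples concrete, here is a small sketch of how each one signals that it has no result. The helper names parses_as_int and sqrt_is_defined are made up for illustration. The sketch relies on std::stoi throwing std::invalid_argument or std::out_of_range, and on std::sqrt returning NaN for a negative argument, which is what typical IEEE 754 implementations do.

```cpp
#include <cassert>
#include <cmath>
#include <stdexcept>
#include <string>

// Hypothetical helper: true if std::stoi can convert (the start of) s to an int.
bool parses_as_int(std::string const& s)
{
    try
    {
        static_cast<void>(std::stoi(s)); // throws if no conversion is possible
        return true;
    }
    catch (std::invalid_argument const&) { return false; } // not a number at all
    catch (std::out_of_range const&)     { return false; } // does not fit in an int
}

// Hypothetical helper: true if std::sqrt produced a usable number.
bool sqrt_is_defined(double x)
{
    return !std::isnan(std::sqrt(x));
}

// Small usage example.
void usage()
{
    assert(parses_as_int("42"));
    assert(!parses_as_int("hello"));
    assert(sqrt_is_defined(4.0));
    assert(!sqrt_is_defined(-1.0));
}
```

Note that the two functions report failure in two different ways (an exception and a special value), which is part of why a uniform strategy is worth looking for.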

It’s a fairly simple problem, but there are several solutions. We have seen in detail how to make the interface of one given function clearer by using optional<T>. Now let’s see how to handle errors when several functions are called in a row, and each may fail.

Multiple error handling of the Ancient

Buried deep in the roots of C++ lie its functions coming from C. One way to deal with multiple error handling is by keeping an error (or a success) status in a flag.

For this let’s modify the interfaces of our functions:

bool f1(int a, int& result);
bool f2(int b, int c, int& result);
bool f3(int d, int& result);
bool f4(int e, int& result);

We have to agree that all functions return a flag that means… say a success.

The call site looks like:

bool success = true;
int b1 = 0;
int b2 = 0;
int c = 0;
int d = 0;
int result = 0;

success &= f1(3, b1);
success &= f1(4, b2);
success &= f2(b1, b2, c);
success &= f3(c, d);
success &= f4(d, result);

if (success)
{
    // we can use result
}
else
{
    // we know that something went wrong
}

This is ok… when you’re used to C. But this is definitely not cool in C++.

The main problem here is that, as we’ve seen in a previous post, functions should provide their output through their return type. This makes for much clearer and more natural code.

Other problems with this solution include that we are forced to declare all the variables (preferably with a default value) before the action happens, and that the bools coming out of the functions don’t really say if they mean error or success.
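One more detail about the call site above: success &= f(...) always evaluates its right-hand side, so after a failure the remaining functions are still called, on the default-initialized values. Below is a minimal sketch of a variant that stops at the first failure by using && instead. The bodies of f1 to f4 are invented purely for illustration (f3 is assumed to fail on negative input), and run_chain and calls_to_f4 are hypothetical names.

```cpp
#include <cassert>

// Illustrative implementations only: the article does not define them.
bool f1(int a, int& result)        { result = a + 1; return true; }
bool f2(int b, int c, int& result) { result = b - c; return true; }
bool f3(int d, int& result)        { if (d < 0) return false; result = d * 2; return true; }

int calls_to_f4 = 0; // counts how often f4 actually runs
bool f4(int e, int& result)        { ++calls_to_f4; result = e + 10; return true; }

// Runs the chain and stops at the first failure, because && short-circuits.
bool run_chain(int x, int y, int& result)
{
    int b1 = 0, b2 = 0, c = 0, d = 0;
    bool success = f1(x, b1);
    success = success && f1(y, b2);
    success = success && f2(b1, b2, c);
    success = success && f3(c, d);
    success = success && f4(d, result); // skipped if anything above failed
    return success;
}

// Small usage example.
void usage()
{
    int result = 0;
    bool ok = run_chain(4, 3, result);  // 5 - 4 = 1, doubled to 2, plus 10
    assert(ok && result == 12);
    int calls_before = calls_to_f4;
    ok = run_chain(3, 4, result);       // 4 - 5 = -1, so f3 fails
    assert(!ok && calls_to_f4 == calls_before);
}
```

This removes the wasted calls, but the out-parameters and the up-front declarations are still there.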

So this is not the way to go. But I think it was worth seeing this example, because this can be found in production code.

Just Throw An Exception

A more modern way to proceed is for the functions to just throw their arms in the air, and an exception with them.

This way, the original interfaces remain untouched. If a function succeeds, it provides an int. If it doesn’t, you’re out of here and the stack is unwound until a catch is encountered. This way we know when the code has succeeded, and the initial interfaces of the functions do not have to change.

Unfortunately, throwing exceptions is not that simple, and has consequences. One is a performance consideration. Another important one is that the code surrounding the site where an exception is thrown has to enforce certain properties, collectively called exception safety. This does not happen by chance, and not all the code out there is exception safe, far from it. But this is not the topic of this post. Let’s explore other ways to deal with multiple error handling.
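For reference, here is a minimal sketch of what the exception-based version could look like. The function bodies are invented for illustration: I assume f3 rejects negative input and signals it with std::domain_error, and chain is a hypothetical name for the call sequence.

```cpp
#include <cassert>
#include <stdexcept>

// Illustrative implementations only; the original signatures are unchanged.
int f1(int a)        { return a + 1; }
int f2(int b, int c) { return b - c; }
int f3(int d)
{
    if (d < 0) throw std::domain_error("f3: negative input");
    return d * 2;
}
int f4(int e)        { return e + 10; }

// The happy path reads like plain function composition.
int chain(int x, int y)
{
    return f4(f3(f2(f1(x), f1(y))));
}

// Small usage example: the caller decides where to catch.
void usage()
{
    assert(chain(4, 3) == 12);

    bool caught = false;
    try
    {
        chain(3, 4); // f2 yields -1, so f3 throws
    }
    catch (std::domain_error const&)
    {
        caught = true; // we know that something went wrong
    }
    assert(caught);
}
```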

Our dear friend `optional<T>`

Actually we’ve been through such considerations for improving the expressiveness of error handling for one function, by using optional. You can read all about it in this post.

So let’s change our functions’ interfaces to return an optional:

#include <boost/optional.hpp>

boost::optional<int> f1(int a);
boost::optional<int> f2(int b, int c);
boost::optional<int> f3(int d);
boost::optional<int> f4(int e);

I am purposefully using boost optional here, because at the time of this writing it is much more widely available than std::optional of C++17. But all that follows also applies to std::optional, for which you can just replace boost with std and none with nullopt.
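For readers on a C++17 compiler, here is roughly what that substitution looks like for one of the functions. The body of f3 is made up for illustration (only its declaration is given above); I assume here that it fails on negative input.

```cpp
#include <cassert>
#include <optional>

// std::optional flavour of f3, with an illustrative body.
std::optional<int> f3(int d)
{
    if (d < 0)
    {
        return std::nullopt; // boost::none in the Boost version
    }
    return d * 2;            // implicitly wrapped into an optional
}

// Small usage example.
void usage()
{
    std::optional<int> ok = f3(2);
    assert(ok && *ok == 4);  // contains a value

    std::optional<int> ko = f3(-1);
    assert(!ko);             // empty: the function "failed"
}
```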

Now the question is, how do optionals compose? The answer is: badly.

Indeed, each optional can be checked in an if statement (it has a conversion to bool) to determine whether or not the function has succeeded. This gives the following code:

boost::optional<int> result;

boost::optional<int> b = f1(3);
if (b)
{
    boost::optional<int> c = f1(4);
    if (c)
    {
        boost::optional<int> d = f2(*b, *c);
        if (d)
        {
            boost::optional<int> e = f3(*d);
            if (e)
            {
                result = f4(*e);
            }
        }
    }
}

if (result)
{
    // we can use *result
}
else
{
    // we know that something went wrong
}

These if statements nested into each other are typically what can be seen in code using several optionals in the same routine. And this feels wrong. Indeed, you can feel there is too much code, right?

What we want to do can be simply said though: continue the calculation until one function fails by returning an empty optional. But the above code looks like it is a level of abstraction too low, as it shows all the mechanics in place to implement this.

But isn’t there a way to encapsulate the if statements?

The optional monad in C++

It turns out this can be achieved by using an idea coming from functional programming, called a monad. This is used intensively in languages such as Haskell.

First off, let me make one thing clear: I’m not going to even try to explain what a monad is. Indeed, monads can’t seem to be explained simply (more about this in the famous “Monad Tutorial Fallacy” article).

There seem to be two kinds of people: those who understand monads, and those who don’t understand them yet. And there is no possible communication between the two. So as soon as you understand monads you lose all ability to explain them simply to someone. And to be honest, I’m not really sure which group I belong to, which makes the situation even more confusing for me.

The good news is, you don’t need to know Haskell or have a firm grasp of monads to understand what follows. I want to show you a very practical, C++-oriented way to deal with multiple optional<T>, inspired by monads. I discovered this in an excellent talk from David Sankel given at C++Now 2016.

The idea is to write a function able to combine an optional<T> with a function taking a T and returning an optional<U>. Indeed, this corresponds to our case, with T and U being int.

Say the optional<T> is called t, and the function f. The body of this function is then quite simple to write:

if (t)
{
    return f(*t);
}
else
{
    return boost::none;
}

This is where the if statement gets encapsulated.

Now the prototype of this function needs two considerations:

We make it an operator, rather than a function. As you will see in a moment, this makes for a nicer syntax when chaining up the calls to the various functions. We choose operator>>=. (Some use operator>>, but I propose this one because it cannot conflict with a stream operator templated on the stream, and also because it happens to be the one used in Haskell.)

The function has to be compatible with any callable type (functions, function pointers, std::function, lambdas or other function objects). For this, the only way I know is to use a template parameter. Some use a std::function, but I don’t know how they manage to pass a lambda to it.

Here is the resulting prototype:

template<typename T, typename TtoOptionalU>
auto operator>>=(boost::optional<T> const& t, TtoOptionalU f) -> decltype(f(*t))
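Putting the body and the prototype together gives the sketch below. To keep it compilable with the standard library alone, I wrote it against std::optional of C++17; as said earlier, swapping std for boost and nullopt for none gives the Boost version. The function half_if_even is a hypothetical helper that only exists to exercise the operator.

```cpp
#include <cassert>
#include <optional>

// If t holds a value, apply f to it; otherwise propagate the empty optional.
template<typename T, typename TtoOptionalU>
auto operator>>=(std::optional<T> const& t, TtoOptionalU f) -> decltype(f(*t))
{
    if (t)
    {
        return f(*t);
    }
    else
    {
        return std::nullopt;
    }
}

// Hypothetical function that can fail: halves even numbers only.
std::optional<int> half_if_even(int n)
{
    if (n % 2 != 0) return std::nullopt;
    return n / 2;
}

// Small usage example.
void usage()
{
    std::optional<int> eight(8);
    std::optional<int> seven(7);
    std::optional<int> empty;

    assert((eight >>= half_if_even) == std::optional<int>(4));
    assert(!(seven >>= half_if_even)); // the function itself fails
    assert(!(empty >>= half_if_even)); // the function is never called
}
```

One thing to keep in mind is that >>= is right-associative in C++, so a >>= f >>= g parses as a >>= (f >>= g). Chaining from left to right therefore needs parentheses, or the callables have to be nested inside one another.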

To use it we combine the optional<int> (which represents the optional<T>) returned by each function with a lambda taking an int. This int represents the T in TtoOptionalU. What happens is that if this optional is empty, the operator>>= just returns an empty optional. Otherwise it applies the next function to the value in the optional:

boost::optional<int> result = f1(3) >>= [=](int b)     // b is the result of f1(3) if it succeeds
                     { return f1(4) >>= [=](int c)     // c is the result of f1(4) if it succeeds
                     { return f2(b, c) >>= [=](int d)  // and so on
                     { return f3(d) >>= [=](int e)
                     { return f4(e);
                     };};};};

Maybe you will like it better with a different indentation:

boost::optional<int> result = f1(3) >>= [=](int b) { return
                              f1(4) >>= [=](int c) { return
                              f2(b, c) >>= [=](int d) { return
                              f3(d) >>= [=](int e) { return
                              f4(e);
                      };};};};

Compare this code with the initial trial with optionals. The if statements are gone.

But an unusual syntax has appeared. And the technology is way more complex than the old C-style version. Is this ok? If you have some experience with functional programming then you will have an easier time finding this natural. Otherwise you have to decide if the declarative style is worth it.

But whether you find this a viable option or not, I think it’s worth understanding it, because it illustrates a different programming paradigm.
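If you want to experiment with it, here is a complete sketch that compiles on its own. It makes several assumptions: it uses std::optional (C++17) rather than Boost, the bodies of f1 to f4 are invented for illustration (f3 is assumed to fail on negative input), and chain is a hypothetical wrapper so that the inputs can be varied.

```cpp
#include <cassert>
#include <optional>

// The operator discussed above, written against std::optional.
template<typename T, typename TtoOptionalU>
auto operator>>=(std::optional<T> const& t, TtoOptionalU f) -> decltype(f(*t))
{
    if (t) return f(*t);
    return std::nullopt;
}

// Illustrative bodies only: the article just declares these functions.
std::optional<int> f1(int a)        { return a + 1; }
std::optional<int> f2(int b, int c) { return b - c; }
std::optional<int> f3(int d)        { if (d < 0) return std::nullopt; return d * 2; }
std::optional<int> f4(int e)        { return e + 10; }

// Same nesting as in the article, with the inputs turned into parameters.
std::optional<int> chain(int x, int y)
{
    return f1(x) >>= [=](int b) { return
           f1(y) >>= [=](int c) { return
           f2(b, c) >>= [=](int d) { return
           f3(d) >>= [=](int e) { return
           f4(e);
           };};};};
}

// Small usage example.
void usage()
{
    assert(chain(4, 3) == std::optional<int>(12)); // every step succeeds
    assert(!chain(3, 4));                          // f3 fails, f4 is never called
}
```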

To be really fair, I have to point out that if one of these functions does not return an optional but directly an int, then you have to wrap its result into an optional, because operator>>= only expects optionals. On the other hand, such a function would not need an if in the initial example using optional.
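Here is a small sketch of that wrapping, again under assumptions: std::optional instead of Boost, an invented body for f3, and a hypothetical function add_ten that cannot fail and returns a plain int.

```cpp
#include <cassert>
#include <optional>

// The operator discussed above, written against std::optional.
template<typename T, typename TtoOptionalU>
auto operator>>=(std::optional<T> const& t, TtoOptionalU f) -> decltype(f(*t))
{
    if (t) return f(*t);
    return std::nullopt;
}

// Illustrative function that may fail.
std::optional<int> f3(int d) { if (d < 0) return std::nullopt; return d * 2; }

// Hypothetical function that cannot fail, so it returns a plain int.
int add_ten(int e) { return e + 10; }

std::optional<int> chain(int d)
{
    return f3(d) >>= [](int e)
    {
        return std::make_optional(add_ten(e)); // wrap the int for operator>>=
    };
}

// Small usage example.
void usage()
{
    assert(chain(2) == std::optional<int>(14));
    assert(!chain(-1));
}
```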

If you understood all the bits, but find that you can’t wrap your head around the global concept, it is quite all right. This is not easy. Just have a closer look at the last example, maybe try to write it yourself, and this should become clearer and clearer.

In the next post, we will see a more elaborate implementation that uses modern C++ features and leads to much cleaner calling code.