Jonathan Boccara's blog

A Case Where Using Auto Leads to Undefined Behaviour

Published October 30, 2018 - 12 Comments

C++11’s feature auto has changed the looks of C++ code. In a lot of cases, auto alleviates code from burdening information, and using it does make code simpler. So much so that using auto becomes a second nature to make code more expressive.

Should we use auto always? According to Herb Sutter guideline for C++11, yes, almost always (which has now been updated to Always in C++17).

Almost always.

Today we’ll see a case where you don’t want to use auto, because it causes undefined behaviour, in a way that is not immediate to spot (and I haven’t found thus bug described elsewhere, please point me to an existing resource if I’m wrong).

I’m not arguing against auto in the general case though, I think it does makes code flow better. But if you encounter the following case, it should save you some time to know that you shouldn’t use auto there.

The case

We have a collection of bools, idiomatically stored in a std::deque<bool> (the fact that this is idiomatic is not that glorious, but anyway) that you can think of as representing a binary number. The first elements are the most significant digits, and the last ones are the least significant digits.

We would like to do a ‘+1’ on this “binary number”, that is to say to produce the collection of bools that corresponds to that binary number + 1. To do this, we work our way up from the back of the collection, flip the current bit, and stop when it flips to 1.

For logging purposes, we print the value of the examined bit along with its position in the collection:

void flip(bool& bit)
{
    bit = !bit;
}

void increment(std::deque<bool>& bits)
{
    if (bits.empty()) return;
    if (bits.size() == 1)
    {
        flip(bits.back());
    }
    
    for (auto bitIndex = bits.size() - 1; bitIndex >= 0; --bitIndex)
    {
        auto& bit = bits[bitIndex];
        
        std::cout << "bitIndex=" << bitIndex << " value= " << bit << '\n';

        flip(bit);
        if (bit == true)
        {
            break;
        }
    }
}

If we test it by incrementing a binary number enough times so that it loops back to 0:

int main()
{
    auto number = std::deque<bool>(3);
    
    increment(number);
    increment(number);
    increment(number);
    increment(number);
    increment(number);
    increment(number);
    increment(number);
    increment(number);
}

Then the program…crashes.

Can you see why? Hint: it is because of one of the autos of the code, that isn’t doing what we’d naively expect. If you’d like to play around with the code, here is the code where the crash occurs.

The next section explains the cause of the problem, so if you’d like to think about it on your own first, maybe wait a minute before scrolling down the page.

 

 

 

One auto too far?

Done searching? The culprit is the auto in the initialization of the for loop:

void flip(bool& bit)
{
    bit = !bit;
}

void increment(std::deque<bool>& bits)
{
    if (bits.empty()) return;
    if (bits.size() == 1)
    {
        flip(bits.back());
    }
    
    for (auto bitIndex = bits.size() - 1; bitIndex >= 0; --bitIndex)
    {
        auto& bit = bits[bitIndex];
        
        std::cout << "bitIndex=" << bitIndex << " value= " << bit << '\n';

        flip(bit);
        if (bit == true)
        {
            break;
        }
    }
}

Indeed, this auto defines bitIndex to be of the type of bits.size() - 1, which is itself the type of bits.size(). Which in practice is often of type size_t, which is unsigned.

So bitIndex is unsigned. So if we pass in 1 1 1 to increment, the for loop works its way from the back and all the way up to the beginning of the collection. bitIndex is then 0. The for loop performs an ultimate --bitIndex, which looks like it sets bitIndex to -1 and make the loop stop, but there is no such thing as -1 in the world of the unsigned.

Therefore, --bitIndex sets --bitIndex to a very, very high integral number (the highest possible unsigned number, like the mind-bogglingly high 18446744073709551615 on the implementation I tested), which is greater than 0, so the loops rolls on! It then tries to access an element of the collection that is way, way past its end (and even way past the end of your RAM, and the room that your computer is sitting in).

This causes undefined behaviour, which occurs in the form of a seg fault in this case. I’ve tried an analogous use case using std::vector instead of std::deque (therefore, not on booleans), and the program didn’t crash. Instead, it displayed the very big numbers. But this is still standard C++ since undefined behaviour can be anything, by definition.

To fix the issue, we can simply replace this auto with int, because this is really what we want here:

void increment(std::deque<bool>& bits)
{
    if (bits.empty()) return;
    if (bits.size() == 1)
    {
        flip(bits.back());
    }
    
    for (int bitIndex = bits.size() - 1; bitIndex >= 0; --bitIndex)
    {
        auto& bit = bits[bitIndex];
        
        std::cout << "bitIndex=" << bitIndex << " value= " << bit << '\n';

        flip(bit);
        if (bit == true)
        {
            break;
        }
    }
}

Shouldn’t we avoid for loops in the first place?

The point here was to illustrate this risk with auto. But going slightly off-topic, was this code well-designed in the first place? We know that we should try to avoid for loops and that using STL algorithms make code more robust and expressive, right?

There is one thing that makes using algorithms difficult here: we’re accessing the position of the current element in the collection here (bitIndex). And STL algorithms don’t play well with positions. There are techniques to work around using a raw loop for that though, that we see in a dedicated article (see How to Access the Index of the Current Element in a For Loop), but it requires to write a bit of specific code for that.

If we didn’t have to access the position of the current element, there is a quick fix we could do to begin with: using reverse iterators instead of indexes:

void increment(std::deque<bool>& bits)
{
    if (bits.empty()) return;
    if (bits.size() == 1)
    {
        flip(bits.front());
    }
    
    for (auto bit = rbegin(bits); bit != rend(bits); ++bit)
    {
        flip(*bit);
        if (*bit == true)
        {
            break;
        }
    }
}

And using auto is fine now, because it resolves to an iterator type, no longer an unsigned number.

But a better solution would be to go all the way with STL algorithms! Which is off-topic for this post on auto, but right on topic for a future post.

Stay tuned!

You may also like

Don't want to miss out ? Follow:   twitterlinkedinrss
Share this post!Facebooktwitterlinkedin

Comments are closed