Jonathan Boccara's blog

Move iterators: where the STL meets Move semantics

Published April 25, 2017 - 3 Comments

Daily C++

In C++11, a host of new features were introduced in the language and the Standard Library, and some of them work in synergy. Move iterators are an example of how the STL collaborates with move semantics, to allow expressing several important concepts in a very well integrated piece of code.

Well, almost. By using the native features only, we don’t get the most out of this combinations of concepts. But by throwing in a pinch of ranges (which are expected to be the next version of the language, and are already available in Eric Niebler’s ibrary) the picture really clears up to show an impressive expressiveness in C++ code.

Prerequisites about move semantics in C++

To understand move iterators, you need to understand move semantics first. If you’re already familiar with these then you can safely skip over to the next section. Otherwise, here I am presenting just enough about move semantics in order to understand move iterators.

Before move semantics appeared, there was only one convention in C++ to instantiate an object from another object of the same type, and it was by making a copy out of it:

class MyType
{
public:
    MyType(MyType const& otherObject) // copy constructor
    {
        // code that performs the copy of
        // otherObject into this object
    }
    ...

Note that the source object (the otherObject parameter) is const. It makes sense because to make a copy, the source object is just used as a model and does not need to be modified.

The concept of copying is absolutely fine, and widely used.

Except when the source object won’t be used again, in which case making a copy is not the best solution.  And if, for some reason, the transfer of data could be made faster by modifying the source then it would be useful to take advantage of it.

It turns out that modifying the source object sometimes allows for a faster data transfer. An std::string for example typically stores its characters into a dynamically allocated array (if the string is too long to use the small string optimization, that is). And for the string that is being constructed, it is much faster to take ownership of the array of the source string rather than allocating its own array, like it would do in a copy.

And to flag objects as “disposable”, C++11 introduces r-value references, tagged with &&:

class MyType
{
public:
    MyType(MyType && otherObject) // move constructor - note the absence of const
    {
        // code that performs a fast transfer
        // of data but may modify otherObject
    }
    ...

r-value references can be created either automatically by the language, like on the temporary object returned by value from a function. Or they can be created by an explicit action from the developer, by using std::move:

std::string s;
std::string sByCopy = s; // calls the copy constructor
std::string sByMove = std::move(s); // call the move constructor

std::move does a cast into r-value reference. Indeed, as explained in Item 23 of Scott Meyers’s Effective Modern C++, std::move doesn’t actually move anything, but rather it orients the execution towards the move constructor by casting the source object into an r-value reference.

Note that all that we have seen on constructor also works for the assignment operator (operator=), for objects that have already been constructed.

The move iterator

The purpose of the move iterator

The purpose of the move iterator is to allow the STL to move the objects it manipulates, instead of copying them.

Indeed, the STL makes copies by default. In the following example:

std::vector<std::string> source = { "Move", "iterators", "in", "C++" };
std::vector<std::string> destination(begin(source), end(source));

…displaying (*) the contents of the container at the end of  this code outputs:

Source contains: "Move" "iterators" "in" "C++"
Destination contains: "Move" "iterators" "in" "C++"

destination contains copies of the elements of source. (For more about this type of container construction, read Inserting several elements into an STL container efficiently.)

Using move iterators (which we’ll see in just a moment) would rather lead to the following output:

Source contains: "" "" "" ""
Destination contains: "Move" "iterators" "in" "C++"

where each string is still present in the container, but with its contents moved away from it.

move iterators

Note that it doesn’t do the same thing as std::move on the vector:

std::vector<std::string> destination = std::move(source);

which moves the whole vector:

Source contains:
Destination contains: "Move" "iterators" "in" "C++"

How to use the move iterator

The move iterator wraps another iterator, and returns an r-value reference of what the wrapped iterator returns when it is dereferenced.

When dereferenced (with * or ->), STL containers (like vectors) iterators return a reference to the element they point to. Dereferencing a move iterator has the equivalent effect of calling std::move on the reference returned by the wrapped iterator, to convert it into an r-value reference.

Let’s illustrate with an example. std::move_iterator is itself a class template whose template parameter is the type of the iterator that it wraps. To avoid writing out the template types in calling code, std::make_move_iterator will make the type deduction for you:

std::vector<std::string> source = { "Move", "iterators", "in", "C++" };
std::vector<std::string> destination(std::make_move_iterator(begin(source)),
                                     std::make_move_iterator(end(source)));

outputs:

Source: "" "" "" "" 
Destination: "Move" "iterators" "in" "C++"

Each element is still present in the source vector, but its contents has been moved away from it.

Getting deeper into the subject, let’s now observe that move iterators have two issues:

  • they can lose your data if they are slightly misused
  • they make a hell of a lot of code for expressing a simple thing

Don’t shoot your data in the foot

(Admittedly, it doesn’t make sense to shoot someone in the foot. But since shooting oneself in the foot has become such a wildly used expression to designate misuses of C++ features leading to bugs, please allow me to use this idiom in that sense 🙂 )

There is a way to lose your data when using move iterators. The idea is that if the elements in source are moved to some other place than destination, then in the end they are neither in source nor in destination so they are effectively lost.

Let’s see an example:

std::vector<std::string> source = { "Move", "iterators", "in", "C++" };
std::vector<std::string> destination;

std::copy_if(std::make_move_iterator(begin(source)),
             std::make_move_iterator(end(source)),
             std::back_inserter(destination),
             [](std::string const& word){ return word.length() == 4; });

std::copy_if is an STL algorithm that iterates over the source collection and copies the elements that satisfy a predicate over to the destination. But here we use move iterators, so the input of the algorithm become r-value references.

In your opinion, will the source elements be moved inside the predicate? If they are, they will be lost since the predicate won’t give them back. Take a moment to think about this and click to find out what the above code outputs:

Source: "" "iterators" "in" "C++" 
Destination: "Move"

Here the copy_if on move iterators has transformed into a sort of “move_if”, which kind of makes sense. At least no data has been lost.

And the reason why the data was not lost is because it wasn’t moved into the predicate in the first place: since the predicate takes a reference, no object was move-constructed (nor constructed at all) in the predicate.

But what if we change the signature of the predicate (look on the last line), by taking the elements by value instead of by reference to const?

std::vector<std::string> source = { "Move", "iterators", "in", "C++" };
std::vector<std::string> destination;

std::copy_if(std::make_move_iterator(begin(source)),
             std::make_move_iterator(end(source)),
             std::back_inserter(destination),
             [](std::string word){ return word.length() == 4; });

What do you think the output will be this time? Click and check if you got it right:

Source: "" "" "" "" 
Destination: ""

All the data has been lost! This because the predicate moves in the elements, and doesn’t give them back.

In summary, you want to be careful about this sort of problems when using the move iterator.

One step further with ranges

The usages of move iterators that we saw produce a lot of code to express a very simple thing, which is using an r-value reference of the elements instead of the elements themselves. So it is legitimate to expect a simple code to express it, isn’t it?

What makes the code verbose here is that it is too low in terms of levels of abstraction. And as we often come across it, good code mostly comes down to respecting levels of abstraction. One way to raise the levels of abstraction of iterators is to encapsulate them into a range. (If you want to know more about ranges, have a look at Ranges: the STL to the Next Level.)

The range-v3 library, which is the basis for the Standard proposal on ranges, includes a move view, which does exactly what move iterators aim at doing, but can be expressed in a much more simple way:

source | view::move;

This resulting range can be used in an algorithm and will then move the source elements when queried. But be careful that it won’t prevent from losing the data in the wrong situations as shown above.

Related articles:

(*) Here is the display code:

std::cout << "Source: ";
for (auto const& w : source) std::cout << '"' << w << '"' << ' ';
std::cout << "\nDestination: ";
for (auto const& w : destination) std::cout << '"' << w << '"' << ' ';
Don't want to miss out ? Follow:   twitterlinkedinrss
Share this post!Facebooktwitterlinkedin

Comments are closed