Jonathan Boccara's blog

A smart iterator for inserting into a sorted container in C++

Published March 17, 2017 - 5 Comments

Smart iterators add great potential to writing expressive code with the STL in C++. And the ones that are proposed natively work particularly well with vectors and with other sequence containers such as deque, list and string.

But the situation is not as good for associative containers, such as maps and sets (or their flat- non-standard counterparts). Indeed, using the native smart iterators is cumbersome and lacks some functionalities. In this 2-post series, I want to propose additions that aim at fixing this situation and letting us write more expressive code when adding elements to an associative container, which is a operation encountered quite frequently in day-to-day code. Of course, your feedback would be very important in the whole process.

To get a grasp of how smart iterators work with the STL, we start by examining std::back_inserter, one of those that work well with vectors (if you already know it then you may want to skip the first section, although its case its examined in meticulous details). Then we move on to maps and sets, describe a quick state of the existing standard components, and propose new ones to write expressive code more conveniently.

This series contains:

Appending elements to a vector

std::back_inserter generates an output iterator that binds to a container, and does a push_back into this container  every time it is assigned to. This relieves the programmer from the sizing of the output.

Here is an example of how std::back_inserter can be used:

std::vector<int> v = { 1, 2, 3, 4, 5 };
std::vector<int> results;

std::copy(begin(v), end(v), std::back_inserter(results));

Here the algorithm std::copy assigns elements from v to the result of dereferencing the iterator passed via the back_inserter. But std::back_inserter generates an iterator that does more than just dereferencing: when you assign through it, it calls a push_back on results, passing on the elements of v one after one. So that you don’t have to worry about results being big enough in advance. Smart, right?

We would stop here if it was just about using std::back_inserter, but the purpose of this post is to write new smart output iterators. So let’s dissect std::back_inserter to see what it has in the guts.

First, note that it is not itself an iterator, but rather a function that generates an iterator of type std::back_insert_iterator. Since std::back_insert_iterator is a template class (templated on the Container), we need a function template to generate it in order to deduce template arguments, otherwise we would have to write them out explicitly at call site (this constraint should be removed in C++17 with template argument deduction for class constructors):

template<typename Container>
std::back_insert_iterator<Container> back_inserter(Container& c);

So the question is, how does std::back_inserter_iterator work? Here is an excerpt of the class where the central thing happens:

back_insert_iterator<Container>& operator* () { return *this; }
back_insert_iterator<Container>& operator++ () { return *this; }

back_insert_iterator<Container>& operator= (const typename Container::value_type& value)
{
    container->push_back(value);
    return *this;
}

The iterator binds itself to the container at construction, and dereferencing and advancing do essentially nothing but returning the iterator itself. This has the advantage that the iterator keeps control over operator=, to call a push_back on the container.

Adding data to a map

There is a counterpart to std::back_inserter to add elements to an std::map (or an std::set): it is std::inserter. Indeed back_inserter cannot be used on a map or a set because they don’t have a push_back method. This makes sense: since they guarantee to keep their elements sorted, you can’t just decide to puts new elements at the end. So associative containers provide an insert method, and std::inserter does pretty much the same thing as std::back_inserter, except is calls the insert method instead of push_back.

But std::inserter shows two flaws when used with maps: it is cumbersome, and it lacks functionality.

Improving usability with sorted_inserter

First, the usability problem: std::inserter forces you to give a position where an element should be inserted:

template<typename Container>
std::insert_iterator<Container> inserter(Container& c, typename Container::iterator position);

This is all well for a vector, where you have to decide for a position. Indeed it could make sense to insert an element anywhere in a vector. But one of the purposes of a map is to be sorted, so the map should take care of deciding where to position a new element, so that it remains sorted! It is certainly not the programmer’s job to decide this.

Well, if you happened to know where the new element should be put, then you could save this amount of work to the map, by providing a hint. This is why the insert method of a map has several overloads, including one with a hint parameter:

std::pair<iterator,bool> insert(const value_type& value);
iterator                 insert(iterator hint, const value_type& value);

But whether or not you provide a hint should be left to the choice of the programmer.

And std::inserter forces you to provide a hint. But sometimes you don’t have a clue. Imagine that you want to add the contents of an unsorted vector into a set. Then you don’t have one position where all the elements should go. And we find ourselves passing some arbitrary “hint” because the inserter iterator forces us to, tipically the begin or the end of the set, thus cluttering the code with irrelevant information. Note the unnecessary results.end() in he following example:

std::vector<int> v = {1, 3, -4, 2, 7, 10, 8};
std::set<int> results;

std::copy(begin(v), end(v), std::inserter(results, end(results)));

One solution to fix this is to create a new smart iterator that does essentially the same thing as std::inserter, but that does not force its users to provide a hint. Let’s call this sorted_inserter.

template <typename Container>
class sorted_insert_iterator : public std::iterator<std::output_iterator_tag,void,void,void,void>
{
protected:
  Container* container_;
  boost::optional<typename Container::iterator> hint_;

public:
  typedef Container container_type;
  explicit sorted_insert_iterator (Container& container)
    : container_(&container), hint_(boost::none) {}
  sorted_insert_iterator (Container& container, typename Container::iterator hint)
    : container_(&container), hint_(hint) {}
  sorted_insert_iterator<Container>& operator= (const typename Container::value_type& value)
    {
        if (hint_)
            container_->insert(*hint_,value);
        else
            container_->insert(value);
        return *this;
    }
  sorted_insert_iterator<Container>& operator* () { return *this; }
  sorted_insert_iterator<Container>& operator++ () { return *this; }
  sorted_insert_iterator<Container> operator++ (int) { return *this; }
};

This iterator can be instantiated with helper functions for deducing template parameters:

template <typename Container>
sorted_insert_iterator<Container> sorted_inserter(Container& container)
{
    return sorted_insert_iterator<Container>(container);
}

template <typename Container>
sorted_insert_iterator<Container> sorted_inserter(Container& container, typename Container::iterator hint)
{
    return sorted_insert_iterator<Container>(container, hint);
}

The main difference with std::inserter is that the hint is not mandatory. This is easily modeled by using an optional (from boost for the moment, from std in C++17). If the hint is provided then we use it, otherwise we let the container decide how to position the inserted element. Note that the operator= taking an r-value reference has been omitted for clarity in this post, but we write by simply replacing the usages of value by std::move(value).

Here is how sorted_inserter would be used in the above example:

std::vector<int> v = {1, 3, -4, 2, 7, 10, 8};
std::set<int> results;

std::copy(begin(v), end(v), sorted_inserter(results));

The code for sorted_inserter is available on GitHub.

I yet have to benchmark the performance of std::inserter versus sorted_inserter, to measure whether passing a wrong hint is better or worse than passing none at all. This will likely be the topic of a dedicated post.

This iterator would let you insert new elements in a sorted container. But what if the element you are trying to insert is already present in the container? The default behaviour in the STL is to not do anything. But what if you wanted to aggregate the new element with the one already in place? This is the topic of the next post in this series.

Related articles:

Don't want to miss out ? Follow:   twitterlinkedinrss
Share this post!Facebooktwitterlinkedin

Comments are closed