Jonathan Boccara's blog

A smart iterator for aggregating new elements with existing ones in a map or a set

Published March 21, 2017 - 4 Comments

One thing that is cruelly lacking with std::inserter is that it can do just this: inserting. In some situations this is not enough, in particular for a map: what if an element with the same key is already there? std::inserter, since it calls std::map::insert, will not do anything at all in this case. But maybe we would like to replace the current element with the new one? Or maybe a more complex aggregation behaviour is needed, like adding the values together for example? This last case has been encountered in the project of Coarse Grain Automatic Differentiation when composing derivatives to multiple variables.

This post is part of a series on smart iterators in sorted containers:

Said differently, we need a even smarter iterator, to which you could describe what to do when trying to insert elements with keys already present in the map. The most generic expression I found was to provide an aggregator, that is, a function that describes how to merge two values, for elements having the same key. This would let new elements being “inserted” into the map regardless of whether or not their key is already present, and still keep the unicity of the key in the map (so this solution is effectively different from using a multimap).

Here is how map_aggregator could be implemented:

template<typename Map, typename Function>
class map_aggregate_iterator : public std::iterator<std::output_iterator_tag, void, void, void, void>
{
public:
    map_aggregate_iterator(Map& map, Function aggregator) : map_(map), aggregator_(aggregator) {}
    map_aggregate_iterator operator++(){ return *this; }
    map_aggregate_iterator operator*(){ return *this; }
    template<typename KeyValue>
    map_aggregate_iterator& operator=(KeyValue const& keyValue)
    {
        auto position = map_.find(keyValue.first);
        if (position != map_.end())
        {
            position->second = aggregator_(position->second, keyValue.second);
        }
        else
        {
            map_.insert(position, keyValue);
        }
        return *this;
    }
    
private:
    Map& map_;
    Function aggregator_;
};

Here is a helper function to instantiate it and deduce template parameters:

template<typename Map, typename Function>
map_aggregate_iterator<Map, Function> map_aggregator(Map& map, Function aggregator)
{
    return map_aggregate_iterator<Map, Function>(map, aggregator);
}

This has several major differences with std::inserter:

  • map_aggregator embarks a aggregator function in its constructor,
  • operator=  aggregates the new value into the existing element by using the aggregator function, if the key is already present in the collection.
  • Like sorted_inserter presented in the previous post of this series, you don’t have to pass a hint. (In fact you could pass it if you knew it, but to alleviate the code in this post I am not showing this functionality here.)

Here is a how map_aggregator can be used:

std::vector<std::pair<int, std::string>> entries = { {1, "a"}, {2, "b"}, {3, "c"}, {4, "d"} };
std::vector<std::pair<int, std::string>> entries2 = { {2, "b"}, {3, "c"}, {4, "d"}, {5, "e"} };
std::map<int, std::string> results;

std::copy(entries.begin(), entries.end(), map_aggregator(results, concatenateStrings));
std::copy(entries2.begin(), entries2.end(), map_aggregator(results, concatenateStrings));

// results contains { {1, "a"}, {2, "bb"}, {3, "cc"}, {4, "dd"}, {5, "e"} };

Here the first call to map_aggregator is not strictly necessary, since the collection results is empty. It could be replaced by an simple std::inserter or, more to the point, by a sorted_inserter presented in the first post of this series.

What about sets?

The above aggregator has been designed to work with maps, that contain pairs of keys and values. But sometimes the key of an element is embedded inside the element, like with a reference number that is a member of an object for example. In this case you may want to use a set with a customized comparison based on the subpart of the element that represents the key.

We can then define another smart iterator for aggregating into a set, with much the same logic as the one for maps, the main difference lying in the operator=:

set_aggregate_iterator& operator=(Value const& value)
{
    auto position = set_.find(value);
    if (position != set_.end())
    {
        auto containedValue = *position;
        position = set_.erase(position);
        set_.insert(position, aggregator_(value, containedValue));
    }
    else
    {
        set_.insert(position, value);
    }
    return *this;
}

The thing with sets is that they don’t allow their values to be modified (some platforms let you get away with it but relying on that prevents code from being portable). Therefore, we have to remove the old value and then add the aggregated one. This is what operator= does here when it finds that the element was there already.

For clarity, the rest of the implementation of this inserter in sets is omitted in the writing of this post, but it’s essentially the same as the one for maps.

To see the entire code of the components presented here, you can head over to the dedicated GitHub repository.

 

Over to you

Do you find these components of this series useful? Have you encountered the problems they are solving? How would you have gone about solving them differently?

Whether you are a new reader on Fluent C++ or a regular one, your feedback matters to me. And not only on this particular series by the way. Depending on the size and visibility you want your feedback to have you can drop a comment below, or use email or Twitter to get in touch directly. Hoping to hear from you!

Related articles:

Don't want to miss out ? Follow:   twitterlinkedinrss
Share this post!Facebooktwitterlinkedin

Comments are closed