Jonathan Boccara's blog

Which One Is Better: Map of Vectors, or Multimap?

Published April 10, 2018 - 11 Comments

While advising on how to make code more expressive on the SFME project, I came across an interesting case of choosing the right data structure, which I’ll share with you with the permission of the authors of the projects.

We had to associate a key with several values, and perform various operations. Should we use a map of vectors, or is a multimap more appropriate? Let’s see the case in more details, and compare the two solutions.

The case: an event mediator

map vector multimap

The interface for this event system has three functions:

1- void subscribe(EventReceiver const& receiver, EventID eventID)
This is the method to register a receiver to a certain type of event. When this type of event occurs, the event manager notifies the EventReceiver on its (virtual) method reactTo.

2- void emit(Event const& event) const
This method gets called by the sender of an event when an event occurs. The method calls the reactTo method of all the clients that registered for its event ID.

3- bool isRegistered(EventReceiver const& receiver) const
At any time, we can query the event manager to know whether a given EventReceiver has subscribed to the it (on any event).

(Note that this is a simplified version of the spec for SFME, so that we can focus on the data structure without spending more time understanding the rest of the components).

Given that specification, what data structure should the event manager use to represent the event IDs and the receivers?

It sounds natural to somehow associate event IDs with receivers, by using a map. But we can’t just use std::map<EventID, Receiver const*>, because an event ID can have more than one receiver.

We’re going to explore two alternative designs and see which one fits most for our event manager:

  • a map of vectors: std::map<EventID, std::vector<EventReceiver const*>>
  • a multimap: std::multimap<EventID, EventReceiver const*>

Design 1: A map of vectors

This is probably the most natural design: each event ID can have several receivers, so we map an event ID to a collection of receivers:

class EventMediator
{
public:
    void subscribe(EventReceiver const& receiver, EventID eventID);
    void emit(Event const& event) const;
    bool isRegistered(EventReceiver const& receiver) const;

private:
    std::map<EventID, std::vector<EventReceiver const*>> receiversRegistry_;
};

How would the code of the event manager’s methods look like with that representation? Let’s see the implementation of the three methods: subscribe, emit and isRegistered.

subscribe

The subscribe method finds the entry of the map that corresponds to the event ID, and adds a receiver to the corresponding vector or receivers:

void EventMediator::subscribe(EventReceiver const& receiver, EventID eventID)
{
    receiversRegistry_[eventID].push_back(&receiver);
}

Simple enough.

emit

The emit method picks out the collection of receivers that correspond to the event ID of the event happening, and invoke them all on their reactTo method:

void EventMediator::emit(Event const& event) const
{
    auto eventID = event.getEventID();
    auto const& receivers = receiversRegistry_[eventID];
    for (auto const& receiver : receivers)
    {
        receiver.reactTo(event);
    }
}

Simple too. But this time, the code doesn’t compile and triggers the following error:

error: no viable overloaded operator[] for type 'const std::map<EventID, std::vector<const EventReceiver *> >'

Behind its rough shell, what this error message is trying to tell us is that we want emit to be a const method, but operator[] is not  const on the map. Indeed, if the map doesn’t have an entry corresponding to the queried event ID, operator[] will insert it for us and return a reference to it.

The code to fix the method is less pleasant to the eye:

void EventMediator::emit(Event const& event) const
{
    auto eventID = event.getEventID();
    auto receiversEntry = receiversRegistry_.find(eventID);
    if (receiversEntry != end(receiversRegistry_))
    {
        auto const& receivers = receiversEntry->second;
        for (auto const& receiver : receivers)
        {
            receiver->reactTo(event);
        }
    }
}

It consists of searching for the event ID, and if we find it in the map then we iterate over the corresponding collection. Note that the nestedness of this piece of code reflects the nestedness of a vector inside a map.

isRegistered

The isRegistered method checks if a receiver is registered somewhere in the event manager. Since the map isn’t sorted by receivers but only by event IDs (because that’s its key), we need to perform a linear search across the whole structure: check the first vector, then the second, and so on:

bool EventMediator::isRegistered(EventReceiver const& searchedReceiver) const
{
    for (auto const& receiversEntry : receiversRegistry_)
    {
        auto const& receievers = receiversEntry.second;
        for (auto const& receiver : receievers)
        {
            if (receiver == &searchedReceiver)
            {
                return true;
            }
        }
    }
    return false;
}

Here also, the fact that the data structure is nested leads to a nested code.

The implementation of subscribe is fine, but those of emit and isRegistered could use some simplification, in particular by making them less nested and more straightforward.

Let’s flatten out our data structure by using a multimap instead of a map of vectors.

Design 2: a multimap

A multimap?

What is a multimap, to begin with? It’s like a map, except that a map can only have at most one entry for each key, whereas a multimap can have several entries with equivalent keys.

To illustrate, let’s try to add several entries that have the same key to a std::map:

auto entries = std::map<int, std::string>{};

entries.insert(std::make_pair(1, "one"));
entries.insert(std::make_pair(1, "uno"));

entries.insert(std::make_pair(2, "two"));
entries.insert(std::make_pair(2, "dos"));

entries.insert(std::make_pair(3, "three"));
entries.insert(std::make_pair(3, "tres"));

If we display what the map contains with the following code:

for (auto const& entry : entries)
{
    std::cout << entry.first << '-' << entry.second << '\n';
}

Here is what the code outputs:

1-one
2-two
3-three

For each of the keys (1, 2, 3) there is one entry in the map. Now if we replace the map by a multimap:

auto entries = std::multimap<int, std::string>{};
...

Then the code now outputs:

1-one
1-uno
2-two
2-dos
3-three
3-tres

There are several entries with equivalent keys.

Replacing the map of vectors with a multimap

In our case, we can use a multimap to associate event IDs with receivers, because some event IDs can be associated with several receivers:

class EventMediator
{
public:
    void subscribe(EventReceiver const& receiver, EventID eventID);
    void emit(Event const& event) const;
    bool isRegistered(EventReceiver const& receiver) const;

private:
    std::multimap<EventID, EventReceiver const*> receiversRegistry_;
};

Let’s now rewrite our three methods subscribe, emit and isRegistered to see if this new data structure simplifies their implementation.

subscribe

First of all, the standard multimap doesn’t have an operator[]: indeed, it is possible that more than one value comes out of a lookup into the multimap. So we have to use the insert method:

void EventMediator::subscribe(EventReceiver const& receiver, EventID eventID)
{
    receiversRegistry_.insert(std::make_pair(eventID, &receiver));
}

Which is arguably not as elegant as the implementation using operator[] that we had with the map of vectors. Let’s see how emit and isRegistered do.

emit

Here is the code for the emit function to work with the multimap, we’ll go through it line by line:

void EventMediator::emit(Event const& event) const
{
    auto eventID = event.getEventID();
    auto receiversEntries = receiversRegistry_.equal_range(eventID);
    for (auto receiverEntry = receiversEntries.first; receiverEntry != receiversEntries.second; ++receiverEntry)
    {
        auto const& receiver = receiverEntry->second;
        receiver->reactTo(event);
    }
}

EDIT: as observed by Romain Charbit in the comments section, an std::for_each combined with C++14’s auto in lambdas makes a more concise version:

void EventMediator::emit(Event const& event) const
{
    auto eventID = event.getEventID();
    auto receiversEntries = receiversRegistry_.equal_range(eventID);
    std::for_each(receiversEntries.first, receiversEntries.second, [&event](auto receiverEntry const&)
    {
        auto const& receiver = receiverEntry->second;
        receiver->reactTo(event);
    });
}

In case you’re not yet familiar with the interface of multimap, here is a line-by-line explanation of the above of code:

auto receiversEntries = receiversRegistry_.equal_range(eventID);

When we query a multimap for a key, we don’t expect to get a value back. Indeed, since the multimap could hold several entries for that key, we get a range of entries, which is a slice of the data inside of the multimap:

equal_range multimap C++

This sliced could be empty if there wasn’t any entry corresponding to the queried key.

for (auto receiverEntry = receiversEntries.first; receiverEntry != receiversEntries.second; ++receiverEntry)

While it makes sense that equal_range returns a range, the format of the range returned by the STL here is… not as natural. We would have expected a structure that represents a range, that would have a begin and end interface, but instead equal_range returns a pair of iterators. The first represents the beginning of the range and the second the end.

This integrates badly with the for loop (and with everything else for that matter), hence the complicated above line to simply express “iterate over that range”. Anyway, that’s an issue with the STL that we had already come across when discussing equal_range to search in STL containers.

auto const& receiver = receiverEntry->second;

receiverEntry is an iterator to an entry in the multimap. The multimap contains std::pairs of event IDs and receivers, so to get the receiver we take the second of that entry.

receiver->reactTo(event);

We finally notify the receiver with the event.

Even with the glitch with the interface returned by equal_range, this code is overall more straightforward than the emit we had with the map of vectors. Indeed, we benefit that the structure is not nested to have code that is not nested either.

isRegistered

As in with the map of vectors, our data structure is still not sorted by receiver. So we have to traverse it linearly and search for a given receiver.

But this time, we only have one layer to traverse, which makes it easier to use an STL algorithm. We could use std::find_if, but since we don’t need the location of the searched receiver but only whether it is there or not, std::any_of will let us go straighter to the point:

bool EventMediator::isRegistered(EventReceiver const& queriedReceiver) const
{
    auto hasQueriedReceiver = [&queriedReceiver](auto const& receiverEntry){ return receiverEntry.second == &queriedReceiver; };
    return std::any_of(begin(receiversRegistry_), end(receiversRegistry_), hasQueriedReceiver);
}

Or, with a range-based for loop:

bool EventMediator::isRegistered(EventReceiver const& queriedReceiver) const
{
    for (auto const& receiverEntry : receiversRegistry_)
    {
        if (receiverEntry.second == &queriedReceiver)
        {
            return true;
        }
    }
    return false;
}

Which are both simpler than the nested version of the map of vectors.

Note that the multimap is probably slightly larger in memory than the map of vectors, because the map of vectors only stores one key for each type of event. But until your memory profiler proved that this extra space is indeed significant (keys are often small, and you may not know the number of values per equivalent key), don’t refrain from writing the simplest code.

Map of vectors or multimap?

Even if the map of vectors is maybe more natural to think of at first, the multimap leads to simpler code as soon as we need to iterate over the data. This advantage of the multimap come from the fact that it is not a nested structure, contrary to the map of vector.

But does a nested structure always has to lead to nested code? Not necessarily. If you can abstract the fact that it’s a nested structure behind a range interface, then the code can behave as if it operated on a flat structure.

One thing that performs this type of abstraction is the join range adaptor in range-v3. It can view a vectors of vectors as a flat range that features smart iterators that jumps off a vector to the next and carry out a full traversal of the nested collection as if it was flat.

join works on vectors of vectors. But can it work on maps of vectors? There is an additional level of complexity.

Anyway, until we have that sort of components in production, nested structures produce nested code, and flat structures produce flat code. The apple doesn’t fall far from the tree.

Thanks to Roman for asking my advice to make code more expressive on the SFME project.

You may also like

Don't want to miss out ? Follow:   twitterlinkedinrss
Share this post!Facebooktwitterlinkedin

Comments are closed