Jonathan Boccara's blog

Overview of std::map’s Insertion / Emplacement Methods in C++17

Published December 11, 2018 - 2 Comments

Today’s guest post is written by @walletfox, one of the hitchhikers in the C++ galaxy, trying to navigate its dark corners by writing articles, creating Mostly Harmless cheat sheets and observing the following rules: “Don’t panic! Always carry a cheat sheet next to the towel. So long and thanks for all the fish.”.

Interested in writing on Fluent C++ too? Submit your post!

TL;DR Go ahead and try out the C++17 emplacement / insertion methods. They offer a tangible improvement in terms of expressiveness and code safety.

insert_or_assign try_emplace map C++

Here are examples of code using them.

C++17 introduced two new insertion / emplacement methods for std::map, namely try_emplace() and insert_or_assign().

At first sight, this might seem like a cause for concern. Why new insertion methods? Don’t we have already plenty of them?

Nevertheless, by studying the problem closer we might come to the conclusion that the introduction of the new methods try_emplace() and insert_or_assign() makes a convincing case and that they bring us closer to more expressive and safer code.

To understand how we ended up with this pool of insertion / emplacement methods, we are going to use a simple map<int, std::string>. Later on, we are going to look at a more complex scenario with std::map<int, A> where A is a struct with two member variables (std::string, int).

For logging purposes I have provided all the special member functions for the struct A. In production, we should apply The Rule of Zero (C++ Core Guidelines, C.20: If you can avoid defining default operations, do) and let the compiler generate the special member functions for us.

Unless stated otherwise, the code was compiled with gcc 9.0.0. and clang 8.0.0, -O2 (HEAD at the time of writing).

Pre-C++11 days: operator[] and insert()

Before we can discuss how exactly the new  C++17 methods try_emplace() and insert_or_assign() bring improvement, we are going to travel back to pre-C++11 times when all we had was operator[] and insert().

The selling point of operator[] was its simplicity of use, which unlike insert() didn’t need to use std::make_pair() or other verbose constructs to pass around function arguments.

insert_or_assign try_emplace map C++

Fig. 1: The difference between the original insertion methods, C++03

Convenience aside, what is more important, operator[] differs from insert() in how it handles a situation when an element with the given key already exists in the map. While operator[] simply overwrites the corresponding value, insert() doesn’t.
Another important difference lies in the requirements on the value_type, namely, operator[] requires a DefaultConstructible value_type, which means that if we explicitly or implicitly disable the default constructor of the struct A, the code won’t compile. Notice that, unlike insert(),  operator[] calls different special member functions, i.e. the call to the default constructor is followed by the call to copy/move assignment operator.
Last but not least, these methods differ in the return information they provide. With operator[], we have no way of finding out whether the insertion actually took place, unless we perform a prior lookup. On the other hand, insert() returns a pair<iterator, bool> that provides us with this information.

Most recently, this has been simplified thanks to structured bindings introduced in C++17.

C++11: move semantics and in-place construction

Further down the road, we got C++11 which introduced move semantics, and both operator[] and insert(), i.e. the original insertion methods, benefited from this in terms of performance. In addition, C++11 introduced emplace() which has the same functionality as insert() but additionally, enables in-place construction.

insert_or_assign try_emplace map C++

Fig. 2: Introduction of emplace(), C++11

In-place construction is a technique that bypasses construction and destruction of temporaries by constructing the objects directly in the map. A notable attraction of emplace() is that we can do away either with std::make_pair() or the extra pair of {} that needed to be used with insert(). Emplacement is accomplished via perfect forwarding and variadic templates.

The jury is still out on whether emplace() should be generally preferred to insert(). The potential performance gain is dependent on the types involved and specific library implementations. While Scott Meyers is in favor of emplace() (Effective Modern C++, Item 42, what a coincidence!), other C++ experts/guidelines are in favor of insert(), most notably Bjarne Stroustrup and Abseil Common Libraries. The reason for that is code safety.

Clang-tidy uses a mixed approach with a general preference for emplacement with the exception of std::unique_ptr and std::shared_ptr where emplacement could lead to memory leaks:

Let’s get back to our example and study the effect of different insertion/emplacement constructs. Though this will provide us with some observations, keep in mind that this is a specific example. The types and specific libraries involved are likely to cause differences and it would be counterproductive to make general conclusions. If in doubt, measure.

Now that we have listed some common alternatives, notice that scenario (1) resulted in a copy constructor call with both compilers. This is because of copy-list-initialization.

If performance is of concern, we can disable this alternative by marking the multi-argument constructor of struct A explicit. This code will then fail to compile:

It appears that omitting make_pair() with emplace() in case (5) helped us dispense with one move construction, but we can do even better—this is demonstrated in case (7) where we passed std::piecewise_construct and std::forward_as_tuple as arguments to emplace() resulting in a single constructor and destructor call, completely avoiding intermediate copies and moves!

The verbosity of emplacement with piecewise construct is off-putting, therefore you might appreciate C++17’s try_emplace() which will do away with the gobbledegook. This is going to be demonstrated in the next section.

For reasons of completeness, I am also listing scenarios where we move from L-values. As you can see,  contrary to the previous example, we don’t get the same benefit with emplace() and piecewise construct as before.

C++17: try_emplace() and insert_or_assign() as a solution to double lookup

Now we have enough background to understand the rationale behind the introduction of the new methods. try_emplace() and insert_or_assign() differ in their respective functionalities, but they do have something in common—they are both a solution to a redundant search that had to be performed in pre-C++17 days to provide safety or additional information.

insert_or_assign try_emplace map C++

Fig. 3 C++17’s try_emplace() and insert_or_assign()

try_emplace()

try_emplace() is a safer successor of insert() or emplace(). In line with insert() and emplace(), try_emplace() doesn’t modify values for already inserted elements. However, on top of that, it prevents stealing from original arguments that happens both with insert() and emplace() in case of a failed insertion.

This is demonstrated in the snippet below. An element with key 1 is already in the map, as a result p1 won’t be inserted. That doesn’t prevent emplace() from plundering the pointer p:

In the pre C++17 days this issue could have been only solved with a prior lookup, with find().

This lookup is no longer necessary. try_emplace() makes sure that the argument remains untouched in case it wasn’t inserted:

Although this is the primary purpose of try_emplace(), there are some other important advantages. As already mentioned in the previous section, try_emplace() simplifies the original emplace() that had to use pair’s piecewise constructor:

and dispenses with its verbosity in the following way:

At first sight, using try_emplace() in this way might seem rather user unfriendly due to the nonexistent boundary between the key and the value. However, if used in this way, try_emplace() solves another issue of emplace(), namely that objects were created even though they weren’t actually used.

Specifically, the map below already contains the key 1 with value {“Ann”, 63}, thus a {“Ben”, 47} object doesn’t need to be generated, because emplace() doesn’t modify values for already existing keys.

Nonetheless, we shouldn’t  blindly replace all occurrences of emplace() with try_emplace() without adjusting the argument list first. The try_emplace() that uses A’s constructor below generates a redundant object just like its emplace() counterparts:

insert_or_assign()

insert_or_assign() is a “smarter” successor of operator[]. Just like operator[] it modifies values if supplied with a key that is already present in the map. However, unlike operator[], insert_or_assign() doesn’t require default constructibility of the value_type. On top of that, it returns a pair<iterator, bool>. The bool is true when insertion took place and false in case of assignment.

Again, this information was unavailable for operator[] without a prior lookup with the help of find() as demonstrated below. The map already contains an element with the key 1, thus this won’t be an insertion but an update.

The code contains a lot of boilerplate that can result both in errors and performance inefficiencies only for the sole purpose of insert or update identification. Luckily, with insert_or_assign() we can skip all of it and simply write:

Difficulties with inferring from names

At present, it is difficult to conclude whether the new C++17 methods express clearly their intent and functionality. If you have a look at the original proposal, try_emplace() is being referred to as emplace_stable(), while insert_or_assign() is being referred to as emplace_or_update().

At the moment it might seem confusing but with more frequent use we are bound to get it right and hopefully, we will be able to link the new names to the correct functionalities.

Summary

Remember that:

  • insert(), emplace() and try_emplace() don’t overwrite values for existing keys. On the other hand, operator[] and insert_or_assign() overwrite them.
  • emplace() may be susceptible to memory leaks if allocation fails.
  • try_emplace() doesn’t steal from original arguments if insertion fails. This is in contrast to emplace() and insert().
  • try_emplace() doesn’t generate redundant objects in case insertion didn’t take place. This is in contrast to emplace().
  • try_emplace() offers a simplified piecewise construction. On the other hand, emplace() has to use std::piecewise_construct, std::forward_as_tuple.
  • insert_or_assign() doesn’t require default constructibility. On the other hand, operator[] does.
  • insert_or_assign() returns information on whether insertion or assignment took place. This is in contrast to operator[].

The author is grateful to Jonathan Boccara for hosting, formatting and editing the post and Ricardo Nabinger Sanchez for proofreading.

You will also like

Become a Patron!
Share this post! Facebooktwittergoogle_pluslinkedin    Don't want to miss out ? Follow:   twitterlinkedinrss

Get a free ebook on C++ smart pointers

Get a free ebook of more than 50 pages that will teach you the basic, medium and advanced aspects of C++ smart pointers, by subscribing to our mailing list! On top of that, you will also receive regular updates to make your code more expressive.

No spam. You can unsubscribe at any time.