Jonathan Boccara's blog

A Minimal Interface: Both Expressive And Fast Code

Published December 5, 2017 - 0 Comments

Minimal interface

Have you ever used std::inserter to insert the outputs of an STL algorithm into a sorted container such as an std::set?

And if you have, weren’t you annoyed by how its interface forces you to specify the position to insert the elements in the set?

I find this very annoying, because most of the time we have no idea where they should go inside the set at the time we write code. We don’t even know their values in advance. That’s the set‘s job to figure out where to put the new elements and keep a sorted order.

So we end up sticking the begin or the end of the set as an argument to std::inserter, and this useless piece of information sits like an uninvited guest in the middle of the elegant STL party:

std::vector<int> v = {1, 3, -4, 2, 7, 10, 8};
std::set<int> results;
 
std::copy(begin(v), end(v), std::inserter(results, end(results)));

We’ve previously come across sorted_inserter, that does the same thing as std::inserter except you don’t have to specify where to insert the elements. You can specify it, if you happen to know, and it will save time to the set instead of searching its location for you in this case. But otherwise the set takes care of it (just like when we call its .insert method):

std::vector<int> v = {1, 3, -4, 2, 7, 10, 8};
std::set<int> results;
 
std::copy(begin(v), end(v), sorted_inserter(results));

By removing the call to the end iterator, sorted_inserter makes for more direct code. But does this have an impact on performance? The point of this post is to compare the performance of sorted_inserter with the standard std::inserter.

For the sake of the example we’ll use std::copy because it is the simplest STL algorithm, but sorted_inserter could be used with other algorithms too. And as Reddit user FbF_ noted, in particular this doesn’t mean that we should use std::copy to add data to a container, as there are better ways of inserting several elements into an STL container efficiently.

Measure, measure, measure… fine let’s do it!

For this benchmark I’ll use Fred Tingaud’s increasingly popular tool, Quick-Bench.

The test case we use here is this:

  1. construct a vector<int> containing 100 values randomly generated between -100 and +100,
  2. copy the contents of this vector into a set<int>, by using std::copy and std::inserter(results, end(results))
  3. repeat 2) a large number of times, and measure the average time
  4. divide it by the time taken by an empty benchmark, in order to have a no-op reference

These are the results in blue below.

Maybe passing begin(results) is better than end(results)? I’ve thrown in a new test case (it’s very easy to do with quick-bench) to measure this. These are the results in pink below.

Finally, I’ve included a test case that uses sorted_inserter instead of std::inserter, represented by the results in yellow below.

Here are the visual results:

inserter performance

This results allow us to interpret two things:

  • if you’re unsure what to put as a location in std::inserter, begin and end seem equivalent in terms of performance,
  • sorted_inserter is faster than std::inserter. The above show a performance increase of 44%. This benchmark was done in O3 (for the other levels of optimisation the increase was closer to 20%).

Here is the quick-bench run for this test if you’d like to play around with it.

A minimal interface

Why does sorted_inserter outperform the STL? It’s certainly not coming from a more efficient implementation, because the STL one is surely much better implemented.

I think the problem of std::inserter is its interface: it’s doing too many things at the same time.

Indeed, it makes sense to specify a position for a vector, because it can’t find it on its own. So std::inserter‘s interface does make sense for vector. But it’s also trying to work for a set. It’s trying to fit all containers at the same time.

And std::inserter is sending the set on the wrong track, by consistently providing a hint that is not the right one. That’s more work for the set than not giving a hint at all, because the set tries out the hint before realizing it was wrong, and then it still has to insert the element.

sorted_inserter rather provides a minimal interface (just a container, no position), but it is specific to sorted containers, and doesn’t make sense on vectors. And it also provides the more elaborate interface that lets its user provide a hint, even if it is a less common use case.

I think a lesson to draw out of this analysis is that it is useful to provide at least one minimal interface, that perfectly satisfies the most basic need. Here this interface would consist in inserting into a sorted container without preliminary information about the final location of the inserted components. This is particularly important if this need occurs often, as it is the case withstd::inserter on std::set.

This way, we’ll have better chances of designing interfaces allowing both expressive and fast code.

Don't want to miss out ? Follow:   twitterlinkedinrss
Share this post!Facebooktwitterlinkedin