Jonathan Boccara's blog

Make Your Containers Follow the Conventions of the STL

Published April 24, 2018 - 8 Comments

Daily C++

One day I had to do a little refactoring that consisted in renaming a method called getSize() into size(), because I needed to pass its class to generic code that expected a method size(). And what made this refactoring a little special is that this class was used very widely across a pretty large codebase.

This is not something you want to spend time on, is it?

It could have been avoided if the class had been designed from the start with the conventions of the STL in mind, where all containers have a .size() method. This episode of the STL Learning Resource is a reminder of the importance of conventions, in particular those of the STL.

The importance of following conventions

Conventions relieve some of the burden of understanding code

When approaching a particular piece of code, the reader has at least two types of information to parse in order to understand it: its semantics, and the style it is written with.

While as developers we all have our unique styles (ever looked at a piece of code and thought, “that doesn’t look like me”?), some of it can be harmonized across the people working on the same codebase, by using coding conventions.

Those elements of style, shared by all developers on a project, take away a part of the load you need to figure out when reading code.

Conventions range over a wide variety of topics.

They can be as mindless as choosing where to put the opening bracket of a block: at the end of a line:

if (condition) {
    ...
}

or at the beginning of a new line:

if (condition)
{
    ...
}

On this particular example, none seems to be objectively best though. In Code Complete, Steve McConnell mentions one study that “found no statically significant difference between the two as far as understandability is concerned.” He goes on saying, “once you’ve chosen a style, you reap the most benefit from good layout when you apply it consistently.” Hence the idea of having a convention and sticking to it.

But conventions aren’t just about layout, and some are closer to the semantics, like we’ll see in a minute with the STL.

Generic code relies on conventions

If you want your code to be compatible with a piece of template code, you need it to have exactly the names that the template code expects. One such name could be size for example. This is true with the templates of today that perform duck typing, and should remain true even when concepts get into the language.

The name in itself doesn’t matter. What does matter is that both template and client code have the same reference.

Note that this is true even if you don’t use templates too much in your code. You could benefit from generic code that does, such as the STL algorithms, and which could perform fantastic things with your classes if you’d just let them, by following certain conventions.

From the implementer’s point of view

On the other side, when writing generic code, it is useful to think about what conventions our piece of template code needs to be instantiated. This is what concepts are supposed to make explicit, when they get into the language.

To make template code usable by as many clients as possible, we can try to relieve some of the requirements on client code. For instance, we could consider using std::distance(begin(x), end(x)) instead of x.size. Boost Ranges does this, for example.

Or we can even make functions that detect which functionalities client code has and use the ones that it does have.

The conventions of the STL 

When making a container class, following the conventions of the STL containers offers two advantages:

  • they make it easy for a reader used to the STL to understand how to use the class,
  • they allow to re-use generic code operating on containers, including standard algorithms and homemade libraries.

Here are some conventions used by STL containers, and that your container classes should follow.

begin and end

Like we saw with the design of the STL, the most profitable feature to add to our container classes is probably adding begin and end methods to it. This makes our classes compatible with the powerful STL algorithms. You can refer to the article for more details about what those methods should return.

size

This was our motivating example. To illustrate this, let’s consider the std::equal algorithm, which compares the elements of two collections and returns true if there are respectively equal.

Like all STL algorithms, std::equal takes begin and end iterators. To improve it with range semantics and make it accept two collections directly, we can wrap it this way:

template<typename Range1, typename Range2>
bool equal(Range1 const& range1, Range2 const& range2)
{
    return std::equal(begin(range1), end(range1), begin(range2));
}

However before C++14, std::equal is one of the “1.5 ranges” algorithms, which means that it only takes the begin iterator of the second sequence, and not the end. So if the first range is longer than the second one, then the algorithm carries on beyond the bounds of the second range, leading to undefined behaviour.

One way to make sure this doesn’t happen is to check that the two ranges are of the same size. Moreover, if they don’t, then no need to compare elements because we know for sure that we should return false.

So a possible fix could be:

template<typename Range1, typename Range2>
bool equal(Range1 const& range1, Range2 const& range2)
{
    if (range1.size() != range2.size()) return false;

    return std::equal(begin(range1), end(range1), begin(range2));
}

This invokes the method size, which works on all STL containers. To make this version of equal work on your containers too, they’d have to implement a method called size. Not getSize, nor any other name.

Even if on this particular example we could consider making equal rely on other ways to get the size of the ranges (as discussed above), following the convention of the size name makes your code more likely to work with this kind of generic code.

EDIT: as observed by Malcolm in the comments section, note that in C++17 we could use std::size(range1).

push_back

To add a method that inserts an element at the end of your container class, call it push_back. Not pushBack nor add nor even append. Just push_back.

Doing this makes your class compatible with std::back_inserter, which allows to use the container as the output of an algorithm, such as std::transform for instance. Indeed, std::back_inserter binds to a container and calls its push_back method whenever it is send an element:

std::vector<int> numbers = {1, 2, 3, 4, 5};
MyCollection results;
std::transform(begin(numbers), end(numbers), std::back_inserter(results), [](int number) { return number * 2; });

// compiles only if MyCollection has a push_back method

insert

Similarly to the push_back method for using std::back_inserter, std::inserter needs a method named insert and that takes two parameters: the position to insert and the value to insert, in this order.

For sorted containers, it doesn’t make sense to require a position to insert (unless the client code knows it and gives a hint to the container). However std::inserter requires a position to insert regardless. If you need an insert iterator on a sorted container, check sorted_inserter that doesn’t require a position to insert.

clear

All STL containers have a clear method that removes all its elements. This is a convention to follow too, so no removeAll, clean and not even Clear with a capital letter.

erase and remove

How to remove some components in an STL container is a topic rich enough that it deserves its own article.

But about convention, most STL containers have an erase method to remove elements, except std::list and std::forward_list that have a remove method. But those two containers practically never get used anyway.

An integral value in a constructor means size, not capacity

Some STL containers including std::vector have a constructor that takes a size_t parameter. This constructor creates a vector with as many elements constructed by default (on their constructor taking no parameter).

I’ve seen custom containers taking a size_t in their constructor, but that did a different thing, such as allocating a memory buffer to be able to store that many elements without additional allocation. Said differently, this parameter in this class’s constructor had a semantics of a capacity, whereas the one in std::vector has the semantics of a size. Not following this norm creates confusion.

aliases

STL containers have a set of aliases or nested class that allow generic code to retrieve information related to types. This include iterator, value_type, etc.

If you want such generic code to retrieve information from your container as well, then it should have similar aliases, with exactly the same names.

class MyContainer
{
public:
    using value_type = // your value type
    using iterator = // your iterator type
    // ...
};

You reap what you sow…

… so unless you want to reap confusion, stupid refactorings and no compatibility with powerful existing libraries, design your classes by following conventions.

The above are those to follow when designing a container. And please, do let me know if you see one I forgot to include in this list!

You may also like

Don't want to miss out ? Follow:   twitterlinkedinrss
Share this post!Facebooktwitterlinkedin

Comments are closed