Strong Types for Safe Indexing in Collections – Part 1

Published October 31, 2021

Strong types make code safer and more expressive by using the type system to identify individual objects.

For example, to instantiate a class Rectangle with a certain width and height, we could write this:

Rectangle myRectangle{4, 5};

But then it isn’t clear for a reader of the code which of the two parameters is the width and which one is the height. Which one is 4? Which one is 5?

This makes code difficult to understand, and difficult to get right too. Indeed, swapping the parameters by mistake is a common source of bugs.

An alternative is to introduce new types, Width and Height, and make the constructor accept them instead of primitive types:

Rectangle myRectangle{Width{4}, Height{5}};

This makes the code much more expressive and safer.

Strong typing is a very rich topic (you can find dozens of articles on strong types on Fluent C++) and helps make code more expressive in many ways.

Let’s focus on one of those ways: using strong types for safe indexing in collections.

Using the right index

The need for “strong indexing” came from an issue raised in the NamedType library (an implementation of strong types for C++): how can we use strong types to make sure to use the correct index when working with several collections?

Let’s use std::vector to represent the collections here. We have two vectors:

std::vector<int> foos = {1, 2, 3};
std::vector<int> bars = {10, 20};

And we’d like to have an index for each vector, that can only be used for that vector. This way, we make sure not to use an index with the wrong vector.

Let’s create two separate indices by using strong types:

using FooIndex = NamedType<size_t, struct FooTag, PreIncrementable, Comparable>;
using BarIndex = NamedType<size_t, struct BarTag, PreIncrementable, Comparable>;

Those are two different types wrapping a size_t and that can be incremented and compared.

Then we’d like this code to compile:

for (FooIndex fooIndex = FooIndex{0}; fooIndex < FooIndex{foos.size()}; ++fooIndex)
{
    for (BarIndex barIndex = BarIndex{0}; barIndex < BarIndex{bars.size()}; ++barIndex)
    {
        std::cout << foos[fooIndex] << '-' << bars[barIndex] << '\n'; // ok, correct indices
    }
}

And we would like the following code to fail to compile:

for (FooIndex fooIndex = FooIndex{0}; fooIndex < FooIndex{foos.size()}; ++fooIndex)
{
    for (BarIndex barIndex = BarIndex{0}; barIndex < BarIndex{bars.size()}; ++barIndex)
    {
        std::cout << foos[barIndex] << '-' << bars[fooIndex] << '\n'; // oops, wrong indices!
    }
}

How do we do this?

Unless we change the code of the implementation of a standard library, we can’t write the exact above pieces of code. Indeed, std::vector‘s operator[] doesn’t take a FooIndex or a BarIndex, to begin with.

But we can adapt the code a little to make it valid. We’ll see two different ways:

introducing a strongly indexed vector (this post),
creating a strongly indexed reference an a normal std::vector (the next post).

A strongly indexed vector

What prevents us from writing the above code is that std::vector doesn’t have the interface we need: it doesn’t accept FooIndex and BarIndex. Let’s not use vector then, but introduce a new container instead!

On the other hand, it would be a shame to give up on everything vector provides, and code it up from scratch ourselves, just for the purpose of tweaking the operator[].

It would be great to reuse std::vector for everything except operator[].

There are at least three ways to do that: public inheritance, private inheritance and composition. Let’s start with public inheritance, that require the least code to write.

Public inheritance

To reuse all the interface of std::vector, we can inherit from it. Here is the code, we’ll explain it bit by bit just after:

template<typename T, typename Index>
class StrongIndexVector : public std::vector<T>
{
public:
    StrongIndexVector() = default;
    explicit StrongIndexVector(typename std::vector<T>::size_type count, const T& value = T()) : std::vector<T>(count, value) {}
    template< class InputIt >
    StrongIndexVector(InputIt first, InputIt last) : std::vector<T>(first, last) {}
    StrongIndexVector(std::initializer_list<T> init) : std::vector<T>(std::move(init)) {}

    typename std::vector<T>::reference operator[]( Index pos )
    {
        return std::vector<T>::operator[](pos.get());
    }

    typename std::vector<T>::const_reference operator[]( Index pos ) const
    {
        return std::vector<T>::operator[](pos.get());
    }
};

Let’s start with the first line:

template<typename T, typename Index>

Like std::vector, our class can store values of any type T. It also has a specific Index type, that would be in our initial example FooIndex or BarIndex.

Let’s skip to the end of the class:

    typename std::vector<T>::reference operator[]( Index pos )
    {
        return std::vector<T>::operator[](pos.get());
    }

    typename std::vector<T>::const_reference operator[]( Index pos ) const
    {
        return std::vector<T>::operator[](pos.get());
    }
};

We use this index to achieve our purpose and have an operator[] that only works with the specific index. This operator[] hides the one of the base class std::vector (read Item 33 of Effective C++ to learn more about this mechanism).

The rest of the code allows to reuse everything else from std::vector:

class StrongIndexVector : public std::vector<T>
{
public:
    StrongIndexVector() = default;
    explicit StrongIndexVector(typename std::vector<T>::size_type count, const T& value = T()) : std::vector<T>(count, value) {}
    template< class InputIt >
    StrongIndexVector(InputIt first, InputIt last) : std::vector<T>(first, last) {}
    StrongIndexVector(std::initializer_list<T> init) : std::vector<T>(std::move(init)) {}

The call site then looks like this:

using FooIndex = fluent::NamedType<size_t, struct FooTag, fluent::PreIncrementable, fluent::Comparable>;
using BarIndex = fluent::NamedType<size_t, struct BarTag, fluent::PreIncrementable, fluent::Comparable>;

StrongIndexVector<int, FooIndex> foos = {1, 2, 3};
StrongIndexVector<int, BarIndex> bars = {10, 20};

for (FooIndex fooIndex = FooIndex{0}; fooIndex < FooIndex{foos.size()}; ++fooIndex)
{
    for (BarIndex barIndex = BarIndex{0}; barIndex < BarIndex{bars.size()}; ++barIndex)
    {
        std::cout << foos[fooIndex] << '-' << bars[barIndex] << '\n';
    }
}

The first two lines create two strong types over a size_t, in order to have two different types of indices.

Although using public inheritance works here, it’s arguably not the optimal solution, because it has several drawbacks. If a StrongIndexVector is (implicitly) cast into a std::vector, then the native operator[] of std::vector is available again and we’re back to square one.

Also, this is less likely to happen but if a StrongIndexVector is dynamically allocated, then deleted through a pointer to its base class std::vector, then we get to undefined behaviour.

Advantages:

Little code

Drawbacks:

Not ideal when cast to base class

Let’s explore the alternative of private inheritance then.

Private inheritance

As Federico demonstrates in his post on restricting interfaces, private inheritance provides an interesting trade-off to reuse code in an expressive way.

By default, private inheritance doesn’t expose anything from the interface of the base class. We have to add back whatever we want to reuse from the base class with using declarations. In our case, we want to reuse everything except operator[]. And then we write our own operator[](highlighted):

template<typename T, typename Index>
class StrongIndexVector : private std::vector<T>
{
public:
    StrongIndexVector() = default;
    explicit StrongIndexVector(typename std::vector<T>::size_type count, const T& value = T()) : std::vector<T>(count, value) {}
    template< class InputIt >
    StrongIndexVector(InputIt first, InputIt last) : std::vector<T>(first, last) {}
    StrongIndexVector(std::initializer_list<T> init) : std::vector<T>(std::move(init)) {}
    StrongIndexVector(StrongIndexVector const& other) = default;
    StrongIndexVector(StrongIndexVector&& other) = default;

    typename std::vector<T>::reference operator[]( Index pos )
    {
        return std::vector<T>::operator[](pos.get());
    }

    typename std::vector<T>::const_reference operator[]( Index pos ) const
    {
        return std::vector<T>::operator[](pos.get());
    }

    using typename std::vector<T>::value_type;
    using typename std::vector<T>::allocator_type;
    using typename std::vector<T>::size_type;
    using typename std::vector<T>::difference_type;
    using typename std::vector<T>::reference;
    using typename std::vector<T>::const_reference;
    using typename std::vector<T>::pointer;
    using typename std::vector<T>::const_pointer;
    using typename std::vector<T>::iterator;
    using typename std::vector<T>::const_iterator;
    using typename std::vector<T>::reverse_iterator;
    using typename std::vector<T>::const_reverse_iterator;

    StrongIndexVector& operator=(StrongIndexVector const& other) = default;
    StrongIndexVector& operator=(StrongIndexVector&& other) = default;
    using std::vector<T>::operator=;

    using std::vector<T>::assign;
    using std::vector<T>::get_allocator;
    using std::vector<T>::at;
    using std::vector<T>::front;
    using std::vector<T>::back;
    using std::vector<T>::data;
    using std::vector<T>::begin;
    using std::vector<T>::cbegin;
    using std::vector<T>::end;
    using std::vector<T>::cend;
    using std::vector<T>::rbegin;
    using std::vector<T>::crbegin;
    using std::vector<T>::rend;
    using std::vector<T>::crend;
    using std::vector<T>::empty;
    using std::vector<T>::size;
    using std::vector<T>::max_size;
    using std::vector<T>::reserve;
    using std::vector<T>::capacity;
    using std::vector<T>::shrink_to_fit;
    using std::vector<T>::clear;
    using std::vector<T>::insert;
    using std::vector<T>::emplace;
    using std::vector<T>::erase;
    using std::vector<T>::push_back;
    using std::vector<T>::emplace_back;
    using std::vector<T>::pop_back;
    using std::vector<T>::resize;
    using std::vector<T>::swap;
};

This can be a little unsettling as private inheritance is not so common in production code. But I don’t think this is a real drawback, since as we saw in the The Common Vocabulary of Software Developers, we should level up to the standard coding techniques, and not the other way round.

Advantages:

Not castable to base class

Drawbacks:

A bit long to write (but feel free to copy-paste!)

Composition

Composition is the solution that is commonly seen as the most reasonable, because it doesn’t use inheritance and inheritance is generally frowned upon in design when it’s not absolutely necessary.

Composition consists in storing a std::vector as a data member of StrongIndexVector, and wrap each function of its interface. For example, for push_back, we’d write:

template<typename T, typename Index>
class StrongIndexVector
{
public:

    // ...

    void push_back(T const& value)
    {
        vector_.push_back(value);
    }

    void push_back(T&& value)
    {
        vector_.push_back(std::move(value));
    }

    // ...
    
private:
    std::vector<T> vector_;
};

And we would also write our own version of operator[] as in the previous code using inheritance.

This represents loads of code, and I think it brings little more than private inheritance.

Advantages:

More conventional

Disadvantages:

Loads of code

A strongly indexed reference

So far we’ve seen how to design a container with a special operator[]. But there is another approach: using a proxy on a regular std::vector, and implement our operator[] on the proxy.

We’ve seen a lot today, and we’ll keep this for the next post. In the meantime I suggest that you implement that proxy idea on your own, because it’s a good C++ exercise. Don’t forget that the incoming vector could be const or not const, and that it can be an lvalue or an rvalue!

More on that in the next article. Stay tuned!