Jonathan Boccara's blog

Inheritance Without Pointers

Published January 29, 2021

Inheritance is a useful but controversial technique in C++. There is even a famous talk by Sean Parent called "Inheritance Is the Base Class of Evil". So inheritance is not the most popular feature in the C++ community.

Nevertheless, inheritance is useful, and widely used by C++ developers.

What is the problem with inheritance? It has several, and one of them is that it forces us to manipulate objects through pointers.

To illustrate, consider the following hierarchy of classes:

struct Base
{
    // ...
    virtual ~Base() = default;
};

struct Derived : Base
{
    // ...
};

To return a polymorphic object, a function has to use a (smart) pointer:

std::unique_ptr<Base> create()
{
    return std::make_unique<Derived>();
}

Indeed, if it were to return Base by value, the object would be sliced: only the Base part would be returned, and not the Derived part.
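
To make the slicing concrete, here is a minimal sketch. The name() member function is an assumption added for illustration only, since the content of Base and Derived is elided above:

```cpp
#include <memory>
#include <string>

struct Base
{
    // Hypothetical virtual function, added to observe which type we end up with
    virtual std::string name() const { return "Base"; }
    virtual ~Base() = default;
};

struct Derived : Base
{
    std::string name() const override { return "Derived"; }
};

// Returning by value copies only the Base subobject: the Derived part is sliced off
Base createByValue()
{
    return Derived{};
}

// Returning a (smart) pointer keeps the dynamic type intact
std::unique_ptr<Base> createByPointer()
{
    return std::make_unique<Derived>();
}
```

With this sketch, createByValue().name() yields "Base" whereas createByPointer()->name() yields "Derived".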

And pointers come with their own set of constraints: they have their own semantics, they make objects harder to copy, etc.

The same problem occurs when storing a collection of polymorphic objects in a vector: we have to store pointers instead of values:

std::vector<std::unique_ptr<Base>> collection;

collection.push_back(std::make_unique<Derived>());
collection.push_back(std::make_unique<Derived>());

But when discussing how to use runtime polymorphism without objects and virtual functions, Fluent C++ reader Pavel Novikov shared a technique to use inheritance and virtual functions, without having to use pointers.

This is the most beautiful C++ technique I’ve seen in a long time. Let’s see what it is about.

Motivating example

In order to work on a more fleshed-out example than the few lines of code above, let's take the example of the (simplified) calculators that we used in the article on runtime polymorphism without objects and virtual functions.

The interface of a calculator is this:

struct ICalculator
{
    virtual int compute(int input) const = 0;
    virtual void log(int input, int output) const = 0;
    virtual ~ICalculator() = default;
};

Whether to prefix interface names with I, as in ICalculator, is a hot debate amongst developers. I tend not to use I, but in this case it will come in handy, as you'll see below.

There are two implementations of this interface: BigCalculator, which handles big numbers (greater than 10), and SmallCalculator, which handles small numbers.

Here is BigCalculator:

struct BigCalculator : ICalculator
{
    int compute(int input) const override
    {
        return input * 5;
    }

    void log(int input, int output) const override
    {
        std::cout << "BigCalculator took an input of " << input << " and produced an output of " << output << '\n';
    }
};

And here is SmallCalculator:

struct SmallCalculator : ICalculator
{
    int compute(int input) const override
    {
        return input + 2;
    }

    void log(int input, int output) const override
    {
        std::cout << "SmallCalculator took an input of " << input << " and produced an output of " << output << '\n';
    }
};

Then to have a collection of calculators, we have to use pointers:

std::vector<std::unique_ptr<ICalculator>> calculators;

calculators.push_back(std::make_unique<BigCalculator>());
calculators.push_back(std::make_unique<SmallCalculator>());

And to return a calculator from a function, we also have to use pointers:

std::unique_ptr<ICalculator> createCalculator()
{
    return std::make_unique<BigCalculator>();
}

But there is another way.

Using the value semantics of std::any 

This other way is to store the concrete calculator in a std::any, and to cast it to an ICalculator to access it.

To do that, we introduce another component: Calculator (this is why the I in ICalculator is convenient), which represents a calculator. It is different from ICalculator, the interface of the calculator, which represents what the calculator can do but not the calculator itself.

Here is the implementation of Calculator. We analyse it bit by bit just after:

struct Calculator
{
public:
    // The constraint keeps this constructor out of the way when copying a Calculator
    template<typename ConcreteCalculator,
             typename = std::enable_if_t<!std::is_same_v<std::decay_t<ConcreteCalculator>, Calculator>>>
    Calculator(ConcreteCalculator &&calculator)
    : storage{std::forward<ConcreteCalculator>(calculator)}
    , getter{ [](std::any &storage) -> ICalculator& { return std::any_cast<ConcreteCalculator&>(storage); } }
    {}

    ICalculator *operator->() { return &getter(storage); }

private:
    std::any storage;
    ICalculator& (*getter)(std::any&);
};

Before diving into the implementation, let's see how this is used. To return a calculator from a function:

Calculator createCalculator()
{
    return BigCalculator{};
}

And to have a collection of calculators:

std::vector<Calculator> calculators;

calculators.push_back(BigCalculator{});
calculators.push_back(SmallCalculator{});

The pointers are all gone.
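
For readers who want to try this out, here is a self-contained sketch that puts the pieces together. It makes a few assumptions that are not in the code above: compute returns int throughout, the forwarding constructor is constrained so that it does not compete with the copy constructor, the getter casts to the decayed type so that const lvalues are accepted too, and computeAll is a hypothetical helper written for this illustration:

```cpp
#include <any>
#include <iostream>
#include <type_traits>
#include <utility>
#include <vector>

struct ICalculator
{
    virtual int compute(int input) const = 0;
    virtual void log(int input, int output) const = 0;
    virtual ~ICalculator() = default;
};

struct BigCalculator : ICalculator
{
    int compute(int input) const override { return input * 5; }
    void log(int input, int output) const override
    {
        std::cout << "BigCalculator took an input of " << input << " and produced an output of " << output << '\n';
    }
};

struct SmallCalculator : ICalculator
{
    int compute(int input) const override { return input + 2; }
    void log(int input, int output) const override
    {
        std::cout << "SmallCalculator took an input of " << input << " and produced an output of " << output << '\n';
    }
};

class Calculator
{
public:
    // Assumption: exclude Calculator itself, so that copies use the copy constructor
    template<typename ConcreteCalculator,
             typename = std::enable_if_t<!std::is_same_v<std::decay_t<ConcreteCalculator>, Calculator>>>
    Calculator(ConcreteCalculator&& calculator)
    : storage{std::forward<ConcreteCalculator>(calculator)}
    , getter{ [](std::any& storage) -> ICalculator& { return std::any_cast<std::decay_t<ConcreteCalculator>&>(storage); } }
    {}

    ICalculator* operator->() { return &getter(storage); }

private:
    std::any storage;
    ICalculator& (*getter)(std::any&);
};

// Hypothetical helper: runs every calculator of a collection on the same input
std::vector<int> computeAll(int input)
{
    std::vector<Calculator> calculators;
    calculators.push_back(BigCalculator{});
    calculators.push_back(SmallCalculator{});

    std::vector<int> outputs;
    for (auto& calculator : calculators)
    {
        outputs.push_back(calculator->compute(input));
    }
    return outputs;
}
```

Calling computeAll(42) returns the outputs of both calculators, each one dispatched through its virtual compute, with no pointer appearing in the calling code.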

How this works

To understand how this code works, let's start by looking at the data members of Calculator:

    std::any storage;
    ICalculator& (*getter)(std::any&);

storage is the std::any that contains (or points to, if std::any performs a dynamic allocation) the concrete calculator, for example a BigCalculator. And getter is a function pointer that casts the data contained in the any to the base class ICalculator.

Let’s now see how those members are initialised.

storage is initialised with the incoming concrete calculator:

: storage{std::forward<ConcreteCalculator>(calculator)}

That’s pretty straightforward. The initialisation of getter, on the other hand, is where the beauty is:

, getter{ [](std::any &storage) -> ICalculator& { return std::any_cast<ConcreteCalculator&>(storage); } }

At construction of the Calculator, we know the type of the object: it is BigCalculator, for example. This is compile-time information, as this is the type of the argument we construct Calculator with.

Based on this information, we can create a getter that casts the any to this particular type. And even later, when we want to access the calculator and the BigCalculator we passed at construction is no longer around, the information about its type has remained in the code of getter, which casts the any to a BigCalculator.

How beautiful is that?
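
The mechanism can be isolated in a few lines. The following is only a sketch: Cat, Dog, SoundGetter, makeSoundGetter and soundOf are hypothetical names invented for this illustration. It shows that a captureless lambda written inside a template converts to a plain function pointer whose code still "remembers" the template type:

```cpp
#include <any>
#include <string>

// Hypothetical types, unrelated by inheritance, used only for this illustration
struct Cat { std::string sound() const { return "meow"; } };
struct Dog { std::string sound() const { return "woof"; } };

// A plain function pointer: it carries no state, only code
using SoundGetter = std::string (*)(std::any const&);

// Each instantiation produces a different function, with T baked into its body
template<typename T>
SoundGetter makeSoundGetter()
{
    return [](std::any const& storage) -> std::string
    {
        return std::any_cast<T const&>(storage).sound();
    };
}

// The caller only sees a std::any and a function pointer, never the concrete type
std::string soundOf(std::any const& storage, SoundGetter getter)
{
    return getter(storage);
}
```

Pairing a getter with an any that holds a different type would throw std::bad_any_cast. Calculator cannot run into this, because its constructor sets storage and getter from the same type.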

Isn’t this like a pointer?

To access the calculator, we define an operator-> that returns the ICalculator:

ICalculator *operator->() { return &getter(storage); }

We can then access the methods of the calculator this way:

auto calculator = createCalculator();
output = calculator->compute(42);

But in the end, what is the difference with a pointer? Indeed, with the initial implementation of createCalculator:

std::unique_ptr<ICalculator> createCalculator()
{
    return std::make_unique<BigCalculator>();
}

The calling code would also have looked like that:

auto calculator = createCalculator();
output = calculator->compute(42);

This is the same code!! Is there a point in our new component?

There is a fundamental difference between the two pieces of code. The initial code had pointer semantics. The new code has value semantics.

And value semantics make everything simpler. For example, to copy the calculator and get another instance we can just write:

auto otherCalculator = calculator;

Whereas with pointers, we would have to introduce a polymorphic clone. Also, a pointer can be null, and values can’t.
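
Here is a small sketch of what that copy does. It rests on several assumptions: the interface is trimmed down to compute, ScalingCalculator and copyIsIndependent are hypothetical names introduced for this example, and the forwarding constructor is constrained so that copying a non-const Calculator selects the copy constructor rather than the template:

```cpp
#include <any>
#include <type_traits>
#include <utility>

// Trimmed-down interface for this illustration
struct ICalculator
{
    virtual int compute(int input) const = 0;
    virtual ~ICalculator() = default;
};

// Hypothetical calculator with some state, to make the copy observable
struct ScalingCalculator : ICalculator
{
    explicit ScalingCalculator(int scalingFactor) : factor{scalingFactor} {}
    int compute(int input) const override { return input * factor; }
    int factor;
};

class Calculator
{
public:
    // Assumption: exclude Calculator itself, so that copies use the copy constructor
    template<typename ConcreteCalculator,
             typename = std::enable_if_t<!std::is_same_v<std::decay_t<ConcreteCalculator>, Calculator>>>
    Calculator(ConcreteCalculator&& calculator)
    : storage{std::forward<ConcreteCalculator>(calculator)}
    , getter{ [](std::any& storage) -> ICalculator& { return std::any_cast<std::decay_t<ConcreteCalculator>&>(storage); } }
    {}

    ICalculator* operator->() { return &getter(storage); }

private:
    std::any storage;
    ICalculator& (*getter)(std::any&);
};

bool copyIsIndependent()
{
    Calculator original = ScalingCalculator{3};
    Calculator copy = original; // copies the std::any, and with it the ScalingCalculator

    // Both compute the same thing...
    bool const sameBehaviour = original->compute(14) == copy->compute(14);
    // ...but they are two distinct objects at two distinct addresses
    bool const distinctObjects = original.operator->() != copy.operator->();
    return sameBehaviour && distinctObjects;
}
```

Copying the std::any copies the concrete calculator it holds, and the getter pointer is copied alongside it, so no clone function is needed in the hierarchy.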

It is interesting to note that in modern C++, -> does not always mean "pointer". For example std::optional, which has value semantics, also provides an operator-> to access its underlying data.

Also, pointers typically require allocating memory on the heap. With std::any, this can be avoided in some cases: the C++ standard encourages library implementers to apply a small object optimisation in std::any, which means that for small objects std::any could store them itself and avoid any heap allocation. But this is not guaranteed by the standard, and there is no standard threshold below which this is likely to happen.

Making the component generic

There is nothing specific to calculators in the technique we have seen. We can use it for any hierarchy of classes using inheritance.

We can replace all the terms in the code of Calculator with generic terms:

  • ICalculator is the Interface
  • Calculator is an Implementation
  • ConcreteCalculator is the ConcreteType
  • calculator is the object passed

This gives us this generic code:

template<typename Interface>
struct Implementation
{
public:
  // The constraint keeps this constructor out of the way when copying an Implementation
  template<typename ConcreteType,
           typename = std::enable_if_t<!std::is_same_v<std::decay_t<ConcreteType>, Implementation>>>
  Implementation(ConcreteType&& object)
  : storage{std::forward<ConcreteType>(object)}
  , getter{ [](std::any &storage) -> Interface& { return std::any_cast<ConcreteType&>(storage); } }
  {}

  Interface *operator->() { return &getter(storage); }

private:
  std::any storage;
  Interface& (*getter)(std::any&);
};

We can reuse that code with other classes. If we were to use it with the calculators hierarchy, we would write this:

using Calculator = Implementation<ICalculator>;

and use Calculator as in the code of this article.

The above line sums it all up: Calculator represents an implementation of the ICalculator interface. But it’s not a pointer, it is an object.
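
To illustrate the reuse with another hierarchy, here is a sketch with shapes. IShape, Square, Rectangle, Shape and totalArea are hypothetical names made up for this example, and the Implementation template is repeated (with a constrained constructor and a cast to the decayed type, both assumptions of this sketch) so that the snippet compiles on its own:

```cpp
#include <any>
#include <type_traits>
#include <utility>
#include <vector>

template<typename Interface>
class Implementation
{
public:
    // Assumption: exclude Implementation itself, so that copies use the copy constructor
    template<typename ConcreteType,
             typename = std::enable_if_t<!std::is_same_v<std::decay_t<ConcreteType>, Implementation>>>
    Implementation(ConcreteType&& object)
    : storage{std::forward<ConcreteType>(object)}
    , getter{ [](std::any& storage) -> Interface& { return std::any_cast<std::decay_t<ConcreteType>&>(storage); } }
    {}

    Interface* operator->() { return &getter(storage); }

private:
    std::any storage;
    Interface& (*getter)(std::any&);
};

// Hypothetical hierarchy, unrelated to the calculators
struct IShape
{
    virtual int area() const = 0;
    virtual ~IShape() = default;
};

struct Square : IShape
{
    explicit Square(int sideLength) : side{sideLength} {}
    int area() const override { return side * side; }
    int side;
};

struct Rectangle : IShape
{
    Rectangle(int rectWidth, int rectHeight) : width{rectWidth}, height{rectHeight} {}
    int area() const override { return width * height; }
    int width;
    int height;
};

using Shape = Implementation<IShape>;

// Sums the areas of a collection of shapes stored by value
int totalArea()
{
    std::vector<Shape> shapes;
    shapes.push_back(Square{4});
    shapes.push_back(Rectangle{2, 3});

    int total = 0;
    for (auto& shape : shapes)
    {
        total += shape->area();
    }
    return total;
}
```

Nothing in Implementation had to change to support the new hierarchy: only the using declaration names the interface.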
