The “Extract Interface” refactoring, at compile time

Published April 28, 2017

We haven’t talked much about refactoring on Fluent C++ so far, but it is a topic closely related to code expressiveness. Indeed, most of the time we don’t start working on production code from scratch; rather, we work on an existing code base. And injecting expressiveness into that code comes through refactoring.

To make a long story short, refactoring goes with tests, and tests go with breaking dependencies.

Indeed, having unit tests that cover the code being refactored lets us be bold in refactoring while keeping a certain level of safety. And to test a portion of code, that code has to be relatively independent from the rest of the application, particularly from the parts that really don’t play well with tests, such as the UI and the database.

The “Extract Interface” technique is a classic way to break dependencies. It can be found in any good book about refactoring, such as Working Effectively with Legacy Code by Michael Feathers.

My purpose here is to propose a way to perform Extract Interface that is idiomatic in C++. Indeed, even though they are legal C++, I find that typical implementations are translated directly from Java, and I think we can change them to fit C++ much better.

Extract Interface

Let’s start with a quick description of what Extract Interface is and what problem it aims to solve. If you are already familiar with it, you can safely skip to the next section.

One of the situations where Extract Interface comes in handy is breaking a dependency related to an argument passed to a function or a method.

For example, here is a class we would like to get into a unit test:

// In a .h file

class ClassToBeTested
{
public:
    void f(Argument const& arg);
};

Here is what Argument can do:

#include <iostream>

class Argument
{
public:
    void whoIsThis() const
    {
        std::cout << "This is Argument\n";
    }
    // more methods...
};

and the above method f uses it in its body:

// In a .cpp file

void ClassToBeTested::f(Argument const& arg)
{
    arg.whoIsThis();
}

Let’s imagine that, like some real classes, ClassToBeTested resists being put into a test harness, because building an object of type Argument is, say, terribly complicated: it depends on too many other things.

We can then create a new type, TestArgument. It offers the same interface as Argument, so that our ClassToBeTested can use it, but it has a simplified implementation, containing just enough for the purpose of conducting the test.

To materialize this interface we can create an IArgument class, from which both Argument and TestArgument would derive:

(Figure: class diagram of Extract Interface, with Argument and TestArgument both deriving from IArgument.)
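In code, that hierarchy might look like the following sketch. The body of TestArgument (including its call counter), the helper callThroughInterface and the check function are assumptions made for illustration; they are not part of the original classes.

```cpp
#include <cassert>
#include <iostream>

// Extracted interface: only what ClassToBeTested needs from its argument.
class IArgument
{
public:
    virtual ~IArgument() = default;
    virtual void whoIsThis() const = 0;
};

// Production class; imagine that it is hard to construct in a test.
class Argument : public IArgument
{
public:
    void whoIsThis() const override
    {
        std::cout << "This is Argument\n";
    }
};

// Test double (hypothetical implementation): trivial to build, counts calls.
class TestArgument : public IArgument
{
public:
    void whoIsThis() const override
    {
        ++calls_;
        std::cout << "This is TestArgument\n";
    }
    int calls() const { return calls_; }

private:
    mutable int calls_ = 0;
};

// Hypothetical helper: code written against IArgument accepts both classes.
inline void callThroughInterface(IArgument const& arg)
{
    arg.whoIsThis(); // resolved at runtime through the virtual table
}

// Hypothetical check, as a test harness might run it.
inline void checkInterfaceDispatch()
{
    TestArgument testArg;
    callThroughInterface(testArg);
    assert(testArg.calls() == 1);
}
```

The counter is just one simple way for a test to observe that the argument was used; a real test double would expose whatever the test needs to verify.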

The interface of ClassToBeTested becomes:

// In a .h file

class ClassToBeTested
{
public:
    void f(IArgument const& arg);
};

And f can be passed an Argument coming from production code, or a TestArgument coming from the test harness. This is the result of Extract Interface.

Pay only for what you need

The above implementation of Extract Interface works very well in languages such as Java and C#, because inheriting from interfaces with runtime polymorphism is so ubiquitous that these languages do an excellent job optimizing these constructs.

But this is not the case in C++, where this construct is much less idiomatic.

First off, there is a technical consideration: the above implementation adds runtime polymorphism, which has a cost:

it adds an indirection at each call to the interface, to redirect the execution to the code of the correct derived class,

it makes the objects bigger, typically by adding a pointer to the virtual table (the “virtual pointer”) to each one, to support this indirection.
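The size overhead can be observed with a small sketch like the one below. PlainArgument, VirtualArgument and printSizes are hypothetical names introduced only for this illustration, and the exact sizes printed depend on the compiler and platform.

```cpp
#include <iostream>

// Hypothetical class without any virtual function.
struct PlainArgument
{
    void whoIsThis() const {}
    int value = 0;
};

// Same data, but with virtual functions: implementations typically
// store a hidden pointer to the virtual table in each object.
struct VirtualArgument
{
    virtual ~VirtualArgument() = default;
    virtual void whoIsThis() const {}
    int value = 0;
};

// Prints both sizes; the values are implementation-defined.
inline void printSizes()
{
    std::cout << "sizeof(PlainArgument)   = " << sizeof(PlainArgument) << '\n'
              << "sizeof(VirtualArgument) = " << sizeof(VirtualArgument) << '\n';
}
```

On mainstream implementations the second size comes out larger than the first, which is the per-object cost mentioned above.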

But even if this can be problematic in performance-sensitive parts of the code, this cost may be negligible in many situations.

The real problem here is about design: we don’t need runtime polymorphism here. We know when we’re in production code or in test code when invoking the class to be tested, and we know this at the moment of writing code. So why wait until the last moment at runtime to do this check and redirect to the right argument?

We do need polymorphism though, because we want two possible classes to be used in the same context. But what we need is compile-time polymorphism. And this can be achieved with templates.

Extract “compile-time” Interface

Templates offer a polymorphism of sorts: template code can use any type in a given context, provided that the generated code compiles. This defines an interface, even if it is not stated as explicitly as in runtime polymorphism with inheritance and virtual functions (concepts will make template interfaces more explicit, when they make it into the language).

Here is how Extract Interface can be implemented with templates:

// In a .h file

class ClassToBeTested
{
public:
    template<typename TArgument>
    void f(TArgument const& arg)
    {
        arg.whoIsThis();
    }
};

Then you can pass either an Argument or a TestArgument to the method f, and they no longer need to inherit from IArgument. No more runtime polymorphism, virtual pointers, or indirections.

However, the template code has to be visible at the point where it is instantiated. So it is generally put into the header file, mixing the declaration and the implementation of the method.

“We don’t want that!”, I hear you say, indignant. “We don’t want to show the internals of the method to everyone, thus breaking encapsulation and really increasing compilation dependencies!”
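As a minimal sketch of this, the two argument classes below are unrelated and have no virtual functions. The body of TestArgument, its call counter and the check function are assumptions for illustration only.

```cpp
#include <cassert>
#include <iostream>

// Production class: no base class, no virtual functions.
class Argument
{
public:
    void whoIsThis() const { std::cout << "This is Argument\n"; }
};

// Test double (hypothetical implementation), unrelated to Argument.
class TestArgument
{
public:
    void whoIsThis() const
    {
        ++calls_;
        std::cout << "This is TestArgument\n";
    }
    int calls() const { return calls_; }

private:
    mutable int calls_ = 0;
};

class ClassToBeTested
{
public:
    // Any type offering a const whoIsThis() satisfies the implicit interface.
    template<typename TArgument>
    void f(TArgument const& arg)
    {
        arg.whoIsThis(); // resolved at compile time, no virtual dispatch
    }
};

// Hypothetical check, as a test harness might run it.
inline void checkFWithBothTypes()
{
    ClassToBeTested tested;
    tested.f(Argument{});   // production type
    TestArgument testArg;
    tested.f(testArg);      // test type
    assert(testArg.calls() == 1);
}
```

Passing a type that lacks whoIsThis() would fail to compile, which is how the implicit interface gets enforced.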

But template code forces us to do this… or does it?

Explicit instantiation

C++ holds a discreet feature related to templates: explicit instantiation. It consists of declaring an instantiation of a template for a particular type, which can be done in a .cpp file.

// In the .cpp file

// The parameter type must match the template's signature (TArgument const&).
template void ClassToBeTested::f(Argument const&);
template void ClassToBeTested::f(TestArgument const&);

When the compiler sees this, it instantiates the template with that type, generating all the corresponding code in the .cpp file. (If you have heard of the “export” keyword, this has nothing to do with it. If you haven’t… then good for you 🙂) The implementation of the method then no longer needs to be in the header file, because only the explicit instantiation needs to see it.

At this point we may wonder why all template classes don’t use this formidable feature. The answer is that we would need an explicit instantiation for each of the types the template can be instantiated with, if we really want to keep the implementation in the .cpp file. So for std::vector, for example, this feature is of no use.

But in our case, we know each of the possible instantiations, and there are just two of them: Argument and TestArgument. This was actually the whole purpose of the operation!

To sum up where we are now, here is what the header and the implementation files look like:

In the .h file:

class ClassToBeTested
{
public:
    template <typename TArgument>
    void f(TArgument const& arg);
};

In the .cpp file:

#include "ClassToBeTested.h"
#include "Argument.h"
#include "TestArgument.h"

template<typename TArgument>
void ClassToBeTested::f(TArgument const& arg)
{
    arg.whoIsThis();
}

// Explicit instantiations; they must come after the definition of f.
template void ClassToBeTested::f(Argument const&);
template void ClassToBeTested::f(TestArgument const&);

Now we can still construct a TestArgument in the test harness without paying for runtime polymorphism, and without exposing the implementation of the method in the header.

There is one more problem left to tackle: the above example #includes the "Argument.h" header. And this header may itself contain dependencies on complicated things that the test harness will have a hard time linking against. It would be nice to somehow avoid #including "Argument.h" in the context of the test harness.
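To try the whole arrangement in one go, here is a sketch that collapses the files into a single translation unit, with comments marking where each part would live. The bodies of Argument and TestArgument, the call counter and the check function are assumptions for illustration. Note that each explicit instantiation spells out the parameter type of the template, TArgument const&.

```cpp
#include <cassert>
#include <iostream>

// --- would live in Argument.h ---
class Argument
{
public:
    void whoIsThis() const { std::cout << "This is Argument\n"; }
};

// --- would live in TestArgument.h (hypothetical implementation) ---
class TestArgument
{
public:
    void whoIsThis() const { ++calls_; }
    int calls() const { return calls_; }

private:
    mutable int calls_ = 0;
};

// --- would live in ClassToBeTested.h: declaration only ---
class ClassToBeTested
{
public:
    template<typename TArgument>
    void f(TArgument const& arg);
};

// --- would live in ClassToBeTested.cpp: definition of the template ---
template<typename TArgument>
void ClassToBeTested::f(TArgument const& arg)
{
    arg.whoIsThis();
}

// Explicit instantiations, placed after the definition so that the
// compiler can generate the code for exactly these two types.
template void ClassToBeTested::f(Argument const&);
template void ClassToBeTested::f<TestArgument>(TestArgument const&);

// --- would live in the test code, which only sees the headers ---
inline void checkExplicitlyInstantiatedF()
{
    ClassToBeTested tested;
    TestArgument testArg;
    tested.f(testArg);
    assert(testArg.calls() == 1);
}
```

The two instantiation lines show the two accepted spellings: the template argument can be deduced from the parameter type, or written out explicitly. In the real multi-file layout, calling f with any other type would compile against the header but fail at link time, since no code was generated for it.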

The Chinese wall between explicit instantiations

This solution was found by my colleague Romain Seguin.

When you think about it, the only thing we need to include Argument for is the template instantiation. The idea then is to take the explicit instantiations and the include directives out into separate files.

In the production binary:

// file ClassToBeTested.templ.cpp

#include "Argument.h"
#include "ClassToBeTested.cpp"

template void ClassToBeTested::f(Argument const&);

And in the test binary:

// file ClassToBeTestedTest.templ.cpp

#include "TestArgument.h"
#include "ClassToBeTested.cpp"

template void ClassToBeTested::f(TestArgument const&);

And the initial implementation file is reduced to:

// file ClassToBeTested.cpp

#include "ClassToBeTested.h"

template<typename TArgument>
void ClassToBeTested::f(TArgument const& arg)
{
    arg.whoIsThis();
}

This way, the test binary does not have to link against anything coming from the header of the Argument production class.

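To summarize the file inclusions: ClassToBeTested.templ.cpp includes Argument.h and ClassToBeTested.cpp, ClassToBeTestedTest.templ.cpp includes TestArgument.h and ClassToBeTested.cpp, and ClassToBeTested.cpp includes ClassToBeTested.h.

(Note that the proposed extension for these files (.templ.cpp) is subject to debate. Maybe we should use “.cpp” for them, and rather “.templ.hpp” for the implementation of the template method, which was Romain’s opinion.)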

Now over to you

What do you think about this proposed way of performing an Extract Interface in C++? I haven’t found it described anywhere, so it might be either innovative or so wrong that no one cared to talk about it before.

In any case, your impressions of this would be very welcome. It’s crazy how much a group’s questions and thoughts can improve the quality of an idea, so please lads (and ladies!), knock yourselves out.


About Jonathan Boccara