Jonathan Boccara's blog

How to Use Tag Dispatching In Your Code Effectively

Published April 27, 2018 - 7 Comments

Daily C++

Constructors lack something that the rest of the functions and methods have in C++: a name.

Indeed, look at the following code:

class MyClass
{
public:
    MyClass();
    void doThis();
    void doThat();
};

void doSomethingElse(MyClass const& x);

Every routine has a name that says what it does, except for the constructor, which only bears the name of its class.

There is some logic in this though: it’s a constructor, so its job is to… construct the class. And if it had a name it would be something like constructMyClass, so what’s the point? Let’s just call it MyClass and give it a constructor syntax. Fine.

Except this becomes a problem when we need several ways to construct the class: constructMyClassThisWay and constructMyClassThatWay. To remedy that, constructors can be overloaded:

class MyClass
{
public:
    MyClass();
    MyClass(int i);
    MyClass(std::string s);
};

Which is good, but sometimes not enough. Indeed, sometimes we need several ways to construct a class with the same types of parameters. The simplest example of that is default construction, that is, a constructor taking no parameters, to which we want to give different behaviours.

The thing I want you to see here is that different overloads allow several constructors taking different types of data. But there is no native way to have several constructors taking the same types of data, but with different behaviours.

One way to go about this and to keep code expressive is to use tag dispatching. This is the topic of today: how to use tag dispatching in your code and, just as importantly, when to use it and when to avoid it. In the opinion of yours truly, that is.

How tag dispatching works

If you are already familiar with tag dispatching you can safely skip over to the next section.

The “tag” in tag dispatching refers to a type that has no behaviour and no data:

struct MyTag {};

The point of this is that, by creating several tags (so several types), we can use them to route the execution through various overloads of a function.

The STL uses this technique extensively in algorithms that behave differently based on the capabilities of the iterator type of the ranges they are passed. For instance, consider the function std::advance, which takes an iterator and moves it forward by a given number of steps:

std::vector<int> v = { 1, 2, 3, 4, 5 };
auto it = v.begin(); // it points to the 1st element of v
std::advance(it, 3); // it now points to the 4th element of v

If the underlying iterator of the collection is a forward iterator then std::advance applies ++ on it 3 times, whereas if it is a random-access iterator (as is the case for std::vector), it calls += 3 on it. Even if you’re not familiar with this, the bottom line is that std::advance can behave differently depending on a property of its iterator.

To implement that, the STL typically uses tag dispatching: the iterator provides a tag (how it provides it is outside of the scope of this article): forward_iterator_tag for forward iterators, and random_access_iterator_tag for random access iterators. The implementation of std::advance could then use something like:

template <typename Iterator, typename Distance>
void advance_impl(Iterator& it, Distance n, forward_iterator_tag)
{
    while (--n >= 0)
        ++it;
}

template <typename Iterator, typename Distance>
void advance_impl(Iterator& it, Distance n, random_access_iterator_tag)
{
    it += n;
}

and call advance_impl by instantiating the correct tag depending on the capabilities of the iterator. Function overloading then routes the execution to the right implementation.
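To make the routing concrete, here is a minimal sketch of the whole mechanism, not the actual code of any standard library. It assumes the tag is read from std::iterator_traits, and it lives in a hypothetical namespace sketch to avoid clashing with std::advance. The helper usage_example is only there for illustration.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

namespace sketch {

// Forward iterators (and bidirectional ones, whose tag derives from
// forward_iterator_tag) can only be moved one step at a time.
template <typename Iterator, typename Distance>
void advance_impl(Iterator& it, Distance n, std::forward_iterator_tag)
{
    for (; n > 0; --n)
        ++it;
}

// Random-access iterators can jump n steps at once.
template <typename Iterator, typename Distance>
void advance_impl(Iterator& it, Distance n, std::random_access_iterator_tag)
{
    it += n;
}

// Hypothetical entry point: instantiates the tag that the iterator
// advertises and lets overload resolution pick the matching advance_impl.
template <typename Iterator, typename Distance>
void advance(Iterator& it, Distance n)
{
    using Category = typename std::iterator_traits<Iterator>::iterator_category;
    advance_impl(it, n, Category{});
}

} // namespace sketch

// Illustration only: the same call works on both kinds of iterators.
inline void usage_example()
{
    std::vector<int> v = { 1, 2, 3, 4, 5 };
    auto vit = v.begin();
    sketch::advance(vit, 3);   // routed to the random-access overload
    assert(*vit == 4);

    std::list<int> l = { 1, 2, 3, 4, 5 };
    auto lit = l.begin();
    sketch::advance(lit, 3);   // routed to the forward overload
    assert(*lit == 4);
}
```

The tag object carries no data: its only job is to have a type, so that the compiler selects an overload at compile time.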

How to use tag dispatching in your code

Even if it is not as technical as the implementation of the STL, you can still benefit from tag dispatching in your own code.

Let’s take the typical example of a class that has a default constructor (that is, taking no parameter) and where you want this constructor to behave in different ways depending on the context you’re calling it from.

In that case you can define your own tags. You can put them in the scope of the class itself to avoid polluting the global namespace:

class MyClass
{
public:
    struct constructThisWay{};
    struct constructThatWay{};

    // ...

And then you have the associated constructors:

class MyClass
{
public:
    struct constructThisWay{};
    struct constructThatWay{};

    explicit MyClass(constructThisWay);
    explicit MyClass(constructThatWay);

    // ...
};

These are no longer “default” constructors, because there is more than one of them. They are constructors that take no data, but that can behave in different ways. I used the keyword explicit because this is the default (no pun intended!) way to write a constructor accepting one parameter, in order to prevent implicit conversions. Unless you’re 100% sure that you want implicit conversions and that you know what you’re doing, better to block them.

The call site then looks like this:

MyClass x((MyClass::constructThisWay()));

Note the abundance of parentheses. This feeling of Lisp is a way to work around C++’s most vexing parse, as Scott Meyers calls it in Effective STL, Item 6. Indeed if you don’t double-parenthesize, the following code is parsed as a function declaration:

MyClass x(MyClass::constructThisWay());

(Note that we wouldn’t face the most vexing parse here if another parameter were passed to the constructor and that parameter wasn’t instantiated directly at the call site like the tag is).

One way out of this is to use uniform initialization, with braces {}:

MyClass x(MyClass::constructThisWay{});
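Putting the pieces together, here is a small self-contained sketch of the class with both call-site spellings. The constructor bodies, the description() accessor and the usage_example helper are assumptions of mine, added only so that we can observe which constructor ran.

```cpp
#include <cassert>
#include <string>

class MyClass
{
public:
    // Tag types: no data, no behaviour, they only select an overload.
    struct constructThisWay {};
    struct constructThatWay {};

    explicit MyClass(constructThisWay) : description_("this way") {}
    explicit MyClass(constructThatWay) : description_("that way") {}

    // Hypothetical accessor, only here to observe which constructor ran.
    std::string const& description() const { return description_; }

private:
    std::string description_;
};

// Illustration only: two ways to sidestep the most vexing parse.
inline void usage_example()
{
    MyClass x(MyClass::constructThisWay{});   // braces
    MyClass y((MyClass::constructThatWay())); // double parentheses
    assert(x.description() == "this way");
    assert(y.description() == "that way");
}
```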

But there is another way to have fewer parentheses or braces: declaring tag objects along with tag types. This makes for a less concise class definition, though:

class MyClass
{
public:
    static struct ConstructThisWay{} constructThisWay;
    static struct ConstructThatWay{} constructThatWay;

    explicit MyClass(ConstructThisWay);
    explicit MyClass(ConstructThatWay);
};

While the call site looks a little prettier:

MyClass x(MyClass::constructThatWay);

No more most vexing parse and no more braces, since the argument is now an object rather than a type. But this leads to more code in the class definition. It’s a trade-off. You choose.
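One detail to keep in mind with tag objects: a static data member declared inside the class is only a declaration, and passing it by value odr-uses it, so strictly speaking it also needs a definition somewhere. Here is one possible way to get that for free, assuming C++17 is available. The way() accessor, the constructor bodies and usage_example are hypothetical, added only to make the sketch observable.

```cpp
#include <cassert>

class MyClass
{
public:
    struct ConstructThisWay {};
    struct ConstructThatWay {};

    // Since C++17, static constexpr data members are implicitly inline,
    // so these declarations are also definitions: nothing to add in a .cpp.
    static constexpr ConstructThisWay constructThisWay{};
    static constexpr ConstructThatWay constructThatWay{};

    explicit MyClass(ConstructThisWay) : way_(1) {}
    explicit MyClass(ConstructThatWay) : way_(2) {}

    // Hypothetical accessor, only here to observe which constructor ran.
    int way() const { return way_; }

private:
    int way_;
};

// Illustration only: the call site needs neither braces nor extra parentheses.
inline void usage_example()
{
    MyClass x(MyClass::constructThisWay);
    MyClass y(MyClass::constructThatWay);
    assert(x.way() == 1);
    assert(y.way() == 2);
}
```

Before C++17, the usual alternative is to define each tag object once in a source file.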

Finally, whichever way you decide to go, nothing prevents you from having a real default constructor that takes no parameters, on top of all that:

class MyClass
{
public:
    static struct ConstructThisWay{} constructThisWay;
    static struct ConstructThatWay{} constructThatWay;

    MyClass();
    explicit MyClass(ConstructThisWay);
    explicit MyClass(ConstructThatWay);
};

Why not use enums instead?

A natural reaction when you first see this technique of tags in business code is to wonder: wouldn’t using an enum be a less convoluted way to get the same results?

In fact there are notable differences between using enums and using tags, and since there are quite a few things to say about that I’ve dedicated an entire post to when to use tag dispatching and when to use enums, coming next up in this series.

So back to tag dispatching.

When to use tag dispatching in your code

Use tag dispatching to provide additional information on behaviour, and strong types to provide additional information on the data.

My take on tag dispatching is that it should be used to customize behaviour, and not to customize data. Said differently, tag dispatching should be used to supplement the data passed to a constructor with additional information on behaviour.

To illustrate, I’m going to show you a bad example of using tag dispatching. This is a class that represents a circle, which can be constructed either with a radius or with a diameter. Both a radius and a diameter are numeric values of the same type, expressed, say, as double.

So a wrong usage of tag dispatching is this:

class Circle
{
public:
    struct buildWithRadius{};
    struct buildWithDiameter{};

    explicit Circle(double radius, buildWithRadius);
    explicit Circle(double diameter, buildWithDiameter);
};

What is wrong in this code is that the information about the data is spread over several arguments. To fix this we can use strong types rather than tag dispatching to add information to the data:

class Circle
{
public:
    explicit Circle(Radius radius);
    explicit Circle(Diameter diameter);
};
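To give an idea of what Radius and Diameter could look like, here is a minimal hand-written sketch. The wrapper classes, their get() method, the radius() accessor and usage_example are my assumptions for the sake of illustration, not the definitive way to write strong types.

```cpp
#include <cassert>

// Hypothetical minimal strong types: each one wraps a double and gives it
// a meaning that the compiler can check.
class Radius
{
public:
    explicit Radius(double value) : value_(value) {}
    double get() const { return value_; }
private:
    double value_;
};

class Diameter
{
public:
    explicit Diameter(double value) : value_(value) {}
    double get() const { return value_; }
private:
    double value_;
};

class Circle
{
public:
    // The type of the argument says what the number means: no tag needed.
    explicit Circle(Radius radius) : radius_(radius.get()) {}
    explicit Circle(Diameter diameter) : radius_(diameter.get() / 2) {}

    // Hypothetical accessor, only here to check the result.
    double radius() const { return radius_; }

private:
    double radius_;
};

// Illustration only: both circles end up with the same radius.
inline void usage_example()
{
    Circle c1{Radius{2.0}};
    Circle c2{Diameter{4.0}};
    assert(c1.radius() == 2.0);
    assert(c2.radius() == 2.0);
}
```

At the call site, the information about the data now travels with the data itself rather than in a separate argument.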

Curious about strong types? Check out this series of posts on strong types!

So use tag dispatching to provide additional information on behaviour, and strong types to provide additional information on the data.

If you find this guideline reasonable, you may wonder why the STL doesn’t follow it. Indeed, as seen above, the dispatch tags on the iterator categories are passed along with the iterator itself.

Not being a designer of the STL I could be wrong on that, but I can think of this: since the algorithm gets the iterator category from the iterator in a generic way, it would need a template template parameter to represent the strong type. Like ForwardIterator to be used like this: ForwardIterator<iterator>. And from the implementation of the iterator it may be less simple than specifying a tag. Or maybe it’s more code to define strong types. Or maybe it’s related to performance. Or maybe they just didn’t think about it this way. Frankly I don’t know, and would be glad to have your opinion on that.

Anyway, in your own business code, when there is no generic code creating intricate design issues, I recommend that you use tag dispatching to provide additional information on behaviour, and strong types to provide additional information on the data. It will make your interface that much clearer.
