Jonathan Boccara's blog

How to Use the STL With Legacy Output Collections

Published November 24, 2017 - 4 Comments

STL legacy code

When you start using the STL and its algorithms in your code, it’s a bit of a change of habits. And then after a while you get used to it. Then it becomes a second nature. And then even your dreams become organized into beautifully structured ranges that fly in and out of well-oiled algorithms.

And when you reach that point, there is no coming back.

Until the day you come upon an old legacy structure that won’t let itself approached by the elegant and expressive way of coding that STL algorithms have. It’s a terrible encounter, where the beast tries to suck you back into the lengthy and dangerous quicksand of the raw for loops that now seemed so far away.

I’ve faced that day with my valiant colleague Gauthier, and together we drove an epic fight until we forced the beast into a several-inch thick STL prison, where it could no longer harm the rest of the code. Ok, it wasn’t that epic. But anyway, let me tell you that tale so that you can use it if you face a similar situation. We’ll see the main component that allowed us to do this, custom_inserter, so that you don’t need to dress up for this fight again (I later realized that something very close existed in Boost, boost function output iterator, so you’ll prefer that if you can use Boost in your code).

In other words, let’s see how to use the STL algorithms with legacy input and outputs.

We’ve already touched upon legacy or user-defined inputs, by studying the design of the STL. So now we’ll focus on how to output the results of an algorithm into a legacy structure that wasn’t designed to be compatible with the STL.

STL legacy code

The case

I’m going to simplify the use case to the bare minimum to spend the less amount of time understanding it.

We have a collection of inputs, say in the form of a vector:

std::vector<Input> inputs = //...

and a function f that we want to apply to each one of them:

Output f(Input const& input);

This will result into as many Outputs. And we need to feed these outputs to an object that isn’t an STL container, and that doesn’t look like one. Maybe it’s an old C struct, or maybe it’s something more complicated. We’ll call this object legacyRepository, of type LegacyRepository. That’s the beast.

And legacyRepository comes with a function to add things into it:

void addInRepository(Output const& value, LegacyRepository& legacyRepository);

It doesn’t have to be of that particular form, but I’m choosing this one to illustrate, because it really doesn’t look like STL containers’ typical interface.

If we could replace the old repository by an std::vector, then we’d have used std::transform with std::back_inserter and be done with it:

std::transform(begin(inputs), end(inputs), std::back_inserter(repository), f);

But you can’t always refactor everything, and in this case we couldn’t afford to refactor this right now. So, how should we proceed?

A generalisation of std::back_inserter

I think we should take inspiration from std::back_inserter that outputs into a vector, in order to create a generalized component that can output into anything.

From this point on and until the end of this section I’m going to show you the reasoning and implementation that went into the component, custom_inserter. If you only want the resulting component then you can just hop on to the next section.

So, how does std::back_inserter works? It creates an output iterator, std::back_insert_iterator, that features the two required methods operator++ and operator*. But the real point of std::back_inserter is to take control on how the new values are assigned into the container it is linked to, and it does so with its operator=:

back_insert_iterator& operator=(T const& value)
{
    container_.push_back(value);
    return *this;
}

(This code wasn’t taken from any STL implementation, it is theoretical code to illustrate what std::back_inserter is doing.)

But then, how come it’s the operator= of std::back_insert_iterator that is called, and not the operator= of the type inside the collection? It’s because operator* doesn’t return an element of the collection, it rather keeps the control in the smart iterator:

back_insert_iterator& operator*(){ return *this; }

And operator++ must be implemented but doesn’t play a role in all this, so it is pretty much reduced to a no-op:

back_insert_iterator& operator++(){ return *this; }

This technique works well on containers that have a push_back method, but why not use the same mechanism for containers that have another interface?

custom_inserter

So let’s create our custom_insert_iterator, that, instead of taking a container, takes a custom function (or function object) to replace the call to push_back:

template<typename OutputInsertFunction>
class custom_insert_iterator
{
public:
    using iterator_category = std::output_iterator_tag;
    explicit custom_insert_iterator(OutputInsertFunction insertFunction) : insertFunction_(insertFunction) {}
    custom_insert_iterator& operator++(){ return *this; }
    custom_insert_iterator& operator*(){ return *this; }
    template<typename T>
    custom_insert_iterator& operator=(T const& value)
    {
        insertFunction_(value);
        return *this;
    }
private:
    OutputInsertFunction insertFunction_;
};

And the custom_inserter helper function to avoid to specify template parameters at call site:

template <typename OutputInsertFunction>
custom_insert_iterator<OutputInsertFunction> custom_inserter(OutputInsertFunction insertFunction)
{
    return custom_insert_iterator<OutputInsertFunction>(insertFunction);
}

Here is how we can use it:

std::copy(begin(inputs), end(inputs),
    custom_inserter([&legacyRepository](Output const& value){addInRepository(value, legacyRepository);}));

If you find this expression too cumbersome we can abstract the lambda:

auto insertInRepository(LegacyRepository& legacyRepository)
{
    return [&legacyRepository](Output const& value)
    {
        addInRepository(value, legacyRepository);
    };
}

in order to have a simpler call site:

std::transform(begin(inputs), end(inputs), custom_inserter(insertInRepository(legacyRepository)));

Couldn’t it be simpler?

As underlined by Nope in the comments section, this illustration is pretty simple and could be worked around with a simple code like:

for (const auto& input: inputs) addInRepository(f(input), lecgacyRepository);

Even though this code declares an input variable that is not necessary to express the idea of “applying f on the collection”, the above line of code is simpler than using a custom_inserter.

custom_inserter becomes really helpful to leverage on more elaborate STL algorithms, for example on the algorithms on sets:

std::set_difference(begin(inputs1), end(inputs1),
                    begin(inputs2), end(inputs2),
                    custom_inserter(insertInRepository(legacyRepository)));

Is this more or less legacy?

One could argue that we didn’t reduce the amount of legacy, because LegacyRepository hasn’t changed a bit, but a new non-standard component (or the one from Boost) has appeared on the top of it. So is it worth it?

I think we should weigh our other options in that situation. If we can get rid of the legacy and have a nice vector, or an otherwise STL-compatible interface instead (that is, that has at least a push_back method), then by all means we should do it. This way we’d have STL code all the way, and standard components to insert into the collection. This is the most desirable situation.

But if we can’t, or if it isn’t realistic on this particular piece of code (maybe it would take months or years to take down, or maybe this is an external API and we just don’t have control over it), the way I see it is that we’re facing two options: forgoing the usage of STL algorithms on this piece of code, with all the implications that we know, or using STL algorithms with our non-standard custom_inserter, which is not ideal because it is not standard, and it has a level of indirection. And next time you face this situation in your code, you’ll have to make a choice.

In all cases, custom_inserter is there for you, and don’t hesitate to give your feedback if you have any.

Related articles:

Don't want to miss out ? Follow:   twitterlinkedinrss
Share this post!Facebooktwitterlinkedin

Comments are closed