Jonathan Boccara's blog

Smart Output Iterators: A Symmetrical Approach to Range Adaptors

Published November 28, 2017 - 2 Comments

Smart output iterators

Some of the algorithms of the STL have a structure in common: they take one or more ranges in input, do something more or less elaborate with them, and produce an output in a destination range.

For example, std::copy merely copies the inputs to the outputs, std::transform applies a function onto the inputs and sends the results as outputs, and std::set_difference takes two input ranges and outputs to a destination range the elements that are in the first one but not in the second.

There are several ways to express this kind of input-operation-output structure on ranges in C++. To illustrate them, let’s take the example of std::transform since it is such a central algorithm in the STL.

To make the code examples lighter, let’s suppose that we have some modified versions of STL algorithms that take an input range instead of two iterators, for instance:

namespace ranges
{
template <typename Range, typename OutputIterator>
OutputIterator copy(Range const& range, OutputIterator out)
{
    return std::copy(range.begin(), range.end(), out);
}
}

and so on for other algorithms.

Various places to put the logic

The standard way to apply a function to each element and have the results added to a collection is to combine the std::transform algorithm with an output iterator such as std::back_inserter:

// f is a function to apply to each element of the collection
int f(std::string const& s);

std::vector<std::string> strings = { "So", "long", "and", "thanks", "for", "all", "the", "fish" };
std::vector<int> results;

ranges::transform(strings, std::back_inserter(results), f);

A more modern way, which logic we saw in Ranges: the STL to the Next Level, is to use ranges and range adaptors:

// f is a function to apply to each element of the collection
int f(std::string const& s);

std::vector<std::string> strings = { "So", "long", "and", "thanks", "for", "all", "the", "fish" };
std::vector<int> results;

ranges::copy(strings | ranges::view::transform(f), std::back_inserter(results));

We could even do away with the back_inserter here by using the push_back free function, but let’s keep it generic to take account of the case of sending outputs to a stream for example.

One interesting thing to note here is that the main action of the whole operation, which is applying the function f, has been transferred to the input range: strings | ranges::view::transform, taking this responsibility away from the algorithm. The algorithm then becomes simpler, becoming copy instead of transform.

When we see it from this perspective, we can see another way of structuring the operation. One that gets less publicity than the other ones, but that can have several advantages as we’ll see in just a moment: shifting the logic to the output iterator:

// f is a function to apply to each element of the collection
int f(std::string const& s);

std::vector<std::string> strings = { "So", "long", "and", "thanks", "for", "all", "the", "fish" };
std::vector<int> results;

ranges::copy(strings, transform_f(std::back_inserter(results)));

where transform_f is an output iterator that applies f and forwards this result to the std::back_inserter.

Note that with this approach the input range is simple (strings), the algorithm is simple too (ranges::copy) and the responsibility of applying f has been moved over to the output iterator.

Is this form helpful at all?

The case for smart output iterators

Let’s take a case where standard algorithms aren’t practical to use: the case of “transform if” for example. This is a case where we’d like to apply a function to only the elements of a collection that satisfy a predicate. It is cumbersome to do with the STL because STL algorithms don’t chain up well:

int f(int);

std::vector<int> numbers = {1, 2, 3, 4, 5};
std::vector<int> evenNumbers;
copy_if(numbers, std::back_inserter(evenNumbers), isEven);
std::vector<int> results;
transform(evenNumbers, std::back_inserter(results), f);

So let’s say that the first way using STL algorithms is out. We’re left with two options:

  • using ranges:
int f(int);

std::vector<int> numbers = {1, 2, 3, 4, 5};
std::vector<int> results;

ranges::copy(numbers | ranges::view::filter(isEven) | ranges::view::transform(f), std::back_inserter(results);
  • using smart output iterators:
int f(int);

std::vector<int> numbers = {1, 2, 3, 4, 5};
std::vector<int> results;

ranges::copy(numbers, filter_even(transform_f(std::back_inserter(results))));

Smarter output iterators

Ranges are more and more the default solution in this case, and the direction that the STL is taking for the future. However, there are several reasons why it can be interesting to consider giving some responsibility to output iterators.

The first reason is that for the algorithms taking more than one range in input, for example std::set_difference and the other algorithms on sets, to my knowledge you can’t use traditional range adaptors to apply a transformation to the outputs of the algorithms. Indeed, ranges adaptors could modify either one or both of the input ranges:

set_difference(range1 | adaptor1,
               range2 | adaptor2,
               outputIterator);

But how could they apply a transformation on the outputs of the algorithms before sending them to the outputIterator, like a smart output iterator would do?

EDIT: in fact, the STL algorithms on sets are not such a good example of absolute necessity for smart output iterators, since range-v3 turns out to have view adaptors on sets algorithms. But there are still other cases where they are necessary, for instance algorithms that have several outputs. The STL only has std::partition_copy, but it’s very useful to extend the STL with more elaborate algorithms such as set_segregate, which has multiple outputs. In this case, smart output iterators become very handy.

A second reason is that smart output iterators could better express that some transformations are not semantically related to the algorithm, but rather to how the output collection stores its elements. To illustrate, let’s consider the case where the output container stores BigInts instead of ints. And this BigInt class doesn’t allow implicit conversion because its designer was wary of implicit conversions.

So our function f here would convert an int into a BigInt, simply by calling its constructor:

BigInt make_bigint(int i)
{
    return BigInt(i);
}

In this case, when reading the code we don’t really care about the fact that f is called. It has to be there, otherwise the code wouldn’t compile, but the meaningful part in the code is arguably the application of the predicate isEven. Shifting this application of f to the output iterator is a way to convey this message: this is just to make the outputs fit into the output container, much like std::back_inserter is.

So we could delegate the responsibility of the conversion to the output iterator side and mix both ranges and output iterators:

int f(int);

std::vector<int> numbers = {1, 2, 3, 4, 5};
std::vector<BigInt> results;

ranges::copy(numbers | ranges::view::filter(isEven),
             bigint_convert(std::back_inserter(results)));

or we could just use the STL algorithm, here copy_if:

int f(int);

std::vector<int> numbers = {1, 2, 3, 4, 5};
std::vector<BigInt> results;

ranges::copy_if(numbers,
                bigint_convert(std::back_inserter(results)),
                isEven);

Another reason is a very practical one: smart output iterators are light-weight components that are relatively easy and quick to implement (much easier than ranges, I’ve tried to implement both) even in C++03. We see an example of that in the next section. So if you don’t have access to Boost Ranges or range-v3, they can be a practical way to make your code more concise. We’ll see an implementation in the next section of this article.

Finally, a last reason to consider smart output iterators is that they are a different way to go about structuring the call to an algorithm. And just for that reason, they can expand our view and give us more perspective on the topic of applying algorithms!

Implementing smart output iterators

To follow up on the above example with BigInt, let’s make a generic output iterator that takes a function, applies it to the value it receives, and sends the result on to the iterator that it wraps (a std::back_inserter for example).

Here is a complete implementation, that we detail bit by bit just after:

template<typename Iterator, typename TransformFunction>
class output_transform_iterator
{
public:
    using iterator_category = std::output_iterator_tag;

    explicit output_transform_iterator(Iterator iterator, TransformFunction transformFunction) : iterator_(iterator), transformFunction_(transformFunction) {}
    output_transform_iterator& operator++(){ ++iterator_; return *this; }
    output_transform_iterator& operator++(int){ ++*this; return *this; }
    output_transform_iterator& operator*(){ return *this; }
    template<typename T>
    output_transform_iterator& operator=(T const& value)
    {
        *iterator_ = transformFunction_(value);
        return *this;
    }
private:
    Iterator iterator_;
    TransformFunction transformFunction_;
};

template<typename TransformFunction>
class output_transformer
{
public:
    explicit output_transformer(TransformFunction transformFunction) : transformFunction_(transformFunction) {}
    template<typename Iterator>
    output_transform_iterator<Iterator, TransformFunction> operator()(Iterator iterator) const
    {
        return output_transform_iterator<Iterator, TransformFunction>(iterator, transformFunction_);
    }
    
private:
    TransformFunction transformFunction_;
};

template<typename TransformFunction>
output_transformer<TransformFunction> make_output_transformer(TransformFunction transformFunction)
{
    return output_transformer<TransformFunction>(transformFunction);
}

Here is how this code works:

The generic elements of the smart iterator are:

  • the function to apply,
  • the iterator it wraps.

So let’s makes these two template parameters:

template<typename Iterator, typename TransformFunction>
class output_transform_iterator

Let’s accept those two parameters in the constructor and store them in our smart iterator:

    output_transform_iterator(Iterator iterator, TransformFunction transformFunction) : iterator_(iterator), transformFunction_(transformFunction) {}

private:
    Iterator iterator_;
    TransformFunction transformFunction_;

We need to implement the operators of an output iterator: operator++ advances the underlying iterator. Advancing the underlying iterator is a no-op in std::back_inserter, but is necessary if the underlying output iterator is the begin of a container for example.

output_transform_iterator& operator++(){ ++iterator_; return *this; }

And like for std::back_inserter and custom_inserter, we use operator* to return the iterator itself and keep control of operator= to apply the function and pass the result to the underlying iterator:

output_transform_iterator& operator*(){ return *this; }
template<typename T>
output_transform_iterator& operator=(T const& value)
{
    *iterator_ = transformFunction_(value);
    return *this;
}

That’s about it, except that the interface isn’t quite right: we would like an iterator that wraps over another iterator, and not one that also takes a function in its constructor:

bigint_convert(std::back_inserter(results))

Said differently, we’d like to partially apply the constructor with the transform function, here make_bigint, retrieve the object and give it an underlying iterator at a later time.

To simulate partial function application of a function in C++, we can use a function object:

template<typename TransformFunction>
class output_transformer
{
public:
    explicit output_transformer(TransformFunction transformFunction) : transformFunction_(transformFunction) {}
    template<typename Iterator>
    output_transform_iterator<Iterator, TransformFunction> operator()(Iterator iterator) const
    {
        return output_transform_iterator<Iterator, TransformFunction>(iterator, transformFunction_);
    }
    
private:
    TransformFunction transformFunction_;
};

Indeed, the parameters are applied in two phases: the first one in the constructor and the second one in the operator().

Finally, to create a transformer we use a helper function to deduce the template parameter of the transform function:

template<typename TransformFunction>
output_transformer<TransformFunction> make_output_transformer(TransformFunction transformFunction)
{
    return output_transformer<TransformFunction>(transformFunction);
}

This implementation is compatible with C++03 (and I didn’t see how to use lambdas to make it clearer anyway). Note though that in C++17 we wouldn’t need the make_output_transformer function thanks to the type deduction in class template constructors.

Sweeping low-level operations under the rug

By using the smart output iterator we can now make the conversion to BigInt more discrete at the call site:

//C++03
output_transformer<BigInt(*)(int)> const bigint_converter = make_output_transformer(make_bigint);

//C++11
auto const bigint_converter = make_output_transformer(make_bigint);

//C++17
auto const bigint_converter = output_transformer(make_bigint);

int f(int);

//Call site
std::vector<int> numbers = {1, 2, 3, 4, 5};
std::vector<BigInt> results;

ranges::copy(numbers | ranges::view::filter(isEven),
             bigint_convert(std::back_inserter(results)));

Will smart output iterators compete with ranges on all use cases? Certainly not. But to express that an operation is more closely related to the output container than to the algorithm itself, they can constitute an alternative worth having in our toolbox.

output_transformer and other smart output iterators are available in the smart-output-iterators GitHub repository.

Related articles:

Don't want to miss out ? Follow:   twitterlinkedinrss
Share this post!Facebooktwitterlinkedin

Comments are closed