Jonathan Boccara's blog

Applying Several Transforms in One Pass on a Collection

Published February 22, 2019 - 0 Comments

Applying a function to each element of a collection and outputting the results into another collection is a very common thing to do, in C++ or elsewhere.

In C++, we have the std::transform algorithm to do this, a central piece of the STL algorithms library.

To illustrate, consider the following program:

#include <algorithm>
#include <iterator>
#include <vector>
#include <iostream>

int times2(int n)
{
    return n * 2;
}

int main()
{
    auto const inputs = std::vector<int>{0, 1, 2, 3, 4, 5};
    auto outputs = std::vector<int>{};
    
    std::transform(begin(inputs), end(inputs), back_inserter(outputs), times2);
    
    for (auto const& output : outputs)
    {
        std::cout << output << ' ';
    }
}

It outputs this:

0 2 4 6 8 10

The output iterator we’re using here, std::back_inserter, forwards the data it receives to the push_back method of the outputs collection.

But can we apply several functions to each elements of the collection, and output the results into several collections?

With standard algorithms, we can’t. But with smart output iterators, we could.

Smart output iterators, you said?

When we explored smart output iterators, we saw that we could write the above code differently, by pushing the logic out of the algorithm and  towards the output iterator.

The code using smart output iterators and equivalent to the previous example would be this:

std::vector<int> input = {1, 2, 3, 4, 5};
std::vector<int> results;

auto const times2 = fluent::output::transform([](int i) { return i*2; });
std::copy(begin(input), end(input), times2(back_inserter(results)));

Note that we no longer use std::transform but rather std::copy which does less things, and the logic has been transferred to times2, which is now an outputs iterator. times2 receives data from std::copy, multiplies it by 2, and sends the result on to the good old back_inserter.

This is no longer standard C++. This relies on the Smart Output Iterators library, that provides amongst other things the transform iterator. For more details about smart outputs iterators, you can check out the library, or this introductory blog post.

The characteristic aspect of smart outputs iterators are their position: in the output of the algorithm. Let’s take advantage of their position to do something that an algorithm can’t do: applying several functions on the same collection.

Applying several functions to the elements of a collection

This is something that happens in our everyday programming life: you have several functions, and you’d like to apply each of them to the elements of your collection.

Let’s enrich the transform output iterator so that it supports more than one function. For example, we’d like to be able to write code like this:

std::vector<int> input = {0, 1, 2, 3, 4, 5};

auto const times234 = fluent::output::transform([](int i) { return i*2; },
                                                [](int i) { return i*3; },
                                                [](int i) { return i*4; });

std::vector<int> results1;
std::vector<int> results2;
std::vector<int> results3;

std::copy(begin(input), end(input),
          times234(back_inserter(results1),
                   back_inserter(results2),
                   back_inserter(results3)));

This would apply each of the 3 functions defined in the output iterators to each of the elements of the collections, and dispatch the results in 3 corresponding collections (results1, results2, results3).

So if we print out the contents of the output collections, for example with this code:

for (auto const& result : results1) { std::cout << result << ' '; }
std::cout << '\n';
for (auto const& result : results2) { std::cout << result << ' '; }
std::cout << '\n';
for (auto const& result : results3) { std::cout << result << ' '; }
std::cout << '\n';

We’d like it to display this output:

0 2 4 6 8 10
0 3 6 9 12 15
0 4 8 12 16 20

Can we do this? Yes we can, and we’ll see the implementation in just a moment.

But before that, let’s reflect on the interest of this feature. Let’s compare the code using standard algorithms to achieve the same thing:

std::vector<int> input = {0, 1, 2, 3, 4, 5};

std::vector<int> results1;
std::vector<int> results2;
std::vector<int> results3;

std::transform(begin(input), end(input), back_inserter(results1), [](int i) { return i*2; });
std::transform(begin(input), end(input), back_inserter(results2), [](int i) { return i*3; });
std::transform(begin(input), end(input), back_inserter(results3), [](int i) { return i*4; });

This code can be seen as more straightforward that the one above using smart output iterators because it just repeats the same pattern. And it can also be seen as less straightforward because it makes several passes on the same collection, whereas the one using smart output iterators only makes one pass.

The interest of using smart output iterators becomes even clearer when there is more than just applying a function. If you’d like to use filters, for example (or any other output iterator in the library, including applying other functions with the transform iterator), the code using smart output iterators would look like this:

std::copy(begin(input), end(input),
          times234(aFilter(back_inserter(results1)),
                   back_inserter(results2),
                   anotherFilter(back_inserter(results3))));

Whereas using the standard algorithms doesn’t scale well:

std::transform(begin(input), end(input), back_inserter(notFilteredResults1), [](int i) { return i*2; });
std::copy_if(begin(notFilteredResults1), end(notFilteredResults1), back_inserter(results1), aFilter);
std::transform(begin(input), end(input), back_inserter(results2), [](int i) { return i*3; });
std::transform(begin(input), end(input), back_inserter(notFilteredResults3), [](int i) { return i*4; });
std::copy_if(begin(notFilteredResults3), end(notFilteredResults3), back_inserter(results3), anotherFilter);                   

Let’s now implement the possibility for the transform output iterator to have multiple outputs.

Implementing the multiple transform output iterator

We’ll pick up where we left off in the introductory blog post: we have a transform output iterator that already support one output:

template<typename Iterator, typename TransformFunction>
class output_transform_iterator
{
public:
    using iterator_category = std::output_iterator_tag;
 
    explicit output_transform_iterator(Iterator iterator, TransformFunction transformFunction) : iterator_(iterator), transformFunction_(transformFunction) {}
    output_transform_iterator& operator++(){ ++iterator_; return *this; }
    output_transform_iterator& operator++(int){ ++*this; return *this; }
    output_transform_iterator& operator*(){ return *this; }
    template<typename T>
    output_transform_iterator& operator=(T const& value)
    {
        *iterator_ = transformFunction_(value);
        return *this;
    }
private:
    Iterator iterator_;
    TransformFunction transformFunction_;
};

The iterator contains two things:

  • another iterator, to which it sends its results (for example it can be a back_inserter),
  • the function to apply (which can also be a lambda–it is defined as a template parameter).

To have several outputs, the iterator must now contains:

  • a collection of iterators to send results to,
  • a collection of functions to apply.

And we must fit all this in the template parameter. The template parameters for one output look like this:

template<typename Iterator, typename TransformFunction>

It would be nice to be able to write then:

template<typename... Iterators, typename... TransformFunctions>

But we can’t: C++ requires that the template parameters variadic pack be at the end of the template parameters (and as a result, there can be only one variadic pack).

To work around this constraint, we can pack up one group of parameters into one parameter, by using a tuple. Let’s make this appear in its name:

template<typename TransformFunctionTuple, typename... Iterators>

We choose to pack up the functions, because it will make the implementation of other parts of the iterator easier.

As a result, the data members of the iterator, that used to be those:

    Iterator iterator_;
    TransformFunction transformFunction_;

Now become these:

    std::tuple<Iterators...> iterators_;
    TransformFunctionTuple transformFunctionTuple_;

And we expect TransformFunctionTuple to be a std::tuple of functions and/or lambdas to apply.

We now have to apply each function to the value incoming in operator=, and send the result to the corresponding output iterator.

For this we need to be able to apply a function to the elements of two tuples. We already came across this need in the past, when implementing the unzip output iterator. We came up then with the apply2 function. You can check out the details of its implementation there.

By using apply2, the implementation of operator= goes from this:

    *iterator_ = transformFunction_(value);

To this:

   apply2([&value](auto&& function, auto&& iterator){ *iterator = function(value); },
           transformFunctionTuple_,
           iterators_);

The rest of the adaptation consists in passing on the variadic template parameters from the transform function that creates the output iterator to the actual iterator class above output_transform_iterator. They don’t contain any specific difficulty and you can see them in the commit introducing the feature in the library.

A new range of possibilities

This feature of outputting the results of several functions to several outputs seems like an important addition to the smart output iterators library.

For example, the unzip output iterator, that takes a tuple (or a pair) and sends its various pieces to as many output collections sounds like an application of our new transform iterator. The functions to apply would be the std::get<N> functions (or .first and .second for the std::pair).

To explore this in more detail, in the next post we will try to implement the unzip output iterator with the transform output iterator.

Stay tuned!

You will also like

Don't want to miss out ? Follow:   twitterlinkedinrss
Share this post!Facebooktwitterlinkedin