Jonathan Boccara's blog

Is Unzip a Special Case of Transform?

Published February 26, 2019 - 0 Comments

In the Smart Output Iterators library, the unzip output iterator allows to send the various elements contained in tuples or pairs to as many output collections:

std::vector<std::tuple<int, int, int>> lines = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11, 12} };
std::vector<int> column1, column2, column3;

std::copy(begin(lines), end(lines), fluent::output::unzip(back_inserter(column1), back_inserter(column2), back_inserter(column3)));

This is a way to transpose a collection of lines into a collection of columns. Indeed, after executing the above code, column1 contains {1, 4, 7, 10}, column2 contains {2, 5, 8, 11}, and column3 contains {3, 6, 9, 12}.

unzip also applies to maps, because they contains std::pairs of keys and values:

std::map<int, std::string> entries = { {1, "one"}, {2, "two"}, {3, "three"}, {4, "four"}, {5, "five"} };

std::vector<int> keys;
std::vector<std::string> values;

std::copy(begin(entries), end(entries), fluent::output::unzip(back_inserter(keys), back_inserter(values)));

After executing this code, keys contains {1, 2, 3, 4, 5}, and values contains {"one", "two", "three", "four", "five"}.

For more about the unzip iterator, check out its dedicated post.

The transform iterator with multiple outputs

The smart output iterators library also has a transform output iterator. Its job is to apply a function to the data it receives, and to send the result on to another iterator:

std::vector<int> input = {1, 2, 3, 4, 5};
std::vector<int> results;

auto const times2 = fluent::output::transform([](int i) { return i*2; });

std::copy(begin(input), end(input), times2(back_inserter(results)));

After this code, results contains {2, 4, 6, 8, 10}.

For more about the transform iterator and about smart output iterators in general, check out this introductory post.

More recently, we generalised the transform output iterator so that it can take several functions to apply to each element of the collection, and send their results to as many output iterators:

std::vector<int> input = {1, 2, 3, 4, 5};

auto const multiply = fluent::output::transform([](int i) { return i*2; },
                                                [](int i) { return i*3; },
                                                [](int i) { return i*4; });

std::vector<int> results1;
std::vector<int> results2;
std::vector<int> results3;

std::copy(begin(input), end(input), multiply(std::back_inserter(results1), std::back_inserter(results2), std::back_inserter(results3)));

After executing this code, expected1 contains {2, 4, 6, 8, 10}, expected2 contains {3, 6, 9, 12, 15}, and expected3 contains {4, 8, 12, 16, 20}.

Given all this, don’t you think that unzip seems like a special case of transform?

Indeed, unzip consists in applying std::get<0> on the incoming tuple or pair and sending the result to one output iterator, applying std::get<1> and sending its results to another output, applying std::get<2> and sending its result to yet another output, and so on.

It sounds as if we could implement unzip with transform, std::get and a pinch of variadic templates. Let’s try to code this up.

Implementing unzip with transform

If you look back at the first example of unzip above, you can see it is used this way:

unzip(back_inserter(column1), back_inserter(column2), back_inserter(column3))

The prototype of unzip is this:

template<typename... Iterators>
auto unzip(Iterators... iterators)
{
    //...

We need to keep this prototype, and implement the function with the transform output iterator.

To do this we need to do two things:

  • create the transform output iterator containing the functions to apply (the std::get<I>s)
  • apply it to the iterators... pack

The second one being the easiest, let’s focus on the first one: creating the transform output iterator.

As a reminder, the transform output iterator takes its functions this way:

transform([](int i) { return i*2; },
          [](int i) { return i*3; },
          [](int i) { return i*4; });

A variadic pack of integers

It would be nice to write something like transform(std::get<Is>...), but for this we need a variadic pack of Is... going from 0 to the number of elements in the Iterators... pack minus one.

The C++ standard component that creates variadic packs of consecutive integers is make_index_sequence. Let’s use it to create the pack of integers by passing it sizeof...(Iterators), which is the number of elements in the Iterators... pack:

template<size_t... Is>
auto make_transform(std::index_sequence<Is...> const&)
{
    // see below
}
    
template<typename... Iterators>
auto unzip(Iterators... iterators)
{
    return make_transform(std::make_index_sequence<sizeof...(Iterators)>{})(iterators...);
}

A better options, as suggested by Darell (who goes by the the Twitter handle of @beached_whale), is to use the more direct std::index_sequence_for:

template<typename... Iterators>
auto unzip(Iterators... iterators)
{
    return make_transform(std::index_sequence_for<Iterators...>{})(iterators...);
}

A variadic pack of std::gets

Now that we have the variadic pack of integers, we need to implement make_transform in order for it to return a transform output iterator containing the std::get<Is>.... But we can’t just write this:

template<size_t... Is>
auto make_transform(std::index_sequence<Is...> const&)
{
    return transform(std::get<Is>...);
}

Indeed, std::get<I> has 4 overloads: which deal with all four combinations of lvalue/rvalue, and const/non-const inputs. And we can’t pass an overloaded function as a parameter, because the compiler doesn’t know which overload to choose.

One way to work around this constraint is to group those functions into a function object. And while we’re at it, we can group them into a template function inside that function object too, working on any type that std::get applies to, so that would include std::pairs too.

One way would be to explicitly define a function object, such as:

template <size_t I>
struct Get
{
    template <typename Tuple>
    decltype(auto) operator()(Tuple&& tuple)
    {
        return std::get<I>(FWD(tuple));
    }
};

FWD is a useful macro I learnt from Vittorio Romeo, that alleviates the syntax of calling std::forward:

#define FWD(value) std::forward<decltype(value)>(value)

But Seph De Busser pointed out a more direct way: directly use a variadic pack of lambdas!

template<size_t... Is>
auto make_transform(std::index_sequence<Is...> const&)
{
    return transform([](auto&& tup){return std::get<Is>(FWD(tup));}...);
}

We finally put all this together to create the iterator returned by the unzip function:

template<typename... Iterators>
auto unzip(Iterators... iterators)
{
    return make_transform(std::index_sequence_for<Iterators...>{})(iterators...);
}

With this new implementation, the unit tests of unzip keep passing. Yay!

Other smart output iterators

Do you see other applications of the transform output iterator?

Can you think of other smart output iterators that would make your code simpler?

Let me know in a comment below.

Don't want to miss out ? Follow:   twitterlinkedinrss
Share this post!Facebooktwitterlinkedin