Jonathan Boccara's blog

How to Generate All the Combinations from Several Collections

Published March 18, 2022 - 0 Comments

Generating all the possible combinations from a set of collections and applying a function to each combination is a need that comes up often in programming.

This is called a “Cartesian product”.

For example, this kind of operation is necessary in the cartesian_product range adaptor, in the cartesian_product pipe, and in the killer feature of verifyAllCombinations in the ApprovalTest.cpp library, to cite just a few.

The most basic usage of a Cartesian product looks like this:

auto const inputs1 = std::vector<int> {1, 2, 3};
auto const inputs2 = std::vector<std::string>{"up", "down"};
auto const inputs3 = std::vector<std::string>{"blue", "red"};

cartesian_product(displayCombination,
                  inputs1,
                  inputs2,
                  inputs3);

Where displayCombination is a function that takes one element from each collection:

void displayCombination(int input1, std::string const& input2, std::string const& input3)
{
    std::cout << input1 << '-' << input2 << '-' << input3 << '\n';
}

The above code generates all the possible combinations of the elements coming from the three input collections and sends each combination to displayCombination. The output of this program is then:

1-up-blue
1-up-red
1-down-blue
1-down-red
2-up-blue
2-up-red
2-down-blue
2-down-red
3-up-blue
3-up-red
3-down-blue
3-down-red

A few days ago I published a blog post that suggested that you try and code it up yourself. It’s not too late to try! Indeed, implementing cartesian_product is a very instructive experiment.

Let’s see one way of implementing such a cartesian_product in C++.

The main loop

The main body of the cartesian_product function consists in iterating over all the combinations of the elements coming from the input ranges, and sending each of those combinations to the incoming function.

In pseudo-code, that loops looks like this:

template<typename Function, typename... Ranges>
void cartesian_product (Function function, Ranges const&... ranges)
{
    for(combination = first combination;
        we finished iterating;
        go to next combination)
    {
        call function on that combination
    }
}

Our goal is now to transform each of those bits of pseudo-code into real C++ code.

If there were only one collection, the above code would have used an iterator on that collection. To generalise from this, we can use a tuple of iterators: each element of the tuple contains an iterator to an element of one of the input ranges.

We then need to be able to:

  • instantiate this tuple of iterators with the first combination,
  • call the incoming function on the current combination
  • make it advance to the next combination,
  • identify when we’ve been over all the combinations.

By implementing those 4 steps, we’ll be able to flesh out the above pseudo-code into compiling C++ code.

Instantiating the first combination

The first combination is probably the easiest one to create: just take an iterator to the first element of each collection:

template<typename Function, typename... Ranges>
void cartesian_product (Function function, Ranges const&... ranges)
{
    auto const beginIterators = std::make_tuple(begin(ranges)...);

    for(auto combination = beginIterators;
        we finished iterating;
        go to next combination)
    {
        call function on that combination
    }
}

Calling the function on a combination

Now we have a tuple of iterators that represents a combination. But the incoming function doesn’t take iterators as parameters, much less a tuple of them.

We therefore need to do two things: break up the tuple in individual elements, and dereference each of those elements.

We’ll do this in the opposite order: we will first create a tuple of references to the elements (as opposed to iterators), then break up this tuple to send individual parameters to the incoming function.

Creating a tuple of references

Creating a tuple of references out of a tuple of iterators consists in applying a function (here, operator*) on each element. This sounds like a std::transform but for tuples.

To achieve that we can use one of our algorithm on tuples:

template<typename... Ts>
auto dereference(std::tuple<Ts...> const& tuple)
{
    return tuple_algos::transform(tuple, [](auto&& element) -> decltype(auto) { return *element; });
}

We need to be explicit about the return type of the lambda by using decltype(auto), because the default type would have returned a copy and not a reference to the element referenced by the iterator. For more about the difference between auto and decltype(auto), read item 3 of Effective Modern C++.

Breaking up a tuple into individual function parameters

Now that we have a tuple of references, we have to pass each of them as an argument to the incoming function.

This is exactly what std::apply does:

template<typename Function, typename... Ranges>
void cartesian_product (Function function, Ranges const&... ranges)
{
    auto const beginIterators = std::make_tuple(begin(ranges)...);

    for(auto combination = beginIterators;
        we finished iterating;
        go to next combination)
    {
        std::apply(function, dereference(combination));
    }
}

std::apply comes with C++17. If you don’t have access to C++17 yet, at the end of this post I will point you to adapted implementations of cartesian_product for C++11 and C++14.

We’re now left with the iteration itself.

Generating the next collection

If we have a given combination, what should the next one be?

The way that sounds the most natural to be it to iterate over combinations in a lexicographical order:

  • 1) increment the last iterator until reaching the end of the last collection,
  • 2) when we reach the end of the last collection, increment the iterator of the collection before last, and reset the iterator of the last collection to its beginning,
  • 3) repeat the two previous steps, until reaching the end of the collection before last,
  • then increment the iterators of the collection before the one that is before last,
  • repeat the previous steps,
  • and so on.

Let’s implement this recursive definition of the traversal of the collection.

To start, let’s implement the general step of incrementing the iterator of the I-th collection (the calling code with call this with I = N-1 to increment the last iterator, as in the algorithm described above):

template<size_t I, typename... Iterators>
void increment_iterator(std::tuple<Iterators...>& iterators,
                        std::tuple<Iterators...> const& beginIterators,
                        std::tuple<Iterators...> const& endIterators)
{
    auto& it = std::get<I>(iterators);
    auto const begin = std::get<I>(beginIterators);
    auto const end = std::get<I>(endIterators);
    
    ++it; // step 1) of the algorithm above
    
    if (it == end)
    {
        it = begin; // step 2) of the algorithm above
        increment_iterator<I - 1>::_(iterators, beginIterators, endIterators);  // step 3) of the algorithm above
    }
}

The comments in the above snippet are not to be kept in the code, they just indicate which lines of code correspond to the steps listed in the recursive algorithms we described earlier.

We then need to implement the final step of the algorithm: incrementing the iterator in the first collection.

To do that, we need to specialise the above template for I = 0, to just increment the first iterator.

This would be a partial specialisation of the template, because the Iterators... types would still be template parameters. But since we can’t partially specialise template functions, we need to do the usual trick of wrapping them as static function in a template struct.

The whole code of increment_iterator then becomes:

template<size_t I>
struct increment_iterator
{
    template<typename... Iterators>
    static void _(std::tuple<Iterators...>& iterators, std::tuple<Iterators...> const& beginIterators, std::tuple<Iterators...> const& endIterators)
    {
        auto& it = std::get<I>(iterators);
        auto const begin = std::get<I>(beginIterators);
        auto const end = std::get<I>(endIterators);
        
        ++it;
        
        if (it == end)
        {
            it = begin;
            increment_iterator<I - 1>::_(iterators, beginIterators, endIterators);
        }
    }
};

template<>
struct increment_iterator<0>
{
    template<typename... Iterators>
    static void _(std::tuple<Iterators...>& iterators, std::tuple<Iterators...> const&, std::tuple<Iterators...> const&)
    {
        auto& it = std::get<0>(iterators);
        
        ++it;
    }
};

We can now use increment_iterator to generate the next combination:

template<typename... Iterators>
void next_combination(std::tuple<Iterators...>& iterators,
                      std::tuple<Iterators...> const& beginIterators,
                      std::tuple<Iterators...> const& endIterators)
{
    constexpr auto N = sizeof...(Iterators);
    increment_iterator<N - 1>::_(iterators, beginIterators, endIterators);
}

Finally, we can use this in our main loop:

template<typename Function, typename... Ranges>
void cartesian_product (Function function, Ranges const&... ranges)
{
    auto const beginIterators = std::make_tuple(begin(ranges)...);
    auto const endIterators = std::make_tuple(end(ranges)...);

    for(auto combination = beginIterators;
        we finished iterating;
        next_combination(combination, beginIterators, endIterators))
    {
        std::apply(function, dereference(combination));
    }
}

This was the hardest part! The only step left is to know when to stop incrementing the iterators of the current combination.

Identifying the end of the combinations

Given our above way of going to the next permutation, we reach the last permutation when we reach the end of the first collection.

This make the stopping condition pretty simple:

template<typename Function, typename... Ranges>
void cartesian_product (Function function, Ranges const&... ranges)
{
    auto const beginIterators = std::make_tuple(begin(ranges)...);
    auto const endIterators = std::make_tuple(end(ranges)...);

    for(auto combination = beginIterators;
        std::get<0>(combination) != std::get<0>(endIterators);
        next_combination(combination, beginIterators, endIterators))
    {
        std::apply(function, dereference(combination));
    }
}

The case of en empty collection

There is at least one case that the above code does not cover: the case where there is an empty collection.

Indeed, with an empty collection, we should not dereference the iterator coming from that collection. What to do then?

Let’s go back to the need: what does it mean to generate all possible combinations of the elements of several collections when one if them is empty? It means: to do nothing at all!

For this reason, we can check that all collections contain data before starting the loop, in order to avoid dereferencing iterators that don’t reference data, which would make the application crash.

To do this we can use our any_of algorithm on tuple (while we’re at it, let’s also static_assert that there is more than one incoming range):

template<typename Function, typename... Ranges>
void cartesian_product (Function function, Ranges const&... ranges)
{
    static_assert(sizeof...(Ranges) > 0, "There should be at least one range in cartesian_product.");
    auto const hasEmptyRange = tuple_algos::any_of(std::forward_as_tuple(ranges...), [](auto&& range){ return range.size() == 0; });

    if (!hasEmptyRange)
    {
        auto const beginIterators = std::make_tuple(begin(ranges)...);
        auto const endIterators = std::make_tuple(end(ranges)...);
        
        for (auto combination = beginIterators; std::get<0>(combination) != std::get<0>(endIterators); next_combination(combination, beginIterators, endIterators))
        {
            std::apply(function, dereference(combination));
        }
    }
}

Here is all the code put together.

Not yet at the latest version of C++?

The above code uses several features of C++14 (auto in lambda parameters), and C++17 (std::apply).

If you’re still in the process of upgrading to the latest and greatest version of C++ but are not quite there yet (a lot of companies are like you!) then you can find a version of this code using only C++14, and one using only C++11 (thanks mika-fischer).

You will also like

Don't want to miss out ? Follow:   twitterlinkedinrss
Share this post!Facebooktwitterlinkedin