Jonathan Boccara's blog

Algorithms on Sets That Return a Boolean: Strong Templates Interface

Published August 21, 2020 - 0 Comments

In the last episode of our series on sets, we have designed set_bool_information, a generic algorithm that provides multiple ways to compare two sets.

Even if our initial implementation does the job, its interface takes several similar parameters, and it is not clear at call site which means what. To make the call site more expressive, we will improve the interface with “strong templates”, which allow to express the role of each type at call site.

Finally, we will write the implementation of the various algorithms.

This post is part of the series on algorithms on sets, that now contains:

  • How to check if 2 sorted collections have a common element
  • set_match: Matching up Elements Between Sorted Collections
  • Implementing set_match in One Line of Code
  • STL algorithms on sets: one algorithm to implement them all
  • Algorithms on set returning a boolean: exploring the algorithms
  • Algorithms on set returning a boolean: implementing a generic algorithm
  • Algorithms on set returning a boolean: a strong template interface
  • NWH: Adapting algoritms on sets

Previously, in the algorithms on sets

In case you’re just joining the series, or need a refresher, here is a brief recap of the previous posts of the series. For more details, check out the individual posts indicated above. Otherwise, you can skip to the next section.

All the algorithms on sets have the same basic structure:

template <typename SetA, typename SetB, typename Compare>
bool algo(SetA&& setA, SetB&& setB, Compare comp)
{
    auto xA = setA.begin();
    auto xB = setB.begin();
    while (xA != setA.end() && xB != setB.end())
    {
        if (comp(*xA, *xB))
        {
            1st customisation point
        }
        else if (comp(*xB, *xA))
        {
            2nd customisation point
        }
        else
        {
            3rd customisation point
        }
    }
    4th customisation point
}

To compare sets and return booleans, the customisation points consist in moving on inside of the sets, or returning something from the function. We wrote the algorithm set_bool_information to express that:

template<typename SetA, typename SetB, typename Compare,
         typename PredicateFirstLessThanSecond,
         typename PredicateSecondLessThanFirst,
         typename PredicateBothEquivalent,
         typename PredicateFinishedTraversal>
bool set_bool_information(SetA&& setA,
                          SetB&& setB,
                          Compare comp,
                          PredicateFirstLessThanSecond,
                          PredicateSecondLessThanFirst,
                          PredicateBothEquivalent,
                          PredicateFinishedTraversal)
{
    auto xA = begin(setA);
    auto xB = begin(setB);
    
    while (xA != end(setA) && xB != end(setB))
    {
        if (comp(*xA, *xB))
        {
            if constexpr (std::is_same_v<PredicateFirstLessThanSecond, MoveOn>)
            {
                ++xA;
            }
            else
            {
                return PredicateFirstLessThanSecond::_();
            }
        }
        else if constexpr (comp(*xB, *xA))
        {
            if (std::is_same_v<PredicateSecondLessThanFirst, MoveOn>)
            {
                ++xB;
            }
            else
            {
                return PredicateSecondLessThanFirst::_();
            }
        }
        else
        {
            if constexpr (std::is_same_v<PredicateBothEquivalent, MoveOn>)
            {
                ++xA;
                ++xB;
            }
            else
            {
                return PredicateBothEquivalent::_();
            }
        }
    }
    return PredicateFinishedTraversal::_(xA, end(setA), xB, end(setB));
}

This function takes the two sets to compare, the comparison function (which is by default operator<), and various values associated to the customisation points.

The customisation values are predicates, returning booleans. For example, this is a customisation value that always returns true:

struct ReturnTrue
{
    static bool _() { return true; }

    template<typename Iterator1, typename End1, typename Iterator2, typename End2>
    static bool _(Iterator1&&, End1&&, Iterator2&&, End2&&)
    {
        return true;
    }
};

We had also seen that we could implement various algorithms by passing in various combinations of customisation points.

For example, share_element, that checks in linear time if two sorted collections (a.k.a sets) have at least one element in common, can be implemented like this:

template <typename Set1, typename Set2, typename Compare = std::less<>>
bool share_element(Set1&& set1, Set2&& set2, Compare comp = std::less<>{})
{
    return set_bool_information(FWD(set1), FWD(set2), comp,
                                MoveOn{},
                                MoveOn{},
                                ReturnTrue{},
                                ReturnFalse{});
}

(FWD(x) is one of the rare useful macros, expanding to std::forward<decltype(x)>(x)).

But the problem in this code is that its is not clear what the 4 customisation values passed correspond to.

It would be nicer to have something looking like this:

template <typename Set1, typename Set2, typename Compare = std::less<>>
bool is_prefix_of_other(Set1&& set1, Set2&& set2, Compare comp = std::less<>{})
{
    // imaginary C++:
    return set_bool_information(FWD(set1), FWD(set2), comp,
                                when the first is less than second: MoveOn,
                                when the second is less than first: MoveOn,
                                when both are equivalent: ReturnTrue,
                                when we finish the traversal: ReturnFalse);
}

With strong templates, we’ll try to get as close as possible to the above desired code.

Strong templates

The idea behind strong templates is to wrap a type as a template parameter of another type in order to give it a specific meaning.

I call them strong templates because they’re the template equivalent of strong types, which are types that wrap other types to give them a specific meaning too.

There are several kinds of strong templates. For our purpose, we can use an alias in a template type:

template<typename T>
struct FirstLessThanSecond
{
    using Predicate = T;
};

template<typename T>
struct SecondLessThanFirst
{
    using Predicate = T;
};

template<typename T>
struct BothEquivalent
{
    using Predicate = T;
};

template<typename T>
struct FinishedTraversal
{
    using Predicate = T;
};

Those types are “carrying” an underlying type–and everything is happening at compile time.

Improving the call site

Let’s rewrite our call site now, by using those strong templates:

template <typename Set1, typename Set2, typename Compare = std::less<>>
bool set_share_element(Set1&& set1, Set2&& set2, Compare comp = std::less<>{})
{

    return set_bool_information(FWD(set1), FWD(set2), comp,
                                FirstLessThanSecond<MoveOn>{},
                                SecondLessThanFirst<MoveOn>{},
                                BothEquivalent<ReturnTrue>{},
                                FinishedTraversal<ReturnFalse>{});
}

This new interface brings two advantages.

The first one is that it’s more expressive for humans: we can read and understand the role of each parameter at call site.

The second one is that it’s more expressive for the compiler too: by stating our intentions, the compiler can stop us when we accidentally don’t respect them.

To illustrate this, consider the case where we swapped the first two parameters by mistake:

template <typename Set1, typename Set2, typename Compare = std::less<>>
bool set_share_element(Set1&& set1, Set2&& set2, Compare comp = std::less<>{})
{

    return set_bool_information(FWD(set1), FWD(set2), comp,
                                SecondLessThanFirst<MoveOn>{}, // these two are in
                                FirstLessThanSecond<MoveOn>{}, // the wrong order
                                BothEquivalent<ReturnTrue>{},
                                FinishedTraversal<ReturnFalse>{});
}

Then the code no longer compiles. Indeed, the function expects a FirstLessThanSecond where it gets a SecondLessThanFirst, and conversely.

Implementing the algorithms

With all this under our belt, we can implement the 8 algorithms we came up with by exploring the various combinations of the customisation points:

  • std::includes
  • share_element
  • is_prefix_of
  • is_one_prefix_of_other
  • equivalent
  • disjoint
  • is_before
  • is_after

std::includes

std::includes is a standard algorithm provided with the STL, but we can implement it with set_bool_information:

template <typename Set1, typename Set2, typename Compare = std::less<>
bool includes(Set1&& set1, Set2&& set2, Compare comp = std::less<>{})
{
    return set_bool_information(FWD(set1), FWD(set2), comp,
                                FirstLessThanSecond<MoveOn>{},
                                SecondLessThanFirst<ReturnFalse>{},
                                BothEquivalent<MoveOn>{},
                                FinishedTraversal<ReturnHasReachedEndOfSecond>{});
}

share_element

Here is the implementation for share_element:

template <typename Set1, typename Set2, typename Compare = std::less<>
bool set_share_element(Set1&& set1, Set2&& set2, Compare comp = std::less<>{})
{
    return set_bool_information(FWD(set1), FWD(set2), comp,
                                FirstLessThanSecond<MoveOn>{},
                                SecondLessThanFirst<MoveOn>{},
                                BothEquivalent<ReturnTrue>{},
                                FinishedTraversal<ReturnFalse>{});
}

is_prefix_of

Here is the implementation of is_prefix_of:

template <typename Set1, typename Set2, typename Compare = std::less<>
bool is_prefix_of(Set1&& set1, Set2&& set2, Compare&& comp = std::less<>{})
{
    return set_bool_information(FWD(set1), FWD(set2), comp,
                                FirstLessThanSecond<ReturnFalse>{},
                                SecondLessThanFirst<ReturnFalse>{},
                                BothEquivalent<MoveOn>{},
                                FinishedTraversal<ReturnHasReachedEndOfFirst>{});
}

is_one_prefix_of_other

Here is the implementation of is_one_prefix_of_other, that checks if either set is a prefix of the other one:

template <typename Set1, typename Set2, typename Compare = std::less<>>
bool is_prefix_of_other(Set1&& set1, Set2&& set2, Compare comp = std::less<>{})
{
    return set_bool_information(FWD(set1), FWD(set2), comp,
                                FirstLessThanSecond<MoveOn>{},
                                SecondLessThanFirst<ReturnFalse>{},
                                BothEquivalent<MoveOn>{},
                                FinishedTraversal<ReturnTrue>{});
}

equivalent

Here is the implementation of equivalent, that checks that the two sets contains equivalent elements:

template <typename Set1, typename Set2, typename Compare = std::less<>>
bool equivalent(Set1&& set1, Set2&& set2, Compare comp = std::less<>{})
{
    return set_bool_information(FWD(set1), FWD(set2), comp,
                                FirstLessThanSecond<ReturnFalse>{},
                                SecondLessThanFirst<ReturnFalse>{},
                                BothEquivalent<MoveOn>{},
                                FinishedTraversal<ReturnHasReachedEndOfBoth>{});
}

disjoint

There are several possible implementations for disjoint, that checks if the two sets have no elements in common. The first one is in the same style as the previous ones:

template <typename Set1, typename Set2, typename Compare = std::less<>>
bool equivalent(Set1&& set1, Set2&& set2, Compare comp = std::less<>{})
{
    return set_bool_information(FWD(set1), FWD(set2), comp,
                                FirstLessThanSecond<MoveOn>{},
                                SecondLessThanFirst<MoveOn>{},
                                BothEquivalent<ReturnFalse>{},
                                FinishedTraversal<ReturnTrue>{});
}

But we can also notice that disjoint is the opposite of share_element. Indeed, two sets are disjoints means that they don’t have any element in common:

template <typename Set1, typename Set2, typename Compare = std::less<>>
bool disjoint(Set1&& set1, Set2&& set2, Compare comp = std::less<>{})
{
    return !set_share_element(std::forward<Set1>(set1), std::forward<Set2>(set2), comp);
}

is_before

is_before checks if all the elements of the first set are smaller than the smallest of the elements of the second set.

To implement this algorithm, we need an extra step: handling the case where the second set is empty, which means that it doesn’t have a smallest element.

In that case, we decide by convention that the empty set is_before any other set, and that no set is_before the empty set:

template <typename Set1, typename Set2, typename Compare = std::less<>
bool is_before(Set1&& set1, Set2&& set2, Compare comp = std::less<>{})
{
    if (begin(set2) == end(set2)) return false;
    
    return set_bool_information(FWD(set1), FWD(set2), comp,
                                FirstLessThanSecond<MoveOn>{},
                                SecondLessThanFirst<ReturnFalse>{},
                                BothEquivalent<ReturnFalse>{},
                                FinishedTraversal<ReturnTrue>{});
}

is_after

is_after checks if all the elements of the second set are smaller than the smallest of the elements of the first set.

A possible implementation is this:

template <typename Set1, typename Set2, typename Compare = std::less<>>
bool is_after(Set1&& set1, Set2&& set2, Compare comp = std::less<>{})
{
    if (begin(set1) == end(set1)) return false;
    
    return set_bool_information(FWD(set1), FWD(set2), comp,
                                FirstLessThanSecond<ReturnFalse>{},
                                SecondLessThanFirst<MoveOn>{},
                                BothEquivalent<ReturnFalse>{},
                                FinishedTraversal<ReturnTrue>{});
}

Note that is_after is not the negation of is_before. Indeed, it is possible that two sets are neither before nor after one another (if they have intertwined elements).

However, we can implement is_after by inverting the elements of is_before:

template <typename Set1, typename Set2, typename Compare = std::less<>>
bool is_after(Set1&& set1, Set2&& set2, Compare comp = std::less<>{})
{
    return is_before(FWD(set2), FWD(set1), comp);
}

A generic algorithm

Over the couple of last posts, we’ve built set_bool_information to provide a generic algorithm comparing sets and returning a boolean. This work allowed us to discover several algorithms and implement them with this generic code.

All this code, along with everything we’ve seen with sets and more is available in the sets repository on Github.

Do you also use algorithms on sets that return booleans? What would you see we can improve in set_bool_information?

You will also like

Don't want to miss out ? Follow:   twitterlinkedinrss
Share this post!Facebooktwitterlinkedin