Published January 9, 2017 - 4 Comments

This post is part of the STL learning resource. To get the bigger picture of the STL topics that I intend to cover on Fluent C++, you can go have a look at the dedicated page at fluentcpp.com/STL.

The purpose of this series of posts is to give you an opportunity to accumulate — no pun intended! — knowledge on the STL one bit at a time, this time focusing on **algorithms on sets**.

Here the word “set” is taken in the general sense of a collection of elements, and not only std::set, provided that the range is **sorted**. Indeed all the algorithms mentioned in this post require their input ranges to be sorted. Similarly their output ranges — when they produce one — are sorted too.

The STL features 4 complementary algorithms that can take various parts of 2 given sets. They have a common form of prototype taking two ranges in input and putting their results in one range in output:

1 2 3 4 |
template<typename InputIterator1, typename InputIterator2, typename OutputIterator> OutputIterator algo(InputIterator1 first1, InputIterator1 last1, InputIterator2 first2, InputIterator2 last2, OutputIterator result); |

So for two sorted collections A and B, the invocation of one such algorithm would typically look like:

1 |
algo(A.begin(), A.end(), B.begin(), B.end(), result); |

`result`

can typically be an `std::back_inserter`

over a vector as seen in this post, or any other output iterator.

For the examples, let’s consider 2 sets A and B.

`std::set_difference`

`std::set_difference`

copies into result all the elements that are in A, but not in B.

Here is what code using `std::set_difference`

looks like:

1 2 3 4 5 6 7 8 9 10 11 12 13 |
#include <algorithm> #include <iterator> #include <set> #include <vector> std::vector<int> A = ... // sorted vector std::set<int> B = ... // std::set is always sorted std::vector<int> results; std::set_difference(A.begin(), A.end(), B.begin(), B.end(), std::back_inserter(results)); |

`std::set_intersection`

`std::set_intersection`

copies into result all the elements from A that are also in B.

`std::set_union`

`std::set_union`

copies into result all the elements that are in A, in B, or in both. For those that are in both, the A versions will be taken (unless there are more occurrences of a common element in B than in A, in which case its additional versions in B are also taken).

`std::set_symmetric_difference`

Behind its funny name, `set_symmetric_difference`

simply copies into result the elements that are in A but not in B, and those that are in B but not in A.

`set_symmetric_difference`

is a particularly good example of algorithm that sounds like complicated when you come across it in a reference site listing all algorithms. But you can see that it is in reality very simple to understand and can be useful in day-to-day coding. This happens fairly often with STL algorithms.

We will see more algorithms comparing two collections in the post dedicated to predicates on collections, but here I want to specifically mention ** std::includes**, because it operates on sets, that are collections of elements in sorted order as explained earlier.

Given 2 sorted collections A and B, `std::includes`

checks whether all the elements of B are also in A.

Its prototype is:

1 2 3 |
template<typename InputIterator1, typename InputIterator2> bool std::includes(InputIterator1 first1, InputIterator1 last1, InputIterator2 first2, InputIterator2 last2 ); |

and it is typically used the following way:

1 |
bool AincludesB = std::includes(A.begin(), A.end(), B.begin(), B.end()); |

`std::merge`

`std::merge`

is used to merge two sorted collections into one sorted collection. Its prototype is:

1 2 3 4 |
template<typename InputIterator1, typename InputIterator2, typename OutputIterator> OutputIterator merge(InputIterator1 first1, InputIterator1 last1, InputIterator2 first2, InputIterator2 last2, OutputIterator result); |

and given 2 sorted collections A and B, merging A and B into a sorted range starting at `result`

is typically done the following way:

1 |
std::merge(A.begin(), A.end(), B.begin(), B.end(), result); |

`std::inplace_merge`

Let’s state it clearly: while all the algorithms of this posts are frequently useful in code, `std::inplace_merge`

is very seldom used. I want to describe it for the sake of comprehensiveness since one of the objectives of the STL learning resource is to cover ALL algorithms, but if you’re not curious about the algorithms and merely interested in practical consequences on your code, you can just skip over to the next section.

Ok, so if you’re still here let’s dig into `std::inplace_merge`

. This algorithm takes one collection and does a merge *directly inside it.* As a comparison, `std::merge`

took two collections and output its results in a third one. `std::inplace_merge`

regards the collection it operates on as two consecutive parts, and merges the first part with the second one.

More precisely, its prototype is

1 2 |
template<typename BidirectionalIterator> void inplace_merge(BidirectionalIterator first, BidirectionalIterator middle, BidirectionalIterator last ); |

where:

`first`

is the beginning of the collection, which is also the beginning of the first part,`middle`

is the beginning of the second part of the collection,`last`

is the end of the collection, which is also the end of its second part.

People I show this to often ask the following question: how does `std::inplace_merge`

differ from `std::sort`

? And the answer lies in the difference in the pre-condition: in `std::inplace_merge`

, the first part and the second part are already themselves sorted, because `std::inplace_merge`

is an algorithm on sets.(there is also a rather technical difference, which is that `std::sort`

requires random access iterators while `std::inplace_merge`

only requires bidirectional iterators).

Where can `std::inplace_merge`

be useful ? Typically in the implementation of a merge sort algorithm.

All the algorithms seen in this post require that their input and output ranges are sorted. This is important to remember for two reasons:

- if you pass input ranges that are not sorted to any of these algorithms,
**results will be wrong**. Indeed these algorithms take assumptions based on the fact that input ranges are sorted. If this is not true then these assumptions become false.

- these assumptions let the algorithms perform their job more quickly: typically in a O(n) complexity instead of a O(N*logN) that would have been incurred on unsorted ranges.

We saw all the algorithms the STL offers to operate on sets, that are collections of sorted elements, in the general sense.

How do all these algorithms **compare** the elements they manipulate, in order to check what to do with them ? It is crucial to understand this when using these algorithms, and it will be the subject of a dedicated post (scheduled for Jan 31).

Now I want to ask you: what did you think of this post? Was it useful to you? Please, share your feedback and let me know if this kind of presentation of STL algorithms is helpful for you. This will help me shape future posts in order to bring you the most value that I can.

Related articles

Share this post! Don't want to miss out ?