Published January 16, 2017 - 4 Comments

This series of posts aims at covering all there is to know in the STL (and even sligthly beyond) about searching.

Even though the need for searching for something in a collection is quite a simple concept to comprehend, there are *many* things to say to cover the topic thoroughly. Even if we’ll remain focused on how to practically accomplish what you need in C++, and won’t dive into pure algorithmics too much.

For this reason, we’ll break this topic into 3 posts:

**How to (std::)find something efficiently with the STL**: covering classical STL algorithms for performing searches on ranges of elements,

**Searching in an STL container**: how to perform efficient and correct searches when you directly have access to an STL container, as opposed to a simple range,

**The searching <algorithm>s the STL holds secret**: exploring algorithms that were unknown to the vast majority of developers I’ve presented this to, but that were deemed useful by those who did learn them.

This post shows how to search in a range. We’ll stick with the standard version of the STL and consider a range as represented by 2 iterators. However, all the following also applies to range libraries.

As we will see in more details in a dedicated post (scheduled Feb 07), the STL can be viewed as split into 2 parts: the part that operates on SORTED elements, and the one that operates on elements that are NOT SORTED.

This difference has 2 consequences for searching:

- A look up in a SORTED collection is very fast, typically in
**logarithmic time**, while a look up in a NOT SORTED collection is typically in**linear time**.

- All methods shown on SORTED ranges compare values according to
**equivalence**(comparing with`<`

), and those on NOT SORTED ranges compare values according to**equality**(comparing with).`==`

This post will show how to express the 3 following questions in C++, for a given value searched a range:

- Is it there?
- Where is it?
- Where should it be (for a sorted range)?

This question can be expressed with **std::find**, combined with a comparison with the end of the range:

1 2 3 4 |
vector<int> v = ... // v filled with values if (std::find(v.begin(), v.end(), 42) != v.end()) { ... |

Note that the question “Is it there?” can also be expressed by **std::count**:

1 2 3 4 |
vector<int> v = ... // v filled with values if (std::count(v.begin(), v.end(), 42)) { ... |

The returned value is implicitly converted to a bool in the if statement: here it evaluates to true if there is at least one element equal to 42 in the range.

The `std::count`

method has advantages and drawbacks compared to `std::find`

:

Advantages of `std::count`

:

`std::count`

avoids the comparison with the end operator.

Drawbacks of `std::count`

:

`std::count`

traverses the whole collection, while`std::find`

stops at the first element equal to the searched value,`std::find`

arguably better expresses that you are looking for something.

For these reasons, `std::find`

is more generally used for this need.

__Note__

To check for the presence of an element satisfying a predicate instead of being equal to a value, use ** std::count_if**,

`std::find_if`

`std::find_if_not`

`std::count`

and `std::find`

throughout this post.The algorithm to use is ** std::binary_search**, that directly returns a bool representing whether the searched value has equivalent elements in the collection.

1 2 |
std::set<int> numbers = // sorted elements bool is42InThere = std::binary_search(numbers.begin(), numbers.end(), 42); |

More precisely, we want to obtain iterators pointing to the occurences of the searched elements.

Use ** std::find**. It will return the iterator pointing to the first element equal to the searched value, or the end of the collection if the value has not been found.

1 2 3 4 5 6 |
std::vector<int> numbers = ... auto searchResult = std::find(numbers.begin(), numbers.end(), 42); if (searchResult != numbers.end()) { ... |

__Note on std::find for SORTED elements:
__The STL has no algorithm as straightforward as

`std::find`

for sorted collections. But `std::find`

is not really made for sorted collections because it uses equality and not equivalence, and it operates in linear time and not logarithmic time.Now for a given collection, if you are sure that for the type of your elements equality is the same as equivalence, now and in the future, and that you are prepared to pay the linear time,

`std::find`

will get you the correct result, and you’ll benefit from its straightforward interface. But in the general case, keep in mind that it is not designed for operating on a sorted range.The algorithm to use here is rather ** std::equal_range** (you thought it was

`std::lower_bound`

? Read on to the next section to see why it isn’t). Here is its prototype: 1 2 |
template< class ForwardIt, class T > std::pair<ForwardIt,ForwardIt> equal_range( ForwardIt first, ForwardIt last, const T& value ); |

`std::equal_range`

returns the range of elements equivalent to the searched value. The range represented by an **std::pair** of iterators pointing inside the collection. The 2 iterators of the pair represent the first and the past-the-end elements of the subrange of elements in the range that are equivalent to the searched value.

However its interface is somewhat clumsy to use:

1 2 3 4 5 6 |
std::vector<int> v = {3, 7, 3, 11, 3, 3, 2}; sort(v.begin(), v.end()); // equal_range, attempt 1: natively clumsy std::pair<std::vector<int>::iterator, std::vector<int>::iterator> range1 = equal_range(v.begin(), v.end(), 3); std::for_each(range1.first, range1.second, doSomething); |

A typedef or using is typically used to make it lighter:

1 2 3 4 5 6 7 8 |
std::vector<int> v = {3, 7, 3, 11, 3, 3, 2}; sort(v.begin(), v.end()); using IteratorPair = std::pair<std::vector<int>::iterator, std::vector<int>::iterator>; // equal_range, attempt 2: with the classical typedef IteratorPair range2 = equal_range(v.begin(), v.end(), 3); std::for_each(range2.first, range2.second, doSomething); |

Attempt 2 is indeed less of a mouthful, but there is still a fundamental problem left: levels of abstractions are not respected, which is contrary to the this important principle seen in a dedicated post. Indeed the pair forces us to write code in term of “first” and “second” when manipulating something returned by equal_range, whereas it should be a range. And a range should be expressed in terms of “begin” and “end”. On top on making code less natural, this becomes a real problem when you want to use this range in generic code.

To fix this, we can use a class to wrap the pair of iterators returned by `std::equal_range`

into an object that has the semantics of a range:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
template<typename Container> class Range { public: Range(std::pair<typename Container::iterator, typename Container::iterator> range) : m_begin(range.first), m_end(range.second) {} typename Container::iterator begin() { return m_begin; } typename Container::iterator end() { return m_end; } private: typename Container::iterator m_begin; typename Container::iterator m_end; }; |

This sort of class exists in ranges libraries such as Boost.Ranges or range-v3. If you go see their implementation code (here for boost and here for range-v3) you’ll see they contain way more than the simple wrapper above, that is here just to get the point across rather than be used in production code).

This effectively *lifts* a pair of iterators to the level of abstraction of a range.

Note that without the wrapper, `std::begin`

and `std::end`

cannot be used on the result of `std::equal_range`

, even if it is … a range! The wrapper fixes this problem.

It can be used the following way:

1 2 3 4 5 6 |
std::vector<int> v = {3, 7, 3, 11, 3, 3, 2}; sort(v.begin(), v.end()); // equal_range, attempt 3: natural al last Range<std::vector<int>> range3 = equal_range(v.begin(), v.end(), 3); std::for_each(range3.begin(), range3.end(), doSomething); |

Whichever of the above methods you use, equal_range returns a range, so you can check its emptiness by comparing if the 2 iterators, and check its size with **std::distance**:

1 2 |
bool noElementFound = range3.begin() == range3.end(); size_t numberOfElementFound = std::distance(range3.begin(), range3.end()) |

This question only makes sense for a sorted range, because for a not sorted range the element could be … anywhere in the range.

For a sorted range, the question is more precisely: “If it is there then where is it, and if it is not then where should it be ?”

The question can be expressed with 2 algorithms: `std::lower_bound`

**std::upper_bound**.

It is easy to understand them once you understand `std::equal_range`

: `std::lower_bound`

and `std::upper_bound`

return respectively the first and the second iterator that would have been returned by std::equal_range.

So to insert a value in the range so that is **before** the elements equivalent to this value, use **std::lower_bound** to get an iterator designating the location to insert to.

And to insert a value in the range so that is **after** the elements equivalent to this value, use **std::upper_bound** to get an iterator designating the location to insert to.

Note that you generally don’t want to use `std::lower_boud`

to simply search for an element:

Contrary to `std::find`

, you can’t simply check if the iterator returned by `std::lower_bound`

is different from the end to know whether the element is in the collection. Indeed, if the element is not present, std::lower_bound returns the location where it *should* have been, not the end of the collection.

So you need to check that the returned iterator is not the end of the range AND to check that it points to an element whose value is *equivalent* to the one you search.

Careful: *equivalent*, not equal (if you don’t know yet the difference don’t worry: we’ll see it in details in a dedicated post). But if (now or in the future) this does not mean the same thing for your type, you need to write an equivalence test, typically in the form of !(a < b) && !(b < a).

And if the sorting comparator is not `operator<`

but a custom one, you need to use the custom one. And update your code if the comparator happens to change. Clumsy. Just use `std::equal_range`

instead.

Here is a table that summarizes which algorithm to use when searching for something in a range:

Question to express in C++ | NOT SORTED | SORTED |

Is it there? | std::find != end | std::binary_search |

Where is it ? | std::find | std::equal_range |

Where should it be ? | – | std::lower_bound std::upper_bound |

In the next post in this series we will see know how to search directly in a standard container, and not on a range.

Related articles:

- Searching when you have access to an STL container
- The searching <algorithm>s the STL holds secret
- Ranges: the STL to the Next Level
- The importance of knowing STL <algorithm>s
- Respect levels of abstraction