Jonathan Boccara's blog

std::iterator is deprecated: Why, What It Was, and What to Use Instead

Published May 8, 2018

C++17 has deprecated a few components that had been in C++ since its beginning, and std::iterator is one of them.

If you don’t have C++17 in production, you’re like most people today. But one day or another, you will most likely have it. And when that day comes, you’ll be glad you anticipated the deprecation of such components, and stopped using them well in advance.

Let’s see how std::iterator was used, why it was deprecated, and what to use instead.

Iterator traits

std::iterator was used to specify the traits of an iterator.

What does that mean?

Generic code that uses iterators, such as the STL algorithms which use them intensely, needs information about them. For example, it needs the type of the object that the iterators refer to. To obtain this information, the STL requires the iterators it operates on to define a type called value_type.

To illustrate, consider the algorithm std::reduce. One of its overloads takes two iterators and returns the sum of the objects contained between those two iterators:

std::vector<int> numbers = {1, 2, 3, 4, 5};
    
std::cout << std::reduce(begin(numbers), end(numbers)) << '\n';

This should output 15, which is the sum of the elements inside numbers.

But what if the collection of numbers was empty?

std::vector<int> numbers = {};
    
std::cout << std::reduce(begin(numbers), end(numbers)) << '\n';

What should this code output? The spec of std::reduce says that, when no initial value is passed, it starts from a value-initialized object of the type of the elements (which essentially means, constructed with {}). So in our case the result would be int{}, which is 0.

But how does std::reduce know that the type of the elements of the vector numbers is int? Indeed, it has no connection with the vector, as it only interacts with its iterators coming from the begin and end functions.

This is why iterators must provide a ::value_type, which is, in this case, the type of the elements of the vector. So int.
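
To make this concrete, here is a minimal sketch of an algorithm that recovers the element type from its iterators alone. The name sum_range is hypothetical and this is not how std::reduce is actually implemented. It reads the type through std::iterator_traits, which is covered in a later section of this post.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper, not the real std::reduce: sums a range, starting
// from a value-initialized element so that an empty range yields T{}.
template <typename Iterator>
typename std::iterator_traits<Iterator>::value_type
sum_range(Iterator first, Iterator last)
{
    // The element type is deduced from the iterator, not from the container.
    using T = typename std::iterator_traits<Iterator>::value_type;

    T total{};  // value-initialized: 0 when T is int
    for (; first != last; ++first)
        total = total + *first;
    return total;
}

// Small self-check of the two cases discussed above.
inline void sum_range_demo()
{
    std::vector<int> numbers = {1, 2, 3, 4, 5};
    assert(sum_range(numbers.begin(), numbers.end()) == 15);

    std::vector<int> empty;
    assert(sum_range(empty.begin(), empty.end()) == 0);
}
```

The function never sees the vector, only its iterators, and that is enough for it to know that it must return an int.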

Another example of required information is the capabilities of the iterator: is it just an input iterator, that supports ++ but should not be read twice? Or a forward iterator that can be read several times? Or a bidirectional iterator that can also do --? Or a random access iterator, that can jump around with +=, +, -= and -? Or an output iterator?

This piece of information is useful for some algorithms that would be more or less efficient depending on those capabilities. Such an algorithm typically has several implementations, and chooses one to route to depending on the category of the iterator.

To achieve this routing, the STL requires that iterators provide a type called iterator_category, which must be one of:

  • std::input_iterator_tag,
  • std::output_iterator_tag,
  • std::forward_iterator_tag,
  • std::bidirectional_iterator_tag,
  • std::random_access_iterator_tag.
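
The routing described above is often done with tag dispatching. Here is a minimal sketch, assuming a simplified advance-like function for non-negative steps. The names my_advance and my_advance_impl are made up for the illustration, and real standard library implementations are more elaborate.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Slow path: any input iterator can be moved forward one step at a time.
template <typename Iterator>
void my_advance_impl(Iterator& it, int n, std::input_iterator_tag)
{
    for (; n > 0; --n)
        ++it;
}

// Fast path: a random access iterator can jump in a single operation.
template <typename Iterator>
void my_advance_impl(Iterator& it, int n, std::random_access_iterator_tag)
{
    it += n;
}

// Entry point: an object of the category type selects the overload.
// Forward and bidirectional tags derive from std::input_iterator_tag,
// so they fall back to the slow path.
template <typename Iterator>
void my_advance(Iterator& it, int n)
{
    using Category = typename std::iterator_traits<Iterator>::iterator_category;
    my_advance_impl(it, n, Category{});
}

// Small self-check with one iterator of each kind.
inline void my_advance_demo()
{
    std::vector<int> v = {10, 20, 30, 40};
    auto vit = v.begin();
    my_advance(vit, 3);  // random access: uses +=
    assert(*vit == 40);

    std::list<int> l = {10, 20, 30, 40};
    auto lit = l.begin();
    my_advance(lit, 2);  // bidirectional: uses repeated ++
    assert(*lit == 30);
}
```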

Finally, the types other than value_type and iterator_category that the STL requires of iterators are:

  • difference_type: the type that results from subtracting two such iterators with -,
  • pointer: the type of a pointer to the element that the iterator refers to,
  • reference: the type of a reference to the element that the iterator refers to.

Which makes up 5 types to define.

All the iterators in the standard library comply with this (static) interface. If you need to implement your own iterator, you also need to provide those types.

std::iterator_traits

If you want to access those types on a given iterator, you may think that you can rely on the iterator to provide the 5 types, and simply write Iterator::value_type for example.

This is mostly true, but there is one exception: when the iterator is in fact a pointer. Some STL implementations use a pointer to stand for the iterator of a vector (indeed, pointer arithmetic does a fine job of +=, and other usual iterator manipulations). And it is also the case when iterating over a C-style array.

In such cases, you can’t just do something like int*::value_type, since pointers don’t have nested types!

To cover that case, the convention is not to call ::value_type or ::iterator_category directly, but rather to add a level of indirection. This level of indirection is a template called std::iterator_traits, that exposes the same 5 types.

If the template type Iterator of std::iterator_traits<Iterator> is not a pointer, then the types of std::iterator_traits are just forwarded to those of the Iterator. For example:

std::iterator_traits<Iterator>::value_type

is defined as

Iterator::value_type

But if the template type is a pointer, say T*, then std::iterator_traits<T*>::value_type is hardcoded as T, and std::iterator_traits<T*>::iterator_category is hardcoded as std::random_access_iterator_tag.
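
The two cases can be checked at compile time. The following sketch only uses standard components. The ListIterator alias is introduced for the example.

```cpp
#include <iterator>
#include <list>
#include <type_traits>

// A raw pointer has no nested types, yet std::iterator_traits describes it.
static_assert(std::is_same<std::iterator_traits<int*>::value_type, int>::value,
              "the value_type of int* is int");
static_assert(std::is_same<std::iterator_traits<int*>::iterator_category,
                           std::random_access_iterator_tag>::value,
              "pointers are random access iterators");

// For a class-type iterator, the traits forward to the nested types.
using ListIterator = std::list<int>::iterator;
static_assert(std::is_same<std::iterator_traits<ListIterator>::value_type,
                           ListIterator::value_type>::value,
              "value_type is forwarded from the iterator");
static_assert(std::is_same<std::iterator_traits<ListIterator>::iterator_category,
                           std::bidirectional_iterator_tag>::value,
              "list iterators are bidirectional");
```

Generic code that goes through std::iterator_traits therefore works the same way with pointers and with class-type iterators.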

std::iterator

std::iterator is a helper to define the iterator traits of an iterator.

std::iterator is a template, that takes 5 template parameters:

template< 
    typename Category,
    typename T,
    typename Distance = std::ptrdiff_t,
    typename Pointer = T*,
    typename Reference = T& 
> struct iterator;

Those 5 names sound familiar, right? Those template types correspond to the 5 types required by the STL on iterators.

The job of std::iterator is to expose those types. Here is one possible implementation of std::iterator:

template< 
    typename Category,
    typename T,
    typename Distance = std::ptrdiff_t,
    typename Pointer = T*,
    typename Reference = T& 
> struct iterator
{
    using iterator_category = Category;
    using value_type = T;
    using difference_type = Distance;
    using pointer = Pointer;
    using reference = Reference;
};

std::iterator allows an iterator to define these 5 types, by inheriting from std::iterator and passing it those types (at least the first 2 since the other 3 have default values):

class MyIterator : public std::iterator<std::random_access_iterator_tag, int>
{
    // ...

By inheriting from std::iterator, MyIterator also exposes the 5 types.

Why deprecate std::iterator?

This all seems very useful, so why deprecate this functionality?

The important thing to note is that the deprecation only concerns std::iterator. So it does not concern the types that the STL expects from an iterator, and neither does it concern the idea that an iterator should provide information to the code that uses it.

What is deprecated is the technique of inheriting from std::iterator to define those types. That’s it. The rest stays, including std::iterator_traits for example.

Now, what’s wrong with std::iterator?

At least one thing that is wrong with it is that the iterator that inherits from it provides the 5 types without being explicit about which one is which. For instance:

class MyIterator : public std::iterator<std::forward_iterator_tag, int, int, int*, int&>
{
    // ...

This code doesn’t say to which type of the interface (value_type, reference…) each of the types passed corresponds.

A more explicit way to go about it is to write the using declarations (or typedefs if you’re before C++11) directly inside of the iterator:

class MyIterator
{
public:
    using iterator_category = std::forward_iterator_tag;
    using value_type = int;
    using difference_type = int;
    using pointer = int*;
    using reference = int&;

    // ...

And this is how we’re expected to define the types exposed by our iterators now.
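
For a fuller picture, here is a sketch of a complete forward iterator written in this style. Node and NodeIterator are hypothetical names for a hand-rolled singly linked list, and the sketch uses std::ptrdiff_t for difference_type, which was the default in std::iterator.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>

// Hypothetical singly linked node, used only for this illustration.
struct Node
{
    int value;
    Node* next;
};

class NodeIterator
{
public:
    // The 5 types, spelled out in the iterator itself.
    using iterator_category = std::forward_iterator_tag;
    using value_type = int;
    using difference_type = std::ptrdiff_t;
    using pointer = int*;
    using reference = int&;

    NodeIterator() = default;  // the end iterator: points to no node
    explicit NodeIterator(Node* node) : node_(node) {}

    reference operator*() const { return node_->value; }
    pointer operator->() const { return &node_->value; }

    NodeIterator& operator++() { node_ = node_->next; return *this; }
    NodeIterator operator++(int) { NodeIterator old = *this; ++*this; return old; }

    friend bool operator==(const NodeIterator& a, const NodeIterator& b) { return a.node_ == b.node_; }
    friend bool operator!=(const NodeIterator& a, const NodeIterator& b) { return !(a == b); }

private:
    Node* node_ = nullptr;
};

// Small self-check: STL algorithms accept the iterator.
inline void node_iterator_demo()
{
    Node third{3, nullptr};
    Node second{2, &third};
    Node first{1, &second};

    NodeIterator begin(&first);
    NodeIterator end;

    assert(std::accumulate(begin, end, 0) == 6);
    assert(std::distance(begin, end) == 3);
}
```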

EDIT: to quote the P0174 paper that advocated for the deprecation of std::iterator, the lack of clarity is even more visible when defining an output iterator:

class MyOutputIterator : public std::iterator<std::output_iterator_tag, void, void, void, void>
{
    // ...

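And even though the reason of clarity was enough to convince the committee to deprecate std::iterator, there was also another drawback to it: when the iterator is itself a class template and its std::iterator base depends on a template parameter, you can’t access the aliases inside of the base class directly. For example you can’t reach value_type this way:

template<typename T>
class MyIterator : public std::iterator<std::forward_iterator_tag, T>
{
    value_type data; // error: value_type is not looked up in the dependent base class

    // ...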
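
This lookup problem shows up when the iterator is a class template whose base depends on its template parameter. Here is a minimal sketch of the situation and of two ways around it. TypesBase is a hypothetical stand-in for std::iterator, reduced to a single alias so that the example compiles without deprecation warnings.

```cpp
#include <type_traits>

// Hypothetical stand-in for std::iterator, reduced to one alias.
template <typename T>
struct TypesBase
{
    using value_type = T;
};

template <typename T>
struct TemplatedIterator : TypesBase<T>
{
    // value_type broken;  // rejected by a conforming compiler: unqualified
    //                     // names are not looked up in a base that depends on T

    // Reaching into the base works, but needs qualification and typename.
    typename TypesBase<T>::value_type via_base{};
};

template <typename T>
struct ExplicitIterator
{
    // Declaring the alias in the class itself keeps the plain name usable.
    using value_type = T;
    value_type direct{};
};

static_assert(std::is_same<decltype(TemplatedIterator<int>::via_base), int>::value,
              "the qualified name resolves to T");
static_assert(std::is_same<decltype(ExplicitIterator<int>::direct), int>::value,
              "the in-class alias resolves to T");
```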

What’s more, the fact that some of the STL iterators are depicted as inheriting from std::iterator was seen in LWG2438 as potentially confusing for users because they could “be misled into thinking that their own iterators must derive from std::iterator or that overloading functions to take std::iterator is somehow meaningful”.

End of EDIT, thanks to Reddit user /r/tcanens for pointing this out.

The issues with the new method

No more default parameters

You may have noticed that std::iterator had default template parameters:

template< 
    typename Category,
    typename T,
    typename Distance = std::ptrdiff_t,
    typename Pointer = T*,
    typename Reference = T& 
> struct iterator;

Which meant that, if there wasn’t a specificity on the last 3 types that forced you to define them, you could get away with defining just the first two:

class MyIterator : public std::iterator<std::forward_iterator_tag, int>
{
    // ...

Now, to my knowledge, this is no longer possible: you have to write the 5 type definitions in full inside your iterator.

The case of output iterators

Output iterators, such as std::back_inserter (or, to be more accurate, the iterator generated by that function), also have to expose certain types. In particular their iterator_category is std::output_iterator_tag, and the other types are void.

My understanding as to why the last 4 types must be void is that they are not used anyway. With std::iterator, we used to define output iterators this way:

class MyOutputIterator : public std::iterator<std::output_iterator_tag, void, void, void, void>
{
    // ...

We used to fill out the types in std::iterator with void, just to put something there.

When I learned about the deprecation of std::iterator and the new way of providing the types, I first thought that it would be more convenient for defining output iterators.

Indeed, the only type that matters is the iterator category, and I thought we could just forget about specifying the other types:

class MyOutputIterator
{
public:
    using iterator_category = std::output_iterator_tag;
    // that's it, no more aliases

    // rest of the iterator class...

And then I realized that this was completely wrong. Indeed, some platforms won’t accept your code if you don’t define the 5 types. So you still have to go and define the 4 aliases to void:

class MyOutputIterator
{
public:
    using iterator_category = std::output_iterator_tag;
    using value_type = void; // crap
    using difference_type = void;
    using pointer = void;
    using reference = void;

    // rest of the iterator class...
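
To show what the rest of such a class can look like, here is a sketch of a complete output iterator with the 5 aliases. DoublingInserter is a made-up example that doubles each value before appending it to a vector. It follows the usual output iterator pattern where the assignment does the work.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical output iterator: doubles each value, then appends it.
class DoublingInserter
{
public:
    using iterator_category = std::output_iterator_tag;
    using value_type = void;
    using difference_type = void;
    using pointer = void;
    using reference = void;

    explicit DoublingInserter(std::vector<int>& target) : target_(&target) {}

    // Assignment is where the output actually happens.
    DoublingInserter& operator=(int value)
    {
        target_->push_back(2 * value);
        return *this;
    }

    // Dereference and increment do nothing and return the iterator itself.
    DoublingInserter& operator*() { return *this; }
    DoublingInserter& operator++() { return *this; }
    DoublingInserter operator++(int) { return *this; }

private:
    std::vector<int>* target_;
};

// Small self-check: std::copy writes through the iterator.
inline void doubling_inserter_demo()
{
    std::vector<int> input = {1, 2, 3};
    std::vector<int> output;

    std::copy(input.begin(), input.end(), DoublingInserter(output));

    assert((output == std::vector<int>{2, 4, 6}));
}
```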

If you’re interested, we now get into more details about why some platforms will let you get away with only the iterator_category and some won’t.

And if you don’t feel like getting into such details right now, you can hop onto the conclusion. But the bottom line is that, if you want your iterator code to be portable, you need to define the 5 types.

So, how come some platforms force you to write the 5 types even if you don’t use them all?

On libstdc++, used by gcc

If you peek into libstdc++, used by gcc, you’ll see that std::iterator_traits is implemented as:

template<typename _Iterator>
struct iterator_traits
{
    typedef typename _Iterator::iterator_category iterator_category;
    typedef typename _Iterator::value_type        value_type;
    typedef typename _Iterator::difference_type   difference_type;
    typedef typename _Iterator::pointer           pointer;
    typedef typename _Iterator::reference         reference;
};

This implies that, as soon as you try to access one member, such as ::iterator_category for example, the whole struct and all its typedefs are instantiated. If one of them doesn’t exist, this leads to a compilation error.

On libc++, used by clang

And if you go look into libc++, used by clang, you’ll observe that std::iterator_traits has a different implementation:

template <class _Iter>
struct _LIBCPP_TEMPLATE_VIS iterator_traits
    : __iterator_traits<_Iter, __has_iterator_category<_Iter>::value> {};

The typedefs are not directly inside iterator_traits. Instead, they are in its base class. And this makes all the difference: if you try to use one of those typedefs in your code (say, ::iterator_category for instance), your code will compile even if another one (say, ::value_type) is missing.

To be honest, I don’t know which language rule explains that difference. If you know, now is a good time to share your knowledge in the comments section.

In any case, the bottom line is that one of the major platforms won’t let you get away with it, so do specify all 5 types to stay away from such portability issues.
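
One way to guard against forgetting an alias is a compile-time check. Here is a sketch, assuming C++17 for std::void_t. The trait declares_all_five_types and the two test classes are hypothetical names made up for this example.

```cpp
#include <iterator>
#include <type_traits>

// Primary template: assume that at least one alias is missing.
template <typename Iterator, typename = void>
struct declares_all_five_types : std::false_type {};

// Selected only when all five nested names are valid types.
template <typename Iterator>
struct declares_all_five_types<Iterator, std::void_t<
    typename Iterator::iterator_category,
    typename Iterator::value_type,
    typename Iterator::difference_type,
    typename Iterator::pointer,
    typename Iterator::reference>> : std::true_type {};

// Declares the 5 aliases, as recommended above.
struct CompleteOutputIterator
{
    using iterator_category = std::output_iterator_tag;
    using value_type = void;
    using difference_type = void;
    using pointer = void;
    using reference = void;
};

// Declares only the category: may compile on some platforms, not on others.
struct IncompleteOutputIterator
{
    using iterator_category = std::output_iterator_tag;
};

static_assert(declares_all_five_types<CompleteOutputIterator>::value,
              "all 5 aliases are present");
static_assert(!declares_all_five_types<IncompleteOutputIterator>::value,
              "4 aliases are missing");
```

Putting such a static_assert next to a custom iterator makes a missing alias fail on every platform, instead of only on some of them.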

Conclusion

std::iterator is deprecated, so we should stop using it. Indeed, the next step after deprecation could be total removal from the standard, just like what happened to std::auto_ptr.

But contrary to std::auto_ptr, the alternative to std::iterator is trivial to achieve, even in C++03: just implement the 5 aliases inside of your custom iterators. And even if your code doesn’t use the 5 of them, do define them to make sure your code stays portable.

Now, you may wonder, does it really happen that we create iterators? To answer that question, I invite you to have a look at Smart Output Iterators!
