Jonathan Boccara's blog

An Alternative Design to Iterators and Ranges, Using std::optional

Published April 16, 2019 - 0 Comments

Today’s guest post is written by Vincent Zalzal. Vincent is a software developer working in the computer vision industry for the last 13 years. He appreciates all the levels of complexity involved in software development, from how to optimize memory cache accesses to devising algorithms and heuristics to solve complex applications, all the way to developing stable and user-friendly frameworks. You can find him online on Twitter or LinkedIn.

In a previous post, Jonathan presented what he calls the Terrible Problem Of Incrementing A Smart Iterator, or the TPOIASI. The problem occurs when an iterator that embeds logic in its operator++ is composed with another iterator that performs some computation in its operator*. The TPOIASI is prevalent in code using the new C++ Ranges or ranges-v3.

I was intrigued by the problem, and decided to try solving it. While Jonathan decided to move the logic to smart output iterators to solve the problem, I decided to change the definition of range altogether.

Motivating example

Here is an example of the problem, using ranges-v3:

And here is the output:

You can refer to Jonathan’s article for a detailed explanation of what’s going on. In summary, filter has to call both operator++ and operator* of the underlying iterator in its own operator++ to know when to stop, causing transform to apply its function (its operator*) twice per valid element: once in filter‘s operator++ and once in filter‘s operator*.

Suppose you are iterating over the filtered range like this:

The transform function is first called while performing ++it to stop when the predicate is true, then it is called again right on the next line, in *it. Wouldn’t it be nice if we could reuse the function evaluation in ++it instead of calling *it?

Insight

Is it really necessary to have separate operations for advancing the iterator and evaluating its element?

If those two operations were to be merged into a single one, the spurious calls to the transform function would be avoided. Jonathan’s solution using smart output iterators is actually doing all the work in the output iterator’s operator=.

What if we could reinvent ranges from scratch without the need for low-level iterators? Could we leverage modern C++ features to iterate an input range with a single operation instead of two?

A Solution using std::optional

A solution is to represent an input range as a mutable view, i.e. a mutable structure that contains both the current position and the sentinel (the value returned by std::end). This way, we could define a single operation, let’s call it next, that would return either the next element, or std::nullopt if the end of the range is reached.

I made the following design decisions to simplify the implementation:

The downside of such a range is its size: it is twice the size of an STL iterator. This is only important if you are storing iterators in memory, though, which, in my opinion, is often not the best design anyway.

The filtered range is as easy to define as for standard ranges, maybe even easier, and it solves the problem presented in the motivating example.

Because next is performing both iteration and element evaluation, each element is evaluated exactly once.

The transformed range is even easier to define:

With appropriate deduction guides, these structs are enough to implement the motivating example.

No TPOIASI, success!

Pipe syntax

I was happy with the result, but unsatisfied with the syntax. Under the encouragement of Jonathan, I implemented a basic mechanism to achieve a pipe syntax similar to the one in ranges-v3.

We would like to be able to replace this:

by this:

To achieve this, we must overload operator| to take any range as left-hand side operand, and an object returned by transform as right-hand side operand, that object temporarily holding the function to apply. Here is what it looks like, including the deduction guide for TransformedRange:

By doing the same thing for the filter function and adding a factory function to create the input range, we get this much nicer-looking code:

Here is the full code listing. You can see it in action on Coliru.

STL algorithms and range-based for loop

You might wonder why I am not using std::copy to push back elements into the output vector, or why I create a temporary variable to hold the range. This is because InputRange, FilteredRange and TransformedRange do not play nicely with existing C++ features and libraries. The range-based for statement:

is currently equivalent to:

Let’s imagine an alternate universe where the range-based for loop would instead be based on next:

In this C++ fantasy land, STL algorithms would also have overloads taking such a range as first argument. Then, we would finally get this coveted version of the motivating example:

Performance

You wouldn’t be a real C++ programmer if you weren’t caring about performance, would you? You will be happy to know that most recent compilers see through all the abstraction layers of proxy objects and std::optionals. gcc-trunk in particular generates almost the exact same code as a handwritten loop doing all computations inline, as can be seen on Compiler Explorer. Very impressive!

Note that, at the time of writing, gcc-trunk seems to be the only version of x86-64 gcc on Compiler Explorer to generate that code, so your mileage may vary.

Conclusion

In the book From Mathematics to Generic Programming, Alexander Stepanov and Daniel Rose describe the Law of Useful Return:

If you’ve already done the work to get some useful result, don’t throw it away. Return it to the caller. This may allow the caller to get some extra work done “for free”.

For example, since C++11, std::rotate returns an iterator to the new position of the previously-last iterator. Maybe it won’t be used, but it was already computed anyway.

In this article, I applied this programming principle to operator++ for filter iterators. When incrementing the iterator, its current value must be evaluated to determine if it satisfies the predicate or not. That evaluated value should be returned instead of being discarded.

By combining both operator++ and operator* into a single function, it is possible to both increment the iterator and return the evaluated value, thus avoiding the Terrible Problem Of Incrementing A Smart Iterator: evaluating the value twice. Moreover, I think any programmer that once implemented an iterator class will agree that it is not a trivial task, and implementing FilteredRange and TransformedRange above required quite less boilerplate code.

Thinking out of the box when solving toy problems may sometimes lead to interesting insights. I hope you had as much fun reading this article as I had fun writing it. Thanks to Tim van Deurzen for providing constructive feedback, and thanks to Jonathan for giving me the opportunity again of writing a guest post on his blog. Happy coding!

You will also like

Become a Patron!
Share this post! Facebooktwittergoogle_pluslinkedin    Don't want to miss out ? Follow:   twitterlinkedinrss

Get a free ebook on C++ smart pointers

Get a free ebook of more than 50 pages that will teach you the basic, medium and advanced aspects of C++ smart pointers, by subscribing to our mailing list! On top of that, you will also receive regular updates to make your code more expressive.

No spam. You can unsubscribe at any time.