Jonathan Boccara's blog

A Pipe Operator for the Pipes Library?

Published October 22, 2019

So far, the components of the pipes library could be assembled with operator>>=:

myVector >>= pipes::transform(f)
         >>= pipes::filter(p)
         >>= pipes::demux(pipes::transform(g) >>= pipes::push_back(output1),
                          pipes::filter(q) >>= pipes::push_back(output2));

Until recently, I thought that using operator| was impossible. But thanks to a suggestion from Fluent C++ reader Daniel and to a refactoring of the library to decouple operators from classes, this is now technically possible.

It means that the following code can be implemented:

myVector | pipes::transform(f)
         | pipes::filter(p)
         | pipes::demux(pipes::transform(g) | pipes::push_back(output1),
                        pipes::filter(q) | pipes::push_back(output2));

The most important question we’ll go over in this article is: is this a good idea?

And I would like your opinion on that question. Do you prefer operator| over operator>>=? Please leave a comment.

The code for operator| is currently in a branch and not in master yet. Depending on the feedback I get on using operator|, I will merge it or not.

In this article, we will proceed in three steps: first we’ll see why operator>>= is easier to implement than operator|. This can sound surprising at first because, after all, they’re both overloadable operators in C++, right?

Then we’ll see how to implement operator|. It turns out it’s not that difficult after all.

Finally, we’ll discuss the pros and cons of each solution. Feel free to jump to that section if you’re not in the mood right now for a technical dive into the library’s code. Otherwise, let’s get to it!

Why operator>>= was easier to implement

Do you know the difference between operator| and operator>>=?

operator| is left-associative and operator>>= is right-associative.

Parsing with operator|

This means that the following expression:

input | pipes::transform(f) | pipes::push_back(output);

is parsed this way: first the components on the left are considered:

input | pipes::transform(f)

Let’s call A the result of this call to operator|.

The next step in parsing the expression is then:

A | pipes::push_back(output);

Parsing with operator>>=

Now let’s consider the equivalent expression with operator>>=:

input >>= pipes::transform(f) >>= pipes::push_back(output);

The first expression considered is the one on the right:

pipes::transform(f) >>= pipes::push_back(output);

Let’s call B the result of this call to operator>>=. The next step of parsing the expression is then:

input >>= B;
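To make the two groupings visible, here is a small sketch. Expr is a hypothetical tracing type (it has nothing to do with the pipes library) whose operators simply record, with parentheses, which operands they were given.

```cpp
#include <cassert>
#include <string>

// Hypothetical tracing type: each operator records how its operands were grouped.
struct Expr
{
    std::string text;
};

// operator| is left-associative: a | b | c groups as (a | b) | c.
Expr operator|(Expr const& left, Expr const& right)
{
    return Expr{"(" + left.text + " | " + right.text + ")"};
}

// operator>>= is right-associative: a >>= b >>= c groups as a >>= (b >>= c).
Expr operator>>=(Expr const& left, Expr const& right)
{
    return Expr{"(" + left.text + " >>= " + right.text + ")"};
}

void associativityExample()
{
    Expr input{"input"}, transform{"transform"}, pushBack{"push_back"};

    // Left-associative: the two leftmost operands are combined first.
    assert((input | transform | pushBack).text == "((input | transform) | push_back)");

    // Right-associative: the two rightmost operands are combined first.
    assert((input >>= transform >>= pushBack).text == "(input >>= (transform >>= push_back))");
}
```

The grouping is fixed by the language, so an overload cannot change it. It can only decide what to do with the operands it is handed.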

Pipes look ahead, not back

The core idea in the design of the pipes library is that pipes send data to the pipes that follow them down in the pipeline. So a given pipe has to know about the one after it in the pipeline, and doesn’t care too much about the one before it in the pipeline.

When we write:

pipes::push_back(output)

We build a pipeline that sends whatever it receives to the push_back method of output.

Then when we build B by writing this:

pipes::transform(f) >>= pipes::push_back(output)

This wraps the previous pipeline into a new one, which we called B. B starts by calling f on the values it receives before sending them to the pipes::push_back that it stores.

Finally, with this last step:

input >>= B;

We iterate over input and send each value to B.

On the other hand, if you consider the case of operator|, we start with this:

input | pipes::transform(f)

Then how can we send any data from input to the pipeline? The pipeline doesn’t even have an end!!

That’s why implementing operator>>= is easier than implementing operator|.

Pipes look ahead, not back. By the way, range views look back and not ahead, which is why implementing operator| is a natural thing to do for range views.
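Here is a minimal sketch of this "look ahead" design. Everything in the toy namespace is a hypothetical simplification written for this illustration, not the library's actual classes: each component only stores what comes after it, which is why a right-associative operator fits so naturally.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical, simplified model of the idea. These are not the library's classes.
namespace toy
{
    // The end of a pipeline: appends whatever it receives to a container.
    template<typename Container>
    struct PushBackSink
    {
        Container& container;

        template<typename T>
        void receive(T&& value) { container.push_back(std::forward<T>(value)); }
    };

    template<typename Container>
    PushBackSink<Container> push_back(Container& container)
    {
        return PushBackSink<Container>{container};
    }

    // A transform pipe on its own: it knows its function, but not yet what follows it.
    template<typename Function>
    struct TransformPipe
    {
        Function function;
    };

    template<typename Function>
    TransformPipe<Function> transform(Function function)
    {
        return TransformPipe<Function>{function};
    }

    // A transform pipe plugged onto the rest of the pipeline, which it stores as its tail.
    template<typename Function, typename Tail>
    struct TransformPipeline
    {
        Function function;
        Tail tail;

        template<typename T>
        void receive(T&& value) { tail.receive(function(std::forward<T>(value))); }
    };

    // pipe >>= pipeline: evaluated first because operator>>= is right-associative.
    template<typename Function, typename Tail>
    TransformPipeline<Function, Tail> operator>>=(TransformPipe<Function> pipe, Tail tail)
    {
        return TransformPipeline<Function, Tail>{pipe.function, tail};
    }

    // range >>= pipeline: the pipeline already has an end, so the data can flow right away.
    template<typename T, typename Pipeline>
    void operator>>=(std::vector<T> const& range, Pipeline pipeline)
    {
        for (auto const& value : range) pipeline.receive(value);
    }
}

std::vector<int> lookAheadExample()
{
    std::vector<int> const input = {1, 2, 3};
    std::vector<int> output;

    input >>= toy::transform([](int i) { return i * 10; }) >>= toy::push_back(output);

    assert((output == std::vector<int>{10, 20, 30}));
    return output;
}
```

Note that nothing in this sketch ever needs to hold on to the input range: by the time the range shows up, the pipeline is already complete.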

Implementing operator| for pipes

It turns out that there is a way to implement operator| for pipes. It consists in storing pipes as well as references to the input range inside intermediary objects, until the expression is complete.

The new type of intermediary object that we need is one to store a reference to a range and a pipe. Let’s call it RangePipe:

template<typename Range, typename Pipe>
struct RangePipe
{
    Range& range;
    Pipe pipe;
    
    template<typename Pipe_>
    RangePipe(Range& range, Pipe_&& pipe) : range(range), pipe(FWD(pipe)) {}
};

Note that the constructor takes the pipe through a template parameter of its own, so that template type deduction takes place and the magic of forwarding references can happen.

FWD is the usual macro that expands to std::forward<decltype(pipe)>(pipe), which avoids burdening the code with technical constructs.
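For reference, here is a common way to define such a macro. The exact definition in the library may differ, and Tracker and Holder are hypothetical helpers introduced only to observe whether a copy or a move took place.

```cpp
#include <cassert>
#include <utility>

// A common definition of such a macro (an assumption: the library's own may differ).
#define FWD(value) std::forward<decltype(value)>(value)

// Hypothetical helper that remembers how it was constructed.
struct Tracker
{
    bool wasCopied = false;
    bool wasMoved = false;

    Tracker() = default;
    Tracker(Tracker const&) : wasCopied(true) {}
    Tracker(Tracker&&) : wasMoved(true) {}
};

// Hypothetical holder that stores its argument the same way RangePipe stores its pipe.
struct Holder
{
    Tracker tracker;

    template<typename T>
    explicit Holder(T&& value) : tracker(FWD(value)) {}
};

void forwardingExample()
{
    Tracker original;

    // An lvalue is forwarded as an lvalue, so the stored member is copied.
    Holder fromLvalue(original);
    assert(fromLvalue.tracker.wasCopied && !fromLvalue.tracker.wasMoved);

    // An rvalue is forwarded as an rvalue, so the stored member is moved.
    Holder fromRvalue(std::move(original));
    assert(fromRvalue.tracker.wasMoved && !fromRvalue.tracker.wasCopied);
}
```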

Since the library is compatible with C++14, which does not have class template argument deduction, we can provide a convenience function to deduce the template arguments:

template<typename Range, typename Pipe>
auto make_range_pipe(Range&& range, Pipe&& pipe)
{
    return detail::RangePipe<std::remove_reference_t<Range>, std::decay_t<Pipe>>{FWD(range), FWD(pipe)};
}
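To see which types this helper ends up with, here is a self-contained sketch that repeats the two definitions above outside of the detail namespace. DummyPipe is a hypothetical stand-in for a real pipe, and the FWD definition is the common one, assumed here.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// Assumed definition of the macro, see the remark about FWD.
#define FWD(value) std::forward<decltype(value)>(value)

// Hypothetical stand-in for a pipe.
struct DummyPipe
{
    int id;
};

template<typename Range, typename Pipe>
struct RangePipe
{
    Range& range;
    Pipe pipe;

    template<typename Pipe_>
    RangePipe(Range& range, Pipe_&& pipe) : range(range), pipe(FWD(pipe)) {}
};

template<typename Range, typename Pipe>
auto make_range_pipe(Range&& range, Pipe&& pipe)
{
    return RangePipe<std::remove_reference_t<Range>, std::decay_t<Pipe>>{FWD(range), FWD(pipe)};
}

void makeRangePipeExample()
{
    std::vector<int> numbers = {1, 2, 3};
    DummyPipe const pipe{42};

    auto rangePipe = make_range_pipe(numbers, pipe);

    // Range is deduced as std::vector<int>& and Pipe as DummyPipe const&,
    // so the traits strip the reference and the const before naming the RangePipe type.
    static_assert(std::is_same<decltype(rangePipe), RangePipe<std::vector<int>, DummyPipe>>::value,
                  "references and cv-qualifiers are stripped from the template arguments");

    // The range is only referenced, whereas the pipe is stored by value.
    assert(&rangePipe.range == &numbers);
    assert(rangePipe.pipe.id == 42);
}
```

One consequence worth keeping in mind: since only a reference to the range is stored, the range has to outlive the RangePipe.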

Armed with the RangePipe class, we can now write operator| with various overloads to cover the possible use cases of building a pipeline:

// range | pipe

template<typename Range, typename Pipe, detail::IsARange<Range> = true, detail::IsAPipe<Pipe> = true>
auto operator|(Range&& range, Pipe&& pipe)
{
    return detail::make_range_pipe(FWD(range), FWD(pipe));
}

// RangePipe | pipe

template<typename Range, typename Pipe1, typename Pipe2, detail::IsAPipe<Pipe2> = true>
auto operator|(detail::RangePipe<Range, Pipe1> rangePipe, Pipe2&& pipe2)
{
    return detail::make_range_pipe(FWD(rangePipe.range), detail::make_composite_pipe(rangePipe.pipe, FWD(pipe2)));
}

// pipe | pipe

template<typename Pipe1, typename Pipe2, detail::IsAPipe<Pipe1> = true, detail::IsAPipe<Pipe2> = true>
auto operator|(Pipe1&& pipe1, Pipe2&& pipe2)
{
    return detail::make_composite_pipe(FWD(pipe1), FWD(pipe2));
}

// RangePipe | pipeline

template<typename Range, typename Pipe, typename Pipeline, detail::IsAPipeline<Pipeline> = true>
auto operator|(detail::RangePipe<Range, Pipe> rangePipe, Pipeline&& pipeline)
{
    return rangePipe.range >>= rangePipe.pipe >>= FWD(pipeline);
}

// pipe | pipeline

template<typename Pipe, typename Pipeline, detail::IsAPipe<Pipe> = true, detail::IsAPipeline<Pipeline> = true>
auto operator|(Pipe&& pipe, Pipeline&& pipeline)
{
    return FWD(pipe) >>= FWD(pipeline);
}

// Range | pipeline

template<typename Range, typename Pipeline, detail::IsARange<Range> = true, detail::IsAPipeline<Pipeline> = true>
auto operator|(Range&& range, Pipeline&& pipeline)
{
    return FWD(range) >>= FWD(pipeline);
}
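The overloads above all have the same shape, and it is the extra template parameters such as detail::IsAPipe<Pipe> = true that tell them apart. Here is a sketch of how such constraints could be written. The definitions of is_range, pipe_tag, IsARange and IsAPipe below are assumptions made for the illustration, and the library may well detect ranges and pipes differently.

```cpp
#include <cassert>
#include <iterator>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

namespace toy
{
    // Hypothetical tag: in this sketch, a type is a pipe if it derives from pipe_tag.
    struct pipe_tag {};

    // Crude range detection for this sketch: a range is anything std::begin accepts.
    template<typename T, typename = void>
    struct is_range : std::false_type {};

    template<typename T>
    struct is_range<T, decltype(void(std::begin(std::declval<T&>())))> : std::true_type {};

    // Each alias is the type bool when the condition holds, and a substitution failure otherwise.
    template<typename T>
    using IsARange = std::enable_if_t<is_range<T>::value, bool>;

    template<typename T>
    using IsAPipe = std::enable_if_t<std::is_base_of<pipe_tag, std::remove_reference_t<T>>::value, bool>;

    struct SomePipe : pipe_tag {};

    // Only takes part in overload resolution for (range, pipe).
    template<typename Range, typename Pipe, IsARange<Range> = true, IsAPipe<Pipe> = true>
    std::string describe(Range&&, Pipe&&) { return "range | pipe"; }

    // Only takes part in overload resolution for (pipe, pipe).
    template<typename Pipe1, typename Pipe2, IsAPipe<Pipe1> = true, IsAPipe<Pipe2> = true>
    std::string describe(Pipe1&&, Pipe2&&) { return "pipe | pipe"; }
}

void constraintsExample()
{
    std::vector<int> numbers;
    toy::SomePipe pipe;

    assert(toy::describe(numbers, pipe) == "range | pipe");
    assert(toy::describe(pipe, toy::SomePipe{}) == "pipe | pipe");
}
```

Putting the constraint in a non-type template parameter, rather than in a defaulted type parameter, is what lets several overloads with the same function parameters coexist.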

Note that composite pipes already existed in the library. They make it possible to assemble several pipes and hold them together until they are completed later with the rest of the pipeline.
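To show the whole mechanism in one place, here is a toy version of the deferral. All the names in the toy namespace are hypothetical simplifications, limited to std::vector inputs, transform pipes and a push_back sink. The point is only to show that nothing runs until the end of the pipeline is known.

```cpp
#include <cassert>
#include <utility>
#include <vector>

namespace toy
{
    // A pipe that is not connected yet to what follows it.
    template<typename Function>
    struct TransformPipe { Function function; };

    template<typename Function>
    TransformPipe<Function> transform(Function function) { return TransformPipe<Function>{function}; }

    // The end of a pipeline.
    template<typename Container>
    struct PushBackSink
    {
        Container& container;
        template<typename T>
        void receive(T&& value) { container.push_back(std::forward<T>(value)); }
    };

    template<typename Container>
    PushBackSink<Container> push_back(Container& container) { return PushBackSink<Container>{container}; }

    // Two pipes held together until an end shows up (stand-in for a composite pipe).
    template<typename First, typename Second>
    struct CompositePipe { First first; Second second; };

    // A reference to the range plus the pipes seen so far (stand-in for RangePipe).
    template<typename T, typename Pipe>
    struct RangePipe { std::vector<T> const& range; Pipe pipe; };

    // A pipe once it has been connected to the rest of the pipeline.
    template<typename Function, typename Tail>
    struct Connected
    {
        Function function;
        Tail tail;
        template<typename T>
        void receive(T&& value) { tail.receive(function(std::forward<T>(value))); }
    };

    template<typename Function, typename Tail>
    Connected<Function, Tail> connect(TransformPipe<Function> pipe, Tail tail)
    {
        return Connected<Function, Tail>{pipe.function, tail};
    }

    // Connecting a composite goes from right to left: pipes look ahead.
    template<typename First, typename Second, typename Tail>
    auto connect(CompositePipe<First, Second> pipes, Tail tail)
    {
        return connect(pipes.first, connect(pipes.second, tail));
    }

    // range | pipe: nothing can run yet, so remember both.
    template<typename T, typename Function>
    RangePipe<T, TransformPipe<Function>> operator|(std::vector<T> const& range, TransformPipe<Function> pipe)
    {
        return RangePipe<T, TransformPipe<Function>>{range, pipe};
    }

    // RangePipe | pipe: still no end in sight, so accumulate the new pipe.
    template<typename T, typename Pipe, typename Function>
    RangePipe<T, CompositePipe<Pipe, TransformPipe<Function>>>
    operator|(RangePipe<T, Pipe> rangePipe, TransformPipe<Function> pipe)
    {
        using Composite = CompositePipe<Pipe, TransformPipe<Function>>;
        return RangePipe<T, Composite>{rangePipe.range, Composite{rangePipe.pipe, pipe}};
    }

    // RangePipe | sink: the pipeline finally has an end, so the data can flow.
    template<typename T, typename Pipe, typename Container>
    void operator|(RangePipe<T, Pipe> rangePipe, PushBackSink<Container> sink)
    {
        auto pipeline = connect(rangePipe.pipe, sink);
        for (auto const& value : rangePipe.range) pipeline.receive(value);
    }
}

std::vector<int> pipeOperatorExample()
{
    std::vector<int> const input = {1, 2, 3};
    std::vector<int> output;

    input | toy::transform([](int i) { return i + 1; })
          | toy::transform([](int i) { return i * 10; })
          | toy::push_back(output);

    assert((output == std::vector<int>{20, 30, 40}));
    return output;
}
```

The price of the left-to-right syntax shows up in the intermediary RangePipe and CompositePipe objects, which copy the pipes they are given.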

If you see something that looks wrong with this code, do let me know. I can’t guarantee that this code is devoid of all bugs, but what I know is that it passes its unit tests.

Some pros and cons for operator|

Here are some arguments I see in favour of operator|.

Pros of operator|

One argument for operator| is that it would be consistent with range views that are planned to be included in C++20:

auto r = myVector | ranges::view::transform(f)
                  | ranges::view::filter(p)
                  | ranges::view::reverse;

And pipes are compatible with ranges in the sense that you can send the result of a range view into a pipe:

auto r = myVector | ranges::view::transform(f)
                  | ranges::view::filter(p)
                  | ranges::view::reverse
                  | pipes::transform(g)
                  | pipes::demux(pipes::push_back(output1),
                                 pipes::filter(q) | pipes::push_back(output2));

Whereas the code with operator>>= would look like this:

auto r = myVector | ranges::view::transform(f)
                  | ranges::view::filter(p)
                  | ranges::view::reverse
                  >>= pipes::transform(g)
                  >>= pipes::demux(pipes::push_back(output1),
                                   pipes::filter(q) >>= pipes::push_back(output2));

Also, operator| is called a pipe operator. That kind of sounds good for a library called pipes.

Cons of operator|

The left-associative operator| hides the design of the pipes library, which is that pipes look ahead. A right-associative operator such as operator>>= suggests that the pipelines are built from right to left.

Also, as we saw in the implementation, operator| stores pipes into intermediary objects, which can incur moves or copies of pipes. But as with function objects in the STL, we expect the functions passed around to be cheap to copy.

Finally, operator>>= kind of looks like sending data to a pipe:

[Image: pipe operator]

Over to you

Now you know everything there is to know about the question of replacing operator>>= with operator| for the pipes library.

Which one do you think is better, operator| or operator>>=? Do you see other pros or cons for those operators?

Please leave a comment below, I’d be grateful for your feedback.
