Jonathan Boccara's blog

How to Remove Pointers from a Vector in C++

Published September 18, 2018 - 6 Comments

Daily C++

Today we have a post co-written with Gaurav Sehgal, a software engineer who works with C and C++. Gaurav can be found on his Stack Overflow profile as well as on LinkedIn.

Interested in writing on Fluent C++ too? Check out our guest posting area!

As we saw in the article about removing elements from a sequence container, to remove elements in a vector based on a predicate, C++ uses the erase-remove idiom:

vector<int> vec{2, 3, 5, 2};

vec.erase(std::remove_if(vec.begin(), vec.end(), [](int i){ return i % 2 == 0;}), vec.end());

Which we can wrap in a more expressive function call:

vector<int> vec{2, 3, 5, 2};

erase_if(vec, [](int i){ return i % 2 == 0; });

The resulting vec in both those examples contains {3, 5} after the call to the algorithm. If you’d like a refresher on the erase-remove idiom, which we use in this post, check out the dedicated article about it.

This works fine with vector of values, such as vectors of integers for example. But for vector of pointers this is not as straightforward, since memory management comes into play.

Removing from a vector of unique_ptrs

C++11 introduced std::unique_ptr along with other smart pointers, that wrap a normal pointer and takes care of memory management, by calling delete on the pointer in their destructors.

This allows to manipulate pointers more easily, and allows in particular to call std::remove and std::remove_if on a vector of std::unique_ptrs for example without a problem:

auto vec = std::vector<std::unique_ptr<int>>{};
vec.push_back(std::make_unique<int>(2));
vec.push_back(std::make_unique<int>(3));
vec.push_back(std::make_unique<int>(5));
vec.push_back(std::make_unique<int>(2));

(for reasons outside of the scope of this post, vectors of unique_ptr can’t use a std::initializer_list)

vec.erase(std::remove_if(vec.begin(), vec.end(), [](auto const& pi){ return *pi % 2 == 0; }), vec.end());

Or by wrapping the erase-remove idiom:

erase_if(vec, [](auto const& pi){ return *pi % 2 == 0; });

This code effectively removes the first and last elements of the vector, that pointed to even integers.

Note that since std::unique_ptr can’t be copied but only moved, the fact that this code compiles shows that std::remove_if doesn’t copy the elements of the collection, but rather moves them around. And we know that moving a std::unique_ptr u1 into a std::unique_ptr u2 takes the ownership of the underlying raw pointer from u1 to u2, leaving u1 with a null pointer.

As a result, the elements placed by the algorithm at the beginning of the collection (in our case the unique_ptr to 3 and the unique_ptr to 5) are guaranteed to be the sole owners of their underlying pointers.

All this handling of memory happens thanks to unique_ptrs. But what would happen with a vector of owning raw pointers?

Removing from a vector of owning raw pointers

First, let’s note that a vector of owning raw pointers is not recommended in modern C++ (even using owning raw pointers without a vector are not recommended in modern C++). std::unique_ptr and other smart pointers offer safer and more expressive alternative since C++11.

But even though modern C++ is pioneering ever more, not all codebases in the world are catching up at the same pace. This makes it possible for you to encounter vectors of owning raw pointers. It could be in a codebase in C++03, or in a codebase that use modern compilers but still contains older patterns in its legacy code.

Another case where you would be concerned is if you write library code. If your code accepts a std::vector<T> with no assumption on type T, you could be called from legacy code with a vector of owning raw pointers.

The rest of this post assumes that you have to deal with vector of owning raw pointers from time to time, and that you have to remove elements from them. Then using std::remove and std::remove_if is a very bad idea.

The problem of std::remove on raw pointers

To illustrate the problem, let’s create a vector of owning raw pointers:

auto vec = std::vector<int*>{ new int(2), new int(3), new int(5), new int(2) };

If we call the usual erase-remove pattern on it:

vec.erase(std::remove_if(vec.begin(), vec.end(), [](int* pi){ return *pi % 2 == 0; }), vec.end());

Then we end up with a memory leak: the vector no longer contains the pointers to 2, but no one has called delete on them.

So we may be tempted to separate std::remove_if from the call to erase in order to delete the pointers at the end of the vector between the calls:

auto firstToErase = std::remove_if(vec.begin(), vec.end(), [](int* pi){ return *pi % 2 == 0; });
for (auto pointer = firstToErase; pointer != vec.end(); ++pointer)
   delete *pointer;
vec.erase(firstToErase, vec.end());

But this doesn’t work either, because this creates dangling pointers. To understand why, we have to consider one of the requirements (or rather, absence of) of std::remove and std::remove_if: the elements they leave at the end of the vector are unspecified. It could be the elements that were there before calling the algorithm, or the elements that satisfied the predicate, or anything else.

In a particular STL implementation, the elements left at the end of the container after the call to std::remove_if turned out to be the ones that were there before calling the algorithm. As the vector had pointers to 2 3 5 2 before calling std::remove, it had pointers to 3 5 5 2 after.

For example, printing the values inside of the vector before calling std::remove could output this:

0x55c8d7980c20
0x55c8d7980c40
0x55c8d7980c60
0x55c8d7980c80

And after the call to std::remove it outputs that:

0x55c8d7980c40
0x55c8d7980c60
0x55c8d7980c60
0x55c8d7980c80

Then the innocent call to erase will delete the pointer in the 3rd position, making the one in the second position (equal to it) a dangerous dangling pointer!

What to do instead

You can use std::stable_partition instead of std::remove_if, with an inverted predicate. Indeed, std::stable_partition performs a partitioning of the collection based on a predicate. This means to put the elements that satisfy the predicate at the beginning, and the elements that don’t satisfy the predicate at the end. No more equal pointers.

The paritioning here consists in putting the elements not to remove at the beginning, hence the need to invert the predicate:

std::stable_partition(vec.begin(), vec.end(), [](int* pi){ return *pi % 2 != 0; });

std::stable_partition returns the partition point of the collection, which is the iterator to the first element that doesn’t satisfy the predicate after partitioning. We therefore have to delete the pointers from this point and until the end of the vector. After that, we can erase the elements from the vector:

auto firstToRemove = std::stable_partition(vec.begin(), vec.end(), [](int* pi){ return *pi % 2 != 0; });
std::for_each(firstToRemove, vec.end(), [](int* pi){ delete pi; });
vec.erase(firstToRemove, vec.end());

Another solution is to delete the pointers to remove and set them to nullptr and only then perform a std::remove on nullptr:

for(auto& pointer : vec)
{
   if (*pointer % 2 == 0)
   {
       delete pointer;
       pointer = nullptr;
   }
}
vec.erase(std::remove(vec.begin(), vec.end(), nullptr), vec.end());

Since the deletes are performed before the call to std::remove, there is no longer the problem with dangling pointers. But this solution only works if the vector cannot contain null pointers. Otherwise, they would be removed along with the ones set by the for loop.

Be careful around owning raw pointers

In conclusion, prefer unique_ptrs or other smart pointers over owning raw pointers. It will make your code simpler and more expressive.

And if you do have to work with vector of owning raw pointers, choose the right STL algorithm to correctly handle memory management!

You will also like

Don't want to miss out ? Follow:   twitterlinkedinrss
Share this post!Facebooktwitterlinkedin

Comments are closed