Jonathan Boccara's blog

Heaps and Priority Queues in C++ – Part 1: Heaps Basics

Published March 13, 2018 - 10 Comments

One of our 7 good resolutions for the new year was to learn our data structures. Indeed, using the right data structure simplifies the code, and knowing them lets you understand the code that uses them.

Let’s see two related data structures, heaps and priority queues. This is a deep topic that we are going to explore in a mixed series of articles and videos:

  • Part 1: Heaps Basics
  • Part 2: Building, Unbuilding and Sorting Heaps (video)
  • Part 3: Queues, Priority Queues and Heaps
  • Part 4: What Heaps Brings That Priority Queues Don’t (video)

Starting now with Heaps Basics.

What is a heap?

A heap is a data structure that has the form of a tree and that respects the heap property, namely: every node must be lower than each of its children.

I suppose that the name “heap” comes from the fact that if you pile up a heap of stuff, you’d rather put the big things at the bottom and the small at the top if you want it to hold:

heaps C++ STL

Note that it is completely unrelated to the heap as in the memory region that contains dynamically allocated objects (as opposed to the stack, which happens to also be the name of a data structure too by the way).

One of the most important properties of the heap is that its lowest element at its root, to be easily accessible.

In a heap, each node can theoretically have any number of children. But in the STL, heaps’ nodes have two children, so by heap we will designate binary heaps in this article.

Max heaps

The heap property, that every node must be lower than its children, can be generalized to another comparison than “lower than” as in operator<. We could use a certain relation that makes more sense for the data type that is in the heap. For instance, a heap of sets could use a lexicographical relationship.

In particular, we can also use the relation “greater than” in the heap property (which can still be implemented by using operator< by turning around the heap property and ensuring that children are lower than their parents).

Such a heap is called a max heap, and this is the sort of heap that the STL has. So by heap I will mean binary max heap throughout this article.

In a max heap, the largest element is at the root. So here is an example of a heap:

max heap C++ STL

You can see that each node is lower than its parent, and the greatest node (9) is at the root.

Using “greater than” gets us away from the metaphor of the heaps of stones/trash/boxes that we can see in the world surrounding us, but hey, do we developers really live in the world surrounding us?

Implementing a heap

To represent a binary tree such as a heap, one implementation is to make a dynamic allocation for each node, with 2 pointers pointing to its children.

But there is a much more efficient (and elegant) implementation: representing it in the form of an array, by doing a level order traversal of the heap. Said differently, it means that the array starts with the element at the root, then follows with the children of that root, then all the children of those children. And then the great-grand-children. And so on.

This way, the greatest element is at the first position of the array.

This animation illustrates how to above heap could be represented as an array:

heap array C++ STL

This is how the STL represents heaps: a heap can be stored in an std::vector for example, with the elements laid out next to each other like above.

This representation is more efficient than having nodes pointing to each other for several reasons:

  • there is only one dynamic allocation for all the heap, and not one per node,
  • there is no pointers to children, so no space needed for them,
  • the contiguous layout of the structure makes it more cache-friendly.

This is all well, but we can no longer walk up and down the nodes of the tree, since we have no pointer to children (or parent). Or can we?

Walking around the heap

It turns out that we can. Indeed, a nice property of binary trees represented as arrays is that, to get to the left child of a node at a certain index i, we can just hop to the index (i + 1) * 2 - 1 to get to the left child, and to the index (i + 1) * 2 for the right child.

heaps C++ STL

If those formulae look more like incantations to you, have a look at our heap represented as an array, with indices starting at 1 underneath it:

max heap C++ STLAnd compare this with its initial tree-like representation. Notice how the two children of an node at position i are at the position i * 2 and i * 2 + 1?

This is true when indices start at 1.

But since in an std::vector, indices start at 0, the left child of a node at position index is located at a position given by:

size_t leftChild(size_t index)
{
    return (index + 1) * 2 - 1;
}

And the position of the right child of a node at position index is given by:

size_t rightChild(size_t index)
{
    return (index + 1) * 2;
}

Let’s keep those, they will come in handy later in our series on heaps and priority queues.

Making and checking for heaps with the STL

Now that we’re clear on the representation of a heap as an array, let’s see some of the algorithms that the STL offers to manipulate heaps inside arrays.

Making heaps with std::make_heap

If you have a range of objects that can be compared with each other, you can rearrange this range into a max heap with std::make_heap.

Consider the following code to illustrate:

std::vector<int> numbers = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};

std::make_heap(begin(numbers), end(numbers));

for (int number : numbers)
{
    std::cout << number << ' ';
}

This code outputs the new arrangement of numbers:

9 8 6 7 4 5 2 0 3 1

Looks familiar? This is our heap implemented as an array!

heap as array C++ STL

Checking for the heap property

Given a collection, it is possible to check whether it is structured as a max heap implemented as an array:

std::is_heap(begin(numbers), end(numbers))

returns true if numbers is a max heap and false otherwise. In the previous case for example it would return false before the call to std::make_heap and true after that.

It is possible that only the beginning of a collection is structured as a heap. In this case std::is_heap_until returns the iterator pointing to the first position of the collection that does not respect the heap property.

auto heapUntil = std::is_heap_until(begin(numbers), end(numbers))

For example, if the collection is a heap, std::is_heap_until returns the end of the collection. And if the first element is smaller than the second one, it returns its first position since the heap property was broken from the start.

Stay tuned for the follow-up of this series. Next up: Building, Unbuilding and Sorting Heaps with the STL!

Related posts:

Don't want to miss out ? Follow:   twitterlinkedinrss
Share this post!Facebooktwitterlinkedin

Comments are closed