Include What You Use

Published January 1, 2021

I’ve used the clang-based include-what-you-use tool on a fairly large chunk of code: a couple of hundred files, each containing dozens of includes.

That was an interesting experiment.

Here are my takeaways on this powerful tool, what it can bring to your code, and a few things I wish I had known when I started using it.

include-what-you-…what?

include-what-you-use is a clang-based tool that reworks the #include sections of a C++ file, be it a header or a .cpp file.

The tool has two goals. It makes sure that each file:

#includes all the headers that it uses, meaning all headers that define or declare a symbol that is used by the including file,
and doesn’t #include any unnecessary header, meaning any header that defines or declares only symbols that are not used by the including file.

The first goal corresponds to the name of the tool, “include what you use”, and the second one could be called “use what you include”.

Said differently, include-what-you-use makes sure that your headers and .cpp files include everything they need and nothing more.
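
To make the two goals concrete, here is a minimal sketch of what a cleaned file could look like. The file name and the function are hypothetical, and the comments next to each #include are mine for illustration, not output copied from the tool.

```cpp
// average.cpp (hypothetical file), as it might look after cleaning.
#include <cassert>  // assert
#include <numeric>  // std::accumulate
#include <vector>   // std::vector
// A leftover include such as <map> would be removed: no symbol from it is used here.

// Returns the arithmetic mean of a non-empty collection of values.
double average(const std::vector<double>& values)
{
    assert(!values.empty());
    const double sum = std::accumulate(values.begin(), values.end(), 0.0);
    return sum / static_cast<double>(values.size());
}
```

Every symbol used in the file comes from a header that the file includes directly, and every included header provides at least one symbol that the file uses.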

The benefits of having clean header inclusions

There are multiple benefits in having such clean header inclusions.

Design benefit

One of them is that it gives you a better view of the dependencies between files. After running the cleanup with the tool, nearly every remaining (or added) #include represents a dependency (I say nearly because some #includes don’t count as dependencies: for example, a class implementation file that #includes its own header file).

Indeed, if a file needs a #include, it means that it uses the code in that #included file. That’s a dependency.

Before cleaning the header inclusions, some #includes may not be necessary. They can be remains of old developments, whose code has been deleted or refactored away to other files or modules. Indeed, when changing code, it’s easy to forget to update the #includes.

Those remaining useless #includes create a shallow dependency: a dependency of compilation, because the compiler (or rather, the preprocessor) executes the inclusion, but not a design dependency, because no code really depends on that inclusion.

On the other hand, there can be symbols that the code of a file uses and that are not in the #includes of that file. This happens if those symbols are defined in files that are indirectly included. In this case, the #include section doesn’t give the full picture of the dependencies of the file.

After the header cleanup, you can see the exact dependencies of a given file.

Seeing dependencies is valuable because it is a good start for refactoring: if you see a dependency that doesn’t make sense, then you can work towards removing it. This helps improve the design and architecture of the code, which makes it easier to understand.

Other benefits

Another interesting benefit of cleaning header inclusions is that it can reduce their number, and therefore reduce compilation time. Indeed, if you change a header that is #included by many files, rebuilding your project takes time because it involves recompiling a large number of files.

Removing useless inclusions can therefore reduce compilation time, without changing the outcome of the compilation. The compiler just stops doing unnecessary work.

Another benefit of cleaning up is that clean headers are self-inclusive. This means that if you were to compile them on their own, they would compile without errors, and in particular without missing files.

In fact, self-inclusive headers are more a necessity than a benefit. Without self-inclusive headers, you can’t #include headers in any order, because they depend on the #includes of each other.

Without self-inclusive headers, you can get weird problems, such as changing a header and having compilation errors pop up in an unrelated file, which you then need to fix because it wasn’t self-inclusive and relied on the header you’re changing.
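
As a small sketch of a self-inclusive header (the file name, the Employee type and the describe function are hypothetical), the header below #includes <string> itself instead of counting on every including file to have #included it beforehand:

```cpp
// employee.hpp (hypothetical header), written so that it compiles on its own.
#ifndef EMPLOYEE_HPP
#define EMPLOYEE_HPP

#include <string>  // std::string and std::to_string, both used below

struct Employee
{
    std::string name;
    int id;
};

// Builds a short label such as "Ada #7".
inline std::string describe(const Employee& employee)
{
    return employee.name + " #" + std::to_string(employee.id);
}

#endif  // EMPLOYEE_HPP
```

One simple way to check this property is to compile a source file whose only content is the #include of that header. If the #include of <string> were missing, that file would fail to compile, whereas a file that happened to #include <string> before employee.hpp would hide the problem.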

The errors generated by the cleaning

Although a powerful tool, include-what-you-use isn’t a perfect one, as some files no longer compile after cleaning.

I haven’t seen a recurring pattern, but here are some of the errors I saw:

two namespaces having the same symbol got mixed up once,
a declaration was #included instead of a definition,
a given file was not #included where it was needed.
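
As a generic illustration of the second kind of error (hypothetical names, not taken from the code the tool was run on), a declaration is enough to mention a type by reference, but not to store it by value or to call its member functions:

```cpp
// All in one file for the sake of the sketch; in a real code base these
// pieces would live in separate headers.

// What a declaration-only header provides: just the name of the class.
class Engine;

// A declaration is enough to mention the type by reference or pointer...
int horsepowerOf(const Engine& engine);

// ...but the code below needs the full definition, that is, the header
// that defines the class.
class Engine
{
public:
    int horsepower() const { return 150; }
};

class Car
{
public:
    int power() const { return engine_.horsepower(); }

private:
    Engine engine_;  // stored by value: would not compile with only "class Engine;"
};

int horsepowerOf(const Engine& engine)
{
    return engine.horsepower();
}
```

If only the declaration is visible where Car is defined, the compiler reports an incomplete type, and the fix is to #include the header containing the definition.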

It may just be me incorrectly configuring the tool, or it may be bugs in the tool. It doesn’t matter that much, as those were very sparse errors in comparison with the volume of code that the tool treated correctly.

But what is useful to know is that sparse errors can generate a very, very large volume of error messages. Indeed, if those errors happen to be located in central header files, then the errors get generated in many compilation units.

As a result, the number of error messages can be daunting at first sight.

Treating errors

The best way I’ve found to treat those errors is to be very methodical.

First, group the errors by file. Maybe your IDE will do it for you, or if you get a raw output from your compiler you can put them into a Pivot Table in Excel in order to extract the file names and count duplicates.

Removing duplicate errors ensures that you won’t see the same error more than once. In my case, one single wrong include was responsible for more than half of the error messages! Fixing it took a few seconds, and it cut the number of errors left to treat by more than half. This is energizing.

Taking care of the errors file by file also speeds up the fixes, because you won’t have to jump from one file to another all the time.
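
If you’d rather not go through a spreadsheet, the grouping step can also be sketched in a few lines of C++. This is only an illustration, not what was used for this experiment: the function name countErrorsByFile is made up, and it assumes diagnostics in the GCC/Clang shape "file:line:column: error: message", with the compiler output already split into lines.

```cpp
#include <cctype>   // std::isdigit
#include <cstddef>  // std::size_t
#include <map>      // std::map
#include <string>   // std::string
#include <vector>   // std::vector

// Counts error lines per file, assuming GCC/Clang-style diagnostics such as
//   path/to/file.cpp:12:5: error: message
// Assumption: paths contain no ':' (so no Windows drive letters), and other
// compilers' formats (MSVC, for instance) are not handled.
std::map<std::string, std::size_t> countErrorsByFile(const std::vector<std::string>& compilerOutput)
{
    std::map<std::string, std::size_t> errorCounts;
    for (const std::string& line : compilerOutput)
    {
        const std::size_t errorPos = line.find("error:");  // also matches "fatal error:"
        const std::size_t fileEnd = line.find(':');         // end of the file name

        // Keep only lines shaped like "<file>:<line number>... error:".
        const bool isFileError = errorPos != std::string::npos
                              && fileEnd > 0
                              && fileEnd + 1 < errorPos
                              && std::isdigit(static_cast<unsigned char>(line[fileEnd + 1]));
        if (isFileError)
        {
            ++errorCounts[line.substr(0, fileEnd)];
        }
    }
    return errorCounts;
}
```

Notes, warnings and messages without a file prefix are skipped, so the resulting map tells you which files concentrate the errors and are worth fixing first.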

All in all, it took me little time to go over the remaining changes to make after the tool ran, and the whole experiment had a dramatic effect on the header inclusions of the files.

Make your code include what it uses

In conclusion, I recommend that you try include-what-you-use on your code, in order to clarify its dependencies, improve its compilation time and ensure that headers are self-inclusive.

When you do, please leave a comment here to let me know how that went, and if you have additional advice about how to use the tool efficiently.

And if you already tried it, please let us know about your experience now!