Include What You Use
I’ve used the clang-based include-what-you-use tool on a fairly large chunk of code: a couple of hundred files, containing dozens of includes each.
That was an interesting experiment.
Here are my takeaways on this powerful tool, what it can bring to your code, and a few things I wish I had known when I started using it.
include-what-you-…what?
include-what-you-use is a clang-based library that reworks the #include sections of a C++ file, be it a header or a .cpp file.
The tool has two goals: make sure that each file
- #includes all the headers that it uses, meaning all headers that define or declare a symbol that is used by the including file,
- and doesn’t #include any unnecessary header, meaning any header that defines or declares symbols that are not used by the including file.
The first goal corresponds to the name of the tool, “include what you use”, and the second one could be called “use what you include”.
Said differently, include-what-you-use makes sure that your headers and .cpp files include everything they need and nothing more.
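To make the two goals concrete, here is a minimal sketch of what a file in the "clean" state could look like. The file name `word_stats.cpp` and both functions are invented for this illustration, and the comments describe what the two goals imply rather than the exact output of the tool.

```cpp
// word_stats.cpp (hypothetical file, invented for this illustration)
#include <cassert>  // assert
#include <cstddef>  // std::size_t
#include <string>   // std::string
#include <vector>   // std::vector
// An `#include <map>` left here would be the "unnecessary header" case:
// nothing below uses std::map.

// Sums the lengths of all the words.
std::size_t total_length(const std::vector<std::string>& words) {
    std::size_t total = 0;
    for (const std::string& word : words) {
        total += word.size();
    }
    return total;
}

// Integer average of the word lengths; the list must not be empty.
std::size_t average_length(const std::vector<std::string>& words) {
    assert(!words.empty());
    return total_length(words) / words.size();
}
```

Every symbol used in the file (`assert`, `std::size_t`, `std::string`, `std::vector`) has a matching #include, instead of relying on one standard header happening to pull in another.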
The benefits of having clean header inclusions
There are multiple benefits in having such clean header inclusions.
Design benefit
One of them is that it gives you a better view of the dependencies between files. After cleaning with the tool, nearly every remaining (or added) #include represents a dependency (I say nearly because some #includes don’t count as dependencies: for example, a class implementation file that #includes its own header file).
Indeed, if a file needs a #include, it means that it uses the code in that #included file. That’s a dependency.
Before cleaning the header inclusions, some #includes may not be necessary. They can be remains of old developments, whose code has been deleted or refactored away to other files or modules. Indeed, when changing code, it’s easy to forget to update the #includes.
Those remaining useless #includes create a shallow dependency: a compilation dependency, because the compiler (or rather, the preprocessor) executes the inclusion, but not a design dependency, because no code really depends on that inclusion.
On the other hand, there can be symbols that the code of a file uses and that are not in the #includes of that file. This happens if those symbols are defined in files that are indirectly included. In this case, the #include section doesn’t give the full picture of the dependencies of the file.
After the header cleanup, you can see the exact dependencies of a given file.
Seeing dependencies is valuable because it is a good start for refactoring: if you see a dependency that doesn’t make sense, then you can work towards removing it. This helps improve the design and architecture of the code, which makes it easier to understand.
Other benefits
Another interesting benefit of cleaning header inclusions is that it can reduce them, and therefore reduce compilation time. Indeed, if you change a header that is #included by many files, rebuilding your project takes time as it involves recompiling a large number of files.
Removing useless inclusions can therefore reduce compilation time, without changing the outcome of the compilation. The compiler just stops doing unnecessary work.
Another benefit of cleaning up is that clean headers are self-inclusive. This means that if you were to compile them on their own, they would compile without errors, and in particular without missing files.
In fact, self-inclusive headers are more a necessity than a benefit. Without self-inclusive headers, you can’t #include headers in any order, because they depend on each other’s #includes.
Without self-inclusive headers, you can get weird problems, such as changing a header and having compilation errors pop up in an unrelated file that you then need to fix because it wasn’t self-inclusive and relied on the header you’re changing.
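As a rough sketch of what self-inclusive means in practice, here is a hypothetical header (the name `shape_catalog.hpp` and the `ShapeCatalog` type are invented for this illustration). It includes everything its own code relies on, so it should compile whether it comes first or last in a list of #includes.

```cpp
// shape_catalog.hpp (hypothetical header, invented for this illustration)
#ifndef SHAPE_CATALOG_HPP
#define SHAPE_CATALOG_HPP

#include <cassert>  // assert
#include <cstddef>  // std::size_t
#include <string>   // std::string
#include <vector>   // std::vector

// If the #includes above were missing, this header would only compile in
// files that happened to include those standard headers before it.
struct ShapeCatalog {
    std::vector<std::string> names;

    // Returns the name stored at position `index`, which must be in range.
    const std::string& name_at(std::size_t index) const {
        assert(index < names.size());
        return names[index];
    }
};

#endif  // SHAPE_CATALOG_HPP
```

One simple way to check the property by hand is to compile a .cpp file whose only line is the #include of that header.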
The errors generated by the cleaning
Although a powerful tool, include-what-you-use isn’t a perfect one, as some files no longer compile after cleaning.
I haven’t seen a recurring pattern, but here are some of the errors I saw:
- two namespaces having the same symbol got mixed up once,
- a declaration was #included instead of a definition,
- a given file was not #included where it was needed.
It may just be me incorrectly configuring the tool, or it may be bugs in the tool. It doesn’t matter that much, as those were very sparse errors in comparison with the volume of code that the tool treated correctly.
But what is useful to know is that sparse errors can generate a very, very large volume of error messages. Indeed, if those errors happen to be located in central header files, then the errors get generated in many compilation units.
As a result, the number of error messages can be daunting at first sight.
Treating errors
The best way I’ve found to treat those errors is to be very methodical.
First, group the errors by file. Maybe your IDE will do it for you, or if you get a raw output from your compiler you can put them into a Pivot Table in Excel in order to extract the file names and count duplicates.
Removing duplicate errors ensures that you won’t see the same error more than once. In my case, one single wrong include was responsible for more than half of the error messages! Fixing it took a few seconds, and it divided the number of errors left to treat by two. This is energizing.
Taking care of the errors file by file also speeds up the fixes, because you won’t have to jump from one file to another all the time.
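If you’d rather not go through a spreadsheet, the grouping by file can also be scripted. Here is a minimal sketch, assuming GCC/Clang-style lines shaped like `path/file.cpp:12:5: error: message` and paths that contain no colon. The function names and the sample compiler output are made up for the illustration, and other compilers format their messages differently.

```cpp
#include <cassert>  // assert
#include <cstddef>  // std::size_t
#include <map>      // std::map
#include <string>   // std::string
#include <vector>   // std::vector

// Extracts the file name from a line shaped like "file.cpp:12:5: error: msg".
// Returns an empty string for lines without ": error:" (notes, warnings...).
std::string file_of_error(const std::string& line) {
    if (line.find(": error:") == std::string::npos) {
        return "";
    }
    // Assumption: the path has no colon, so the file name ends at the first one.
    return line.substr(0, line.find(':'));
}

// Counts how many error lines each file is responsible for.
std::map<std::string, std::size_t> count_errors_by_file(
    const std::vector<std::string>& lines) {
    std::map<std::string, std::size_t> counts;
    for (const std::string& line : lines) {
        const std::string file = file_of_error(line);
        if (!file.empty()) {
            ++counts[file];
        }
    }
    return counts;
}

// Small usage check on made-up compiler output.
void check_example() {
    const std::vector<std::string> output = {
        "core/types.hpp:10:3: error: unknown type name 'Id'",
        "core/types.hpp:42:9: error: use of undeclared identifier 'Id'",
        "ui/window.cpp:7:1: error: no member named 'resize' in 'Widget'",
        "ui/window.cpp:7:1: note: candidate found here",
    };
    const std::map<std::string, std::size_t> counts = count_errors_by_file(output);
    assert(counts.size() == 2);
    assert(counts.at("core/types.hpp") == 2);
    assert(counts.at("ui/window.cpp") == 1);
}
```

Sorting the resulting counts in decreasing order points to the files worth fixing first.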
All in all, it took me little time to go over the remaining changes to make after the tool ran, and this whole experiment had a dramatic effect on the header inclusions of the files.
Make your code include what it uses
In conclusion, I recommend that you try include-what-you-use on your code, in order to clarify its dependencies, improve its compilation time and ensure that headers are self-inclusive.
When you do, please leave a comment here to let me know how that went, and if you have additional advice about how to use the tool efficiently.
And if you already tried it, please let us know about your experience now!