How Unit Tests Help Express Your Code’s Intent

Published November 3, 2017

Guest writer Tim Scott talks to us about how to make unit tests express the intentions of a piece of code. Tim is a software developer and tester passionate about software quality and automation. You can find him online on DeveloperAutomation.com, his blog about increasing quality and developer efficiency through automation, or on his Twitter or LinkedIn profile.

Unit testing is the practice of writing additional test code in order to exercise your source code. These tests verify the functionality of your program through white-box testing. Much has been written about how unit testing improves code quality. Here I’d like to dive into an additional benefit: quickly expressing the intent of your code.

At one of my previous jobs, we were starting to write unit tests against our codebase for the first time. After a couple of months of doing this, one of my coworkers made the following comment:

“These unit tests are a good form of documentation. They show exactly how the code is intended to be used.”

Sure enough, I quickly saw unit testing as an additional form of documentation. It does more than just test code. These tests also…

Provide clear examples of how the code is intended to be used
Show the exact inputs and outputs expected for functions
Remain up-to-date if tied into a continuous integration system that runs those tests on every commit

At times, looking at unit test code has instantly given me the proper way to use a common function or class. Rather than spend 5 minutes or so looking at documentation, I can find my exact use case within about 30 seconds of looking at the unit tests. I can then copy-paste that example and modify it for my specific needs.

Recently Bartek and Jonathan posted an expressive C++17 coding challenge. For the sake of writing unit tests, let’s solve this problem again (not particularly with C++17). As we write different sections of this code, we are going to explore how the unit tests clearly express the intent of the code.

The Program We Will Write And Test

The task proposed in the C++17 expressive code challenge was to write a command line tool that takes in a CSV file, overwrites all the data of a given column by a given value, and outputs the results into a new CSV file.

In addition to the original task, I added a few requirements for the purpose of showing more test cases. These differences from the original task are labeled “Additional requirement” in the description that follows.

This command line tool should accept the following arguments:

the filename of a CSV file,
the name of the column to overwrite in that file,
the string that will be used as a replacement for that column,
the filename where the output will be written.

For instance, if the CSV file had a column “City” with various values for the entries in the file, calling the tool with the name of the input file, City, London and the name of output file would result in a copy of the initial file, but with all cities set equal to “London”.

Here is how to deal with edge cases:

if the input file is empty, the program should write “input file missing” to the console.
if the input file does not contain the specified column, the program should write “column name doesn’t exists in the input file” to the console.
Additional requirement #1: If the number of command-line arguments is not five (the program name, the input file, the column header, the replacement value, and the output file), the program will throw an exception.
Additional requirement #2: If the number of columns in any row is not the same number of columns as the header, the program will throw an exception.

In any of these cases, there shouldn’t be any output file generated.

And if the program succeeds but there is already a file having the name specified for output, the program should overwrite this file.

One solution

My code for this project can be found on GitHub.

Here is how to build and run the executables:

make: compile the source code
./colReplacer inputFile.csv columnHeader columnReplacementValues outputFile.csv
make clean: erase the objects and executables
make test: compile the source code (without main.cpp) and test code (with testMain.cpp)
./testReplacer

We will be using the Catch unit testing library. Catch is a C++ unit testing library that allows you to test your code by just including one header file. More documentation on that library can be found here.

Before we see how unit tests express the intent of the code, I want to explain the source code. In order to better understand the tests, we need to have a basic understanding of how this specific solution works. Following this brief explanation, we will look at the unit tests.

Having said that, let’s begin discussing my solution to the problem. It is very object-oriented. It may be overkill for this problem, but I want to present the solution as a class that could be reused by other pieces of code. The unit tests for these classes help express their intent and show their requirements.

The project is divided into a few parts:

The src folder (C++ source files)
The include folder (C++ header files)
The test folder (src and include folders for unit testing files)

Most of the work happens in the following files:

CsvArgs.cpp (parses command-line arguments and helps with input/output files)
CsvProcessor.cpp (replaces the column values)

Let’s dive into the code!

Everything starts with a few lines in the main function in main.cpp. Here are most of the lines from it:

CsvArgs args(argc, argv);
CsvProcessor processor(args.getInputData());
std::string output = processor.replaceColVals(args.getColToOverwrite(), args.getColReplaceVal());
args.setOutputData(output);

The arguments from the main function are parsed by the CsvArgs object. The bulk of the work takes place in the replaceColVals function. Notice how we get input data (which is an istream object – not a file – more on that later) from args and write the output as part of args. The file processing is not done in the CsvProcessor class. This will be important later when we discuss the test cases.
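To see why working with an istream rather than a file matters, here is a minimal sketch. The function firstLine is a hypothetical helper of mine, not part of the project; the sketch only assumes that code written against std::istream & accepts an in-memory std::istringstream just as readily as a file stream.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper, not part of the project: reads the first row
// from any input stream, whether it wraps a file or a string.
std::string firstLine(std::istream &input)
{
    std::string line;
    std::getline(input, line);
    return line;
}

// Usage: a test can feed in-memory data instead of creating a file on disk.
void firstLineExample()
{
    std::istringstream inMemoryCsv("City,Country\nParis,France\n");
    assert(firstLine(inMemoryCsv) == "City,Country");
}
```

Because the function never opens a file itself, the caller decides where the data comes from, which is what lets a test supply its input as a plain string.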

The arguments passed through the command-line are

Input filename
Column header to replace
Replacement value in the column
Output filename

In the description that follows, notice how each of those arguments is used in the four related functions of CsvArgs.

CsvArgs.hpp
- CsvArgs(int argc, char *argv[]); – parses the command-line arguments and puts them in member variables.
- std::istream &getInputData(); – opens the input file if not already open and returns a reference to an input stream.
- void setOutputData(const std::string &data); – opens the output file if not already open and writes the given string to it.
- std::string getColToOverwrite(); – gets the column header to overwrite.
- std::string getColReplaceVal(); – gets the replacement value to place in the columns
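As a rough illustration of the argument handling described above, here is a minimal sketch. ArgsSketch is a hypothetical stand-in, not the CsvArgs class from the repository: it leaves out the file handling of getInputData and setOutputData, and the exception type (std::invalid_argument) is my assumption.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for CsvArgs: only the argument parsing is sketched.
class ArgsSketch
{
public:
    ArgsSketch(int argc, char *argv[])
    {
        // Program name plus four user arguments (additional requirement #1).
        // The exception type is an assumption made for this sketch.
        if (argc != 5)
            throw std::invalid_argument("wrong number of arguments");

        // argv[1] and argv[4] are the input and output filenames;
        // this sketch does not open them.
        colToOverwrite_ = argv[2];
        colReplaceVal_ = argv[3];
    }

    std::string getColToOverwrite() const { return colToOverwrite_; }
    std::string getColReplaceVal() const { return colReplaceVal_; }

private:
    std::string colToOverwrite_;
    std::string colReplaceVal_;
};

// Usage: build an argv by hand, the same way a test would.
void argsSketchExample()
{
    char programName[] = "colReplacer";
    char inputFile[] = "inputFile.csv";
    char column[] = "City";
    char value[] = "London";
    char outputFile[] = "outputFile.csv";
    char *argv[] = {programName, inputFile, column, value, outputFile};

    ArgsSketch args(5, argv);
    assert(args.getColToOverwrite() == "City");
    assert(args.getColReplaceVal() == "London");
}
```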

CsvProcessor only has one public function (other than its constructor) – the function that replaces the columns.

CsvProcessor.hpp
- CsvProcessor(std::istream &inputData); – the constructor takes the CSV data to replace as an istream.
- std::string replaceColVals(const std::string &colToOverwrite,
  const std::string &replaceVal); – this function replaces the columns in the CSV data and outputs the replacement as a string.
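To make that interface concrete, here is a minimal sketch of how such a class could work. CsvProcessorSketch and splitRow are hypothetical names, and this is not the code from the repository: it assumes simple comma-separated data without quoting, and the use of std::runtime_error is my assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: splits one CSV row on commas (no quoting support).
std::vector<std::string> splitRow(const std::string &line)
{
    std::vector<std::string> cells;
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type comma = line.find(',', start);
        if (comma == std::string::npos) {
            cells.push_back(line.substr(start));
            return cells;
        }
        cells.push_back(line.substr(start, comma - start));
        start = comma + 1;
    }
}

// Hypothetical stand-in for CsvProcessor, written only for illustration.
class CsvProcessorSketch
{
public:
    explicit CsvProcessorSketch(std::istream &inputData) : input_(inputData) {}

    std::string replaceColVals(const std::string &colToOverwrite,
                               const std::string &replaceVal)
    {
        std::string line;
        if (!std::getline(input_, line) || line.empty())
            throw std::runtime_error("input data missing");

        // Locate the column to overwrite in the header row.
        const std::vector<std::string> header = splitRow(line);
        const auto found = std::find(header.begin(), header.end(), colToOverwrite);
        if (found == header.end())
            throw std::runtime_error("column name doesn't exist in the input data");
        const std::size_t colIndex = static_cast<std::size_t>(found - header.begin());

        // Copy the header unchanged, then rewrite one cell in every data row.
        std::string output = line + "\n";
        while (std::getline(input_, line)) {
            std::vector<std::string> cells = splitRow(line);
            if (cells.size() != header.size())
                throw std::runtime_error("input file is malformed");
            cells[colIndex] = replaceVal;
            for (std::size_t i = 0; i < cells.size(); ++i)
                output += cells[i] + (i + 1 < cells.size() ? "," : "\n");
        }
        return output;
    }

private:
    std::istream &input_;
};

// Usage: the input is an in-memory stream, so no file is needed.
void replaceColValsExample()
{
    std::istringstream inputData("City,Country\nParis,France\nRome,Italy\n");
    const std::string expectedOutput = "City,Country\nLondon,France\nLondon,Italy\n";
    assert(CsvProcessorSketch(inputData).replaceColVals("City", "London") == expectedOutput);
}
```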

If you care to see more implementation details, you are welcome to look at the .cpp files.

Hopefully you can understand the high-level view of how the program works at this point.

The makefile has options for compiling the source code (what I just described) and the test code. The test code has a different main function that is supplied by the Catch unit testing framework. As a result, it generates a different executable to be run: testColReplacer. This will look no different than compiling or running any other program. The difference will be in the output of the program.

All the tests passed!

Now that we’ve seen what to expect from our testing program, let’s explore the test code… and more importantly, how it can help us express what the source code is doing.

Clarifying intentions through unit tests

A simple test case

We start by defining the main function in testMain.cpp:

#define CATCH_CONFIG_MAIN
#include "catch.hpp"

As I said earlier, Catch supplies its own main function, and we use it in this application.

Easy enough! Now let’s look at an example test case.

TEST_CASE("CsvArgs puts command-line args into member variables")
{
   // argv, colToOverwrite, and colReplaceVal are defined outside this snippet
   int argc = 5;
   CsvArgs args(argc, argv);

   REQUIRE(args.getColToOverwrite() == std::string(colToOverwrite));
   REQUIRE(args.getColReplaceVal() == std::string(colReplaceVal));
}

Catch uses several macros that we get when we include its header file. A few that will interest us:

TEST_CASE: starts the code for a test case. It takes as input the name of the test case.
REQUIRE/REQUIRE_FALSE: Makes an assertion that must be true or false. This is the actual testing part.
REQUIRE_THROWS: Makes an assertion that some executable code throws an exception.
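To make the last macro concrete without depending on Catch, the sketch below hand-rolls the condition it checks. The helper throwsException is hypothetical and is not part of Catch; Catch’s own macros do more, such as reporting which assertion failed.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of Catch: returns true when calling
// `action` throws any exception, which is the condition REQUIRE_THROWS asserts.
template <typename Action>
bool throwsException(Action action)
{
    try {
        action();
    } catch (...) {
        return true;
    }
    return false;
}

// Usage: std::stoi throws std::invalid_argument for non-numeric input.
void throwsExceptionExample()
{
    assert(throwsException([] { (void)std::stoi("notANumber"); }));
    assert(!throwsException([] { (void)std::stoi("42"); }));
}
```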

Let’s now explore what the test case above is doing.

It defines a test case with a given name.
It creates a CsvArgs object.
It makes sure that two member variables match another string.

Given that code, it may or may not be obvious what is being tested. However, we can look at the test case name and immediately know what is being tested:

“CsvArgs puts command-line args into member variables”

Command-line args… that’s what is coming into the program when we run the source code. So it is putting those command-line arguments into CsvArgs’s member variables. Looking at the test code, I can see that argc and argv – the arguments from main – go directly into the CsvArgs constructor. We can then get those arguments back from CsvArgs.

Perfect! We now know how to write a test case. In addition, we see how the title of that test case can be extremely descriptive in what we are trying to do.

If the spec was lost

I now want you to imagine that this code is legacy code. We need to add a new feature to it. Unfortunately, we don’t have requirements for what the code is supposed to do. I wish I could say this was unusual, but I have run into this problem quite often. How do you know what the code is supposed to do? How do you go about changing it without breaking functionality when you don’t know what its purpose is?

A well written set of unit tests can solve this problem. For example, let’s say we don’t know any of the requirements for the expressive C++ coding challenge. Instead, we have a good set of unit tests. Let’s look at all of the titles of our test cases…

From testCsvProcessor.cpp
- Empty data should throw exception: ‘input data missing’
- Column not found should throw exception: ‘column name doesn’t exist in the input data’
- Different num columns (too few) in input data throws exception: ‘input file is malformed’
- Different num columns (too many) in input data throws exception: ‘input file is malformed’
- replaceColVals replaces all column values with a value
From testCsvArgs.cpp
- CsvArgs constructor throws exception when number of args is not four
- CsvArgs puts command-line args into member variables

If I knew nothing at all about this program… not a single thing, here are some pieces of information I get from those test case titles alone:

This program takes input data
It works with columns in that input data
It replaces all column values with a value.
It takes in command-line arguments and puts them into member variables (I would assume those member variables get used in the program).

If you have ever worked in legacy code before, you’ll know that this type of information is HUGE! I basically have a list of many if not all of the requirements just from the test case names alone! I also get an idea of what the program’s functionality is. This kind of information goes a very long way to describing what your C++ code does.

In addition, when you make changes to the existing code, you can have more confidence that you aren’t breaking something. If you insert a bug and the unit tests are well-written, you get the added benefit of catching those bugs before they go past the development phase of your project.

Writing Descriptive Test Case Definitions

In order to write really descriptive test cases, you need to write as though the person reading them knows nothing about the code, its purpose, or the requirements. Before we dig into a more detailed test case, let’s cover a few tips in order to write our test cases for this type of reader:

For the inputs to your function, name everything relative to how the test case is testing it (not how it is used in the program). To illustrate, here are some examples for the “replaceColVals” function (which replaces the columns in this example program):
- replaceColVals("badColHeader", "myval"): I use the column name of “badColHeader” rather than something like “City”. This indicates the intent of the test case… passing in a bad column header.
- std::istringstream inputData("col1,col2,col3\nval1,val2,val3\nthisRow,hasNoThirdCol"): This input data that will be passed to replaceColVals has a header row, a row of data, then another row of data. The last row, rather than saying “val1,val2” says “thisRow,hasNoThirdCol”. So that test case is testing for a row that has too few columns.
- std::istringstream inputData("col1,col2,col3\nval1,val2,val3\nval1,val2,val3,extraCol"): Similar to the above, this input data has an “extraCol”. Note the name, extraCol, rather than naming it “val4”.
For the output to your function, particularly the comparison part, make it as easy to read as possible. If the output is large (such as a long string), store it in a well-named variable rather than sticking it all on one line within the REQUIRE macro.
Make your test case functions small.
- Smaller test case definitions make it a lot easier to see the intent. If you have a whole lot of setup that is necessary, stick it in another well-named function that the test case calls. Keep the test case itself small.
- You may consider rewriting your source code functions if necessary so they don’t do as much. This usually makes the test cases smaller since not as much setup or input and output is required.
- You will notice that the example test cases in this program all have very small function bodies, which allows one to quickly understand their intent.
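These tips can be sketched without Catch, using plain assert in place of REQUIRE. The functions countCells and allRowsMatchHeaderWidth are hypothetical stand-ins for the code under test, not functions from the project. What matters is that each test body is small and that the input data names the defect being tested.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in: counts the cells in a row by counting commas
// (no quoting support).
std::size_t countCells(const std::string &row)
{
    return static_cast<std::size_t>(std::count(row.begin(), row.end(), ',')) + 1;
}

// Hypothetical stand-in for the code under test: true when every row
// has as many cells as the header row.
bool allRowsMatchHeaderWidth(std::istream &inputData)
{
    std::string header;
    if (!std::getline(inputData, header))
        return false;
    const std::size_t expectedCells = countCells(header);

    std::string row;
    while (std::getline(inputData, row))
        if (countCells(row) != expectedCells)
            return false;
    return true;
}

// The data itself says what is wrong with it: "hasNoThirdCol".
void rowWithTooFewColumnsIsRejected()
{
    std::istringstream inputData("col1,col2,col3\nval1,val2,val3\nthisRow,hasNoThirdCol");
    assert(!allRowsMatchHeaderWidth(inputData));
}

// Likewise "extraCol" points at the defect, where "val4" would not.
void rowWithTooManyColumnsIsRejected()
{
    std::istringstream inputData("col1,col2,col3\nval1,val2,val3\nval1,val2,val3,extraCol");
    assert(!allRowsMatchHeaderWidth(inputData));
}
```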

A More Detailed Test Case

Let’s look at one more of the test cases in detail – my favorite one of this set – that shows the core functionality of the whole program. It is the “replaceColVals replaces all column values with a value” test case.

TEST_CASE("replaceColVals replaces all column values with a value")
{
   std::istringstream inputData
   (
       "col1," "replaceCol," "col3\n"
       "val1," "val2,"       "val3\n"
       "val1," "val5,"       "val6\n"
   );
   std::string output = CsvProcessor(inputData).replaceColVals("replaceCol", "myval");
   std::string expected_output =
   (
       "col1," "replaceCol," "col3\n"
       "val1," "myval,"      "val3\n"
       "val1," "myval,"      "val6\n"
   );
   REQUIRE(output == expected_output);
}

You can see exactly what the input is. You then see that we replace the values in the “replaceCol” column with “myval”. We see the expected output has val2 and val5 replaced with “myval”. This is a very clear example of exactly what that function (the core functionality of the program) does. What better way to express what your code is doing? Not only that, but it also will always be up-to-date if you tie it into continuous integration. After every commit, that test could be automatically run. You could also set it up to notify you if the building or testing of that code fails.

There are more unit tests in the test folder that you can view if you are interested. These few examples hopefully have shown how unit tests can be written with very clear titles to help describe what the source code is doing. In addition, the bodies of these test cases contain examples of how the code is intended to be used.

You can do the same thing in your code projects to take advantage of the expressiveness unit tests can bring to your code. All it takes is a few well-formulated examples of how to use your code and well-defined test case names.

Want more information on how to get started with unit testing? Have questions or comments? I’d love to help or get your feedback!