Jonathan Boccara's blog

The Results of the Expressive C++17 Coding Challenge

Published October 23, 2017


The Expressive C++17 coding challenge has come to an end. It was open for three weeks for submissions of the clearest code using C++17.

It was a joint challenge between Bartek’s coding blog and Fluent C++, and its point was to learn collectively how to use C++17 to write clearer code.

We’ll see the winner and his solution in a moment, but frankly, if you submitted a working solution to the challenge at all, you can consider yourself among the winners. I know it sounds a little mushy, but each of the 11 solutions that we reviewed was at least 100 lines of code (going up to 500), with C++17 features thoughtfully crafted in. That takes time and effort! So a big thank you to all of you for participating, and we do hope that you had fun and learned stuff in the process.

Just as a reminder, here was the task proposed in the challenge.

The Challenge

The task was to write a command line tool that takes in a CSV file, overwrites all the data of a given column with a given value, and outputs the result into a new CSV file.

More specifically, this command line tool should accept the following arguments:

  • the filename of a CSV file,
  • the name of the column to overwrite in that file,
  • the string that will be used as a replacement for that column,
  • the filename where the output will be written.

For instance, if the CSV file had a column “City” with various values for the entries in the file, calling the tool with the name of the input file, City, London and the name of the output file would result in a copy of the initial file, but with all cities set to “London”.

Here is how the edge cases were to be handled:

  • if the input file is empty, the program should write “input file missing” to the console.
  • if the input file does not contain the specified column, the program should write “column name doesn’t exists in the input file” to the console.

In both cases, there shouldn’t be any output file generated.

And if the program succeeds but there is already a file having the name specified for output, the program should overwrite this file.

The goal of the challenge was twofold: to use as many C++17 features as possible (as long as they were useful to solve the case), and to write the clearest code possible with them.

The winner

Our winner is Fernando B. Giannasi, from Brazil! Congratulations Fernando!!

I’m not […] a professional programmer.

Maybe you’d expect Fernando to be a professional C++ developer. At least when we looked at his code, we thought he was. So we were really surprised when we reached out to him, since Fernando is in fact… a doctor! He is an intensivist, which means that he works in an ICU as an Emergency Physician.

Here is his story that led him to C++:

“I’m a Linux enthusiast since the 90’s, which in an almost natural way led me to be interested in programming.

I have a strong background on shellscript and Python, which I have also used for data analysis.

The first contact I had with (mostly) C and C++ was before college, about 15 years ago, and it did not suit my needs, since I often found myself struggling with awkward syntax and details/constraints from the language rather than the real problem I was trying to solve. So with Python I went some years after…

But few years ago I was working with Raspberry-Pi projects and I felt the lack of performance of my approach using Python and Bash scripts, and I decided to give C++ another try.

Man, what a different language!!

All the algorithms I liked were there on the STL… And the containers, the performance, RAII, everything feels so natural that I never turned back.”

A nice story, isn’t it?

His solution

Let’s get into the details of Fernando’s solution:

Here is the main() part:

try 
{
   if (argc != 5) { throw runtime_error("Bad arguments"); }

   auto [in_file, out_file] = get_file_handlers(argv[1], argv[4]);

   string_view new_value = argv[3];
   auto target_index = get_target_column(in_file, argv[2], ',');
   if (target_index) {
       do_work(in_file, out_file, *target_index, new_value, ',');
   }
   else {
       throw runtime_error("Column name doesn't exist in the input file");
   }
}
catch (const exception& e) {
   // Assumed handler, not shown in the excerpt: report the error on the console.
   cerr << e.what() << '\n';
   return 1;
}
  • The code reads its inputs from argv.
  • It opens the input and output files.
  • It finds the target column (the return value is an optional<int>).
  • If the column index was found, it calls the transform code that does all of the replacement.
  • A structured binding keeps the main code away from the file “streams”: we only see a function that takes in the program arguments and returns an in_file and an out_file.
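
The post doesn’t show get_file_handlers, so here is a minimal sketch of how such a helper could look. The name open_files, its error messages and its choice of std::pair are assumptions made for illustration, not Fernando’s actual code. The point is only that a function returning two streams can be unpacked with a structured binding at the call site.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper (not from the original solution): opens both files
// and returns them together so the caller can write
//     auto [in_file, out_file] = open_files(in_path, out_path);
[[nodiscard]] std::pair<std::ifstream, std::ofstream>
open_files(const std::string& in_path, const std::string& out_path)
{
    std::ifstream in{in_path};
    if (!in) { throw std::runtime_error("input file missing"); }

    // Opened only after the input check, so a missing input
    // does not leave an output file behind.
    std::ofstream out{out_path};   // truncates an existing file
    if (!out) { throw std::runtime_error("cannot open output file"); }

    return {std::move(in), std::move(out)};   // file streams are movable
}
```

Note that this sketch still opens the output file before the column lookup, so a missing column would leave an empty output file. A stricter version could delay opening the output until the column has been found.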

Let’s get into the get_target_column function:

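[[nodiscard]] optional<int> get_target_column(ifstream& input,
                                             const string_view& label,
                                             const char delimiter)
{
    // The excerpt omits how first_line is obtained; reading it here and
    // rewinding is an assumption based on the description below and on
    // do_work, which reads the header line again.
    string first_line;
    getline(input, first_line);
    input.seekg(0);

    auto tokens = split_string(first_line, delimiter);
   
    if (auto it = find(begin(tokens), end(tokens), label); // Init-statement for if/switch
        it == tokens.end()) {
       return {}; //return empty optional
    }
    else {
       return static_cast<int>(distance(begin(tokens), it));
    }
}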
  • It reads the first line from the input file and then splits that line into tokens (using a delimiter).
  • It returns the index of the matching token, or an empty optional if there is none, using the C++17 if statement with initializer.
  • [[nodiscard]] makes the compiler warn if the return value is ignored. See Bartek’s post for more on C++17 attributes.
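
To isolate the two C++17 features at play here, this is a small self-contained sketch of the same lookup on an in-memory header. The function name find_column and the use of std::size_t for the index are choices made for this example, not part of the original solution.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Illustrative only: returns the position of `label` in `header`,
// or an empty optional when the label is absent.
[[nodiscard]] std::optional<std::size_t>
find_column(const std::vector<std::string>& header, std::string_view label)
{
    // `it` is scoped to the if statement thanks to the init-statement.
    if (auto it = std::find(header.begin(), header.end(), label);
        it != header.end()) {
        return static_cast<std::size_t>(std::distance(header.begin(), it));
    }
    return std::nullopt;
}
```

A caller would test the result before dereferencing it, for instance with if (auto index = find_column(header, "City")) { use(*index); }, which mirrors the if (target_index) check in main().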

And here is the code that splits the string (the line):

[[nodiscard]] auto split_string(const string_view& input, const char delimiter) 
{
   // Copy into a string first: string_view::data() is not guaranteed
   // to be null-terminated.
   stringstream ss {string{input}};
   vector<string> result;
   
   for (string buffer; 
        getline(ss, buffer, delimiter);) 
           {result.push_back(move(buffer));}
   
   return result;
}
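
As a point of comparison, here is a sketch of a splitter that works directly on a string_view, without going through a stringstream. This is an alternative technique, not what Fernando’s solution does, and the name split_views is made up for the example. It returns views into the original line, so the line must outlive the result. Unlike the getline-based loop, it keeps a trailing empty field.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Illustrative alternative: splits `line` on `delimiter` without copying.
// The returned views point into `line`, which must stay alive.
[[nodiscard]] std::vector<std::string_view>
split_views(std::string_view line, char delimiter)
{
    std::vector<std::string_view> fields;
    std::size_t start = 0;
    while (true) {
        const auto pos = line.find(delimiter, start);
        if (pos == std::string_view::npos) {
            fields.push_back(line.substr(start));   // last (possibly empty) field
            break;
        }
        fields.push_back(line.substr(start, pos - start));
        start = pos + 1;
    }
    return fields;
}
```

The trade-off is the usual one with string_view: no allocations per field, but the caller has to reason about lifetimes, which is arguably less straightforward than returning a vector of strings.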

Finally, the core part of the transformation:

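string buffer;
   
getline(input, buffer); // for the header line
output << buffer << endl;

while (getline(input, buffer)) {
   auto tokens = split_string(buffer, delimiter);
   tokens[target_index] = new_value; // assign from the string_view itself, no null terminator needed
   
   for (auto& i: tokens) {
       output << i;
       // compare addresses rather than values: an earlier field may hold
       // the same text as the last one
       output << (&i == &tokens.back() ? '\n':delimiter);
   }
}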
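
To see the whole flow in one place, here is a condensed sketch that runs on any pair of streams, which makes it easy to try with in-memory data. It follows the same steps as the excerpts above, but the names replace_column and split_fields, the bool return value and the handling of short rows are assumptions made for this example, not the original code.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Illustrative helper: splits one line into owned fields.
std::vector<std::string> split_fields(const std::string& line, char delimiter)
{
    std::vector<std::string> fields;
    std::istringstream ss{line};
    for (std::string field; std::getline(ss, field, delimiter);) {
        fields.push_back(std::move(field));
    }
    return fields;
}

// Illustrative end-to-end version: copies `input` to `output`, setting the
// column named `label` to `new_value`. Returns false and writes nothing
// when the input is empty or the column is absent.
bool replace_column(std::istream& input, std::ostream& output,
                    std::string_view label, std::string_view new_value,
                    char delimiter = ',')
{
    std::string line;
    if (!std::getline(input, line)) { return false; }   // empty input

    const auto header = split_fields(line, delimiter);
    const auto it = std::find(header.begin(), header.end(), label);
    if (it == header.end()) { return false; }           // unknown column
    const auto index = static_cast<std::size_t>(it - header.begin());

    output << line << '\n';                             // header unchanged
    while (std::getline(input, line)) {
        auto fields = split_fields(line, delimiter);
        if (index < fields.size()) { fields[index] = new_value; }  // skip short rows
        for (std::size_t i = 0; i < fields.size(); ++i) {
            output << fields[i] << (i + 1 == fields.size() ? '\n' : delimiter);
        }
    }
    return true;
}
```

Passing std::istringstream and std::ostringstream objects lets you check the behaviour without touching the file system, and the same function accepts an ifstream and an ofstream unchanged.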

And that’s it. Here is the complete solution file if you want to play around with it.

I tried to write a solution that my wife or my son could easily understand…

Bartek and I chose Fernando as the winner because his code was so straightforward and easy to read, and because of how he used C++17 features to achieve that, as you can see above.

Other solutions

Of course there were plenty of other possible approaches to writing code that solved the case. In particular, we recommend that you also have a look at these solutions:

  • The solution of William Killian (previous winner of the Pi Day Challenge), who managed to fit in more C++17 features,
  • The solution of Simon, who solved the problem by creating a token_iterator and a line_iterator, which probably makes the code better suited to manipulating CSV in general. It wasn’t in the requirements of the challenge, but it looks like an extensible solution, and this is valuable.

Let’s keep learning

A big thank you to all those who submitted a solution, to those who considered doing it but didn’t have the time, and to all the people who encouraged us for this challenge! It was a great experience to review your submissions, and we learned a lot from you.

If you want to learn more about C++17, Bartek has crafted a series of blog posts covering a fair amount of new features and showing how they can be useful.

To conclude, let me share with you an observation of our winner, Fernando:

If C++ has the reputation for being too much “expert friendly”, it does not need to be that way, especially if you keep simple things simple.

Happy learning, happy coding.
