Jonathan Boccara's blog

Should structs Have Constructors in C++

Published June 15, 2018 - 10 Comments

Daily C++

I originally wrote this article for Morning Cup of Coding, a newsletter for software engineers to be up to date with and learn something new from all fields of programming. Curated by Pek and delivered every day, it is designed to be your morning reading list.

C++ structs are little bundles that pack a few pieces of data together:

struct MyStruct
{
    Data1 value1;
    Data2 value2;
    Data3 value3;
};

Would a struct benefit from having a constructor? Or are constructors not in the “spirit” of struct? Or would constructors even get in the way?

All those questions can be answered by Yes or by No, depending on what a given struct represents.

Before delving into the “why”, the “when”, the “how” and even the “what else”, let me be more specific about what I mean by a struct. Technically, a struct is like a class, so technically a struct would naturally benefit from having constructors and methods, like a class does.

But this is only “technically” speaking. In practice, the convention is that we use structs only to bundle data together, and a struct generally doesn’t have an interface with methods and everything. So technically, you can replace struct with class in all that follows, but this does not follow the convention of struct and class (which everyone should follow).

So if we consider a struct that only has data, like MyStruct above, in which cases would it benefit from having a constructor?

The advantage of NOT writing a constructor

If a structure is reduced to its bare minimum, with no constructor, no method, no inheritance, no private method or data, and no member initializer, so that it only defines public data members, then a special initialization feature of C++ kicks in: aggregate initialization.

An aggregate initializer is a set of data between braces that you can use to initialize the data members of a struct. For instance, with this struct:

struct Point
{
    int x;
    int y;
    int z;
};

We can initialize an instance with the following syntax:

Point p = {1, 2, 3};

This instance p then has its x equal to 1, its y equal to 2 and its z equal to 3.

Note that since C++11, we can also write it without the equal sign:

Point p {1, 2, 3};
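
As a minimal sketch, both syntaxes can be put side by side in a small self-contained snippet. The helper sumOfCoordinates is a hypothetical name that I introduce only to exercise the two forms:

```cpp
#include <cassert>

// Same Point as above: public data members only, no constructor.
struct Point
{
    int x;
    int y;
    int z;
};

// Hypothetical helper, only here to exercise both syntaxes.
inline int sumOfCoordinates()
{
    Point a = {1, 2, 3}; // aggregate initialization with the equal sign
    Point b {1, 2, 3};   // same thing without it (C++11 and later)

    // Both instances hold the same values, member by member.
    assert(a.x == b.x && a.y == b.y && a.z == b.z);

    return b.x + b.y + b.z;
}
```

The values are assigned to the members in their order of declaration in the struct, which is what the rest of this section relies on.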

This initialization is very concise. This means that, for the code to be expressive, a reader needs to be able to guess from the call site which value goes to which member, without having to look up the order of definition in the struct.

For the example of a Point, it makes sense, because the order of definition of x first, then y then z is pretty ubiquitous. But if you consider a structure that doesn’t have a natural order, such as this one:

struct CoinFlipResult
{
    int numberOfHeads;
    int numberOfTails;
};

The instantiation code could look like this:

CoinFlipResult result = {49, 51};

It’s not clear which value corresponds to which attribute. We could use strong types instead, to write something like this:

CoinFlipResult result = {NumberOfHeads(49), NumberOfTails(51)};

Which makes the code more explicit.

Now you may think that this debate has nothing to do with aggregate initialization, and that the question of strong typing would be just as relevant for a function:

void displayResult(NumberOfHeads numberOfHeads, NumberOfTails numberOfTails);

But there is something specific to the combination of strong types and aggregate initializers here: if you use strong types in them, then you have to use strong types in the members of the struct too. Indeed, in an aggregate initializer the data is used directly to build the members of the struct:

struct CoinFlipResult
{
    NumberOfHeads numberOfHeads;
    NumberOfTails numberOfTails;
};

I find this redundant here, because inside of the struct the name of the member identifies it with no ambiguity anyway.

By contrast, a function offers a level of indirection that can fetch the value inside of the strong type, if you wish to do so.
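
Here is a sketch of what that level of indirection could look like. The strong types are hand-rolled for the example (I am assuming a simple wrapper with a get() method, which is not necessarily how you would define them), and makeCoinFlipResult is a hypothetical function name:

```cpp
// Minimal hand-rolled strong types, assumed for this sketch only.
class NumberOfHeads
{
public:
    explicit NumberOfHeads(int value) : value_(value) {}
    int get() const { return value_; }
private:
    int value_;
};

class NumberOfTails
{
public:
    explicit NumberOfTails(int value) : value_(value) {}
    int get() const { return value_; }
private:
    int value_;
};

// The struct keeps plain ints: the member names already say which is which.
struct CoinFlipResult
{
    int numberOfHeads;
    int numberOfTails;
};

// Hypothetical building function: the strong types make the call site
// explicit, and the function unwraps them before filling the aggregate.
inline CoinFlipResult makeCoinFlipResult(NumberOfHeads heads, NumberOfTails tails)
{
    return {heads.get(), tails.get()};
}
```

The call site then reads makeCoinFlipResult(NumberOfHeads(49), NumberOfTails(51)), and swapping the two arguments by mistake no longer compiles.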

Custom initialization with a constructor

As soon as you put a constructor in a struct, you forgo aggregate initialization for it. Let’s see in which cases the constructor brings enough value to balance this disadvantage.

Member initializers

Strictly speaking, member initializers are not constructors, but they play a role that used to be filled by constructors before C++11: initializing members with default values:

struct Point
{
    int x = 0;
    int y = 0;
    int z = 0;
};

And in C++11, like “real” constructors, their presence (even if for only one attribute) deactivates aggregate initialization (it’s no longer the case in C++14, thanks to Alexandre Chassany and chris for pointing this out).

In exchange, they guarantee that data members are initialized (reading uninitialized data is undefined behaviour and can make the application crash), and they do so with a very concise and expressive syntax.

The C++ Core Guidelines recommend their usage in guideline C.45: “Don’t define a default constructor that only initializes data members; use in-class member initializers instead”.
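
To illustrate how member initializers and aggregate initialization combine from C++14 onwards, here is a small sketch. The two helper functions are hypothetical names used only for the illustration, and the second one is expected not to compile in C++11:

```cpp
// Point with default member initializers, as in the listing above.
struct Point
{
    int x = 0;
    int y = 0;
    int z = 0;
};

// Hypothetical helper: every member falls back to its default value.
inline Point makeDefaultPoint()
{
    return Point{};
}

// Hypothetical helper, C++14 and later only: Point is still an aggregate,
// so we can pass the first values and let the member initializer
// provide the remaining one (z here).
inline Point makePartialPoint()
{
    return Point{1, 2};
}
```

In this sketch no member can be left uninitialized, whichever way the struct is built.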

Construction from another object

One case that comes up often, I find, is when you need a small set of data coming from a larger API, or several ones combined. You don’t want to carry around those APIs in your code, and it’s nice to retrieve the bunch of data that you need from them and store it in a small struct that you carry around in a local part of the code.

One way to go about this is to implement a constructor that takes those bigger objects and fills the struct off them:

struct MyLocalData
{
    Data1 value1;
    Data2 value2;
    Data3 value3;

    MyLocalData(BigObject const& bigObject, LargeAPI const& largeAPI)
    : value1(getValue1(bigObject))
    , value2(getValue2(bigObject, largeAPI))
    , value3(getValue3(largeAPI))
    {}
};

The advantage of this approach is to make it very clear that this struct only represents a simpler representation of those larger objects, better adapted to your local code. We could also represent this as a class, by making the data private and accessing it with getters, but then we’d lose the semantics of “these are just pieces of data (value1, value2, and value3) put together”.

We could even go a step further and prevent any other way of filling that data, by making the members const:

struct MyLocalData
{
    const Data1 value1;
    const Data2 value2;
    const Data3 value3;

    MyLocalData(BigObject const& bigObject, LargeAPI const& largeAPI)
    : value1(getValue1(bigObject))
    , value2(getValue2(bigObject, largeAPI))
    , value3(getValue3(largeAPI))
    {}
};

Which also makes the data immutable, and if your code doesn’t need to change that data, immutability makes it easier to reason about.
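
One consequence of const members that is worth keeping in mind is that the struct can still be copied, but no longer assigned to. The sketch below checks this at compile time. BigObject and LargeAPI are reduced to hypothetical stand-ins with a single int each, and the members are plain ints instead of Data1 and Data2, just to keep the example self-contained:

```cpp
#include <type_traits>

// Hypothetical stand-ins for the larger objects.
struct BigObject { int id; };
struct LargeAPI { int version; };

struct MyLocalData
{
    const int value1;
    const int value2;

    MyLocalData(BigObject const& bigObject, LargeAPI const& largeAPI)
    : value1(bigObject.id)
    , value2(largeAPI.version)
    {}
};

// A const data member makes the implicit copy assignment operator deleted...
static_assert(!std::is_copy_assignable<MyLocalData>::value,
              "const members prevent assignment");

// ...but copy construction is still available.
static_assert(std::is_copy_constructible<MyLocalData>::value,
              "copying is still allowed");
```

Whether losing assignment is acceptable depends on how the struct is used, for example if you need to overwrite an existing instance or reorder instances inside a container.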

One issue with this design, though, is that it creates a dependency of MyLocalData on BigObject and LargeAPI, which don’t sound like the type of things you’d like to depend on, do they? A practical consequence is that it makes it harder to instantiate the struct in a test harness, for example.

Custom initialization without a constructor

To break this dependency we can tear out the constructor from the struct and replace it with a function:

struct MyLocalData
{
    Data1 value1;
    Data2 value2;
    Data3 value3;
};

MyLocalData makeMyLocalData(BigObject const& bigObject, LargeAPI const& largeAPI)
{
    // ...
}

But then we no longer have the semantics that MyLocalData is a sort of summary of the other larger objects.

The possible implementations of makeMyLocalData then range from a very terse aggregate initialization (note that here C++ allows omitting the name of the type when the object is built in the return statement):

MyLocalData makeMyLocalData(BigObject const& bigObject, LargeAPI const& largeAPI)
{
    return {getValue1(bigObject), getValue2(bigObject, largeAPI), getValue3(largeAPI)};
}

…to the very explicit good old member-by-member struct assignment:

MyLocalData makeMyLocalData(BigObject const& bigObject, LargeAPI const& largeAPI)
{
    auto myLocalData = MyLocalData{};

    myLocalData.value1 = getValue1(bigObject);
    myLocalData.value2 = getValue2(bigObject, largeAPI);
    myLocalData.value3 = getValue3(largeAPI);

    return myLocalData;
}
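
To make the benefit for testing concrete, here is a self-contained sketch of the terse version. Everything about BigObject, LargeAPI and the getValue functions is hypothetical (the article doesn't define them), and the members use std::string and int in place of Data1, Data2 and Data3:

```cpp
#include <string>

// Hypothetical stand-ins for the larger objects.
struct BigObject { std::string name; int size; };
struct LargeAPI { int version; };

// The struct is a plain aggregate again: it knows nothing about
// BigObject or LargeAPI.
struct MyLocalData
{
    std::string value1;
    int value2;
    int value3;
};

// Hypothetical extraction functions, assumed for the example.
inline std::string getValue1(BigObject const& bigObject) { return bigObject.name; }
inline int getValue2(BigObject const& bigObject, LargeAPI const& largeAPI)
{
    return bigObject.size * largeAPI.version;
}
inline int getValue3(LargeAPI const& largeAPI) { return largeAPI.version; }

// Only the building function depends on the larger objects.
inline MyLocalData makeMyLocalData(BigObject const& bigObject, LargeAPI const& largeAPI)
{
    return {getValue1(bigObject), getValue2(bigObject, largeAPI), getValue3(largeAPI)};
}
```

In a test harness, the struct can now be created directly with something like MyLocalData{"test", 1, 2}, without building a BigObject or a LargeAPI at all.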

structs and constructors: an “It’s complicated” relationship

Those are the trade-offs of putting a constructor into a struct, which should give you some means to weigh your options for a given situation.

To summarize the highlights of our analysis, I’d recommend that you:

  • go for aggregate initialization if the order of the members is obvious,
  • put the constructor inside of the struct if you’re building it off other objects, when the dependency doesn’t become a burden,
  • make an external building function otherwise.

What’s your opinion on this? Do you put constructors in your structs?
