In search of uninitialized class members

Andrey Karpov

We've already received several requests from our clients (including potential ones) to implement a diagnostic that could help find uninitialized class members. We were quite reluctant to do that, as we were aware of how difficult the task is, but finally we gave in. As a result, we've come up with the V730 diagnostic. I should say right away that it's not perfect, and I already foresee a number of letters complaining that something works incorrectly. That's why I've decided to write a note about the technical complexity of this task. I hope this information will answer the questions of PVS-Studio users and will be useful to our readers in general.

When thinking about searching for uninitialized class members, people usually imagine simple cases. Let's say there are 3 members in a class. We initialized two of them and forgot about the third. Something like this:

class Vector
{
public:
  int x, y, z;
  Vector() { x = 0; y = 0; }
};

If only everything were that simple and everybody used only classes like this. In reality, sometimes even a human being cannot tell whether the code contains an error. For the analyzer, the problem becomes altogether unsolvable. Let's have a look at some of the reasons why the analyzer can issue false positives or miss real errors.

First, I'd point out that class members can be initialized in various ways. It's hard to enumerate all of them. While you are looking at the unicorn, try to come up with as many ways of initializing a class member as you can. Done? Then let's continue.

Figure 1. The unicorn is trying to divine whether the class member is initialized or not.

Some simple ways of initialization:

  • Assign a value to a class member: A() { x = 1; }
  • Use an initialization list: A() : x(1) {}
  • Access the member through 'this': A(int x) { this->x = x; }
  • Access the member through "::": A(int x) { A::x = x; }
  • Use C++11 in-class initialization: class A { int x = 1; int y { 2 }; .... };
  • Initialize a field with a function such as memset(): A() { memset(&x, 0, sizeof(x)); }
  • Initialize all the class fields (oh yes, people sometimes do this) with memset(): A() { memset(this, 0, sizeof(*this)); }
  • Use constructor delegation (C++11): A() : A(10, 20) {}
  • Use a special initialization function: A() { Init(); }
  • Class members can initialize themselves: class A { std::string m_s; .... };
  • Class members can be static.
  • You can initialize a class by explicitly calling another constructor: A() { this->A(0); }
  • You can call another constructor using 'placement new' (programmers can be very inventive at times): A() { new (this) A(1,2); }
  • You can initialize members indirectly through a pointer: A() { int *p = &x; *p = 1; }
  • And through a reference: A() { int &r = x; r = 1; }
  • If members are classes, you can initialize them by calling their functions: A() { member.Init(1, 2); }
  • You can "gradually" initialize members that are structures: A() { m_point.x = 0; m_point.y = 1; }
  • There are plenty of other ways.
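
To make the list above a bit more tangible, here is a minimal sketch that combines several of these techniques in one class. The class and its member names are made up for illustration; each member is initialized in a different way, and an analyzer has to recognize every one of them to conclude that the constructor leaves nothing uninitialized.

```cpp
#include <cstring>
#include <string>

// Hypothetical class: every member is initialized in a different way.
class A
{
public:
  int a;          // assigned in the constructor body
  int b;          // set in the member initializer list
  int c = 3;      // C++11 default member initializer
  int d;          // zeroed with memset()
  int e;          // written through a pointer
  int f;          // written through a reference
  std::string s;  // initializes itself (empty string)

  A() : b(2)
  {
    a = 1;
    std::memset(&d, 0, sizeof(d));
    int *p = &e;
    *p = 5;
    int &r = f;
    r = 6;
  }
};
```

All seven members have well-defined values after construction, although only two of them are mentioned in a form that is trivial to detect.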

As you see, there are a great many ways to initialize class members. The analyzer has to take all of them into account and, on top of that, anticipate them!

And this list is far from complete.

The main difficulty lies in calls to initialization functions, which in turn call other functions, and this can go on indefinitely. Sometimes it's very hard to track the call graph, and at times it's simply impossible.
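
Here is a small sketch of such a chain. The class and function names are hypothetical. The constructor itself assigns nothing; to find out whether m_z is initialized, you have to follow Init() into the helpers it calls. In real code the chain may be much longer, may go through virtual functions, or may lead into another translation unit.

```cpp
// Hypothetical example: initialization is spread over a chain of calls.
class Point3D
{
  int m_x, m_y, m_z;

  void ResetXY() { m_x = 0; m_y = 0; }
  void ResetZ()  { m_z = 0; }

  // Init() does not touch the members directly, only through the helpers.
  void Init() { ResetXY(); ResetZ(); }

public:
  Point3D() { Init(); }

  int Sum() const { return m_x + m_y + m_z; }
};
```

If the ResetZ() call were removed from Init(), the constructor would look exactly the same, yet m_z would be left uninitialized.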

But even if you know about each and every way of initializing a class, it won't be enough. The absence of initialization isn't always an error. A classic example is the implementation of a container. You can come across code like this:

class MyVector
{
  size_t m_count;
  float *m_array;
public:
  MyVector() : m_count(0) { }
  ....
};

The m_array variable is not initialized, but it doesn't matter. At the start, the class stores 0 elements, so no memory is allocated for the array. That's why m_array is left uninitialized. It will be initialized later, when the container gets at least one element.

The code is correct, but the analyzer will issue a false positive, which will probably make the programmer sad. What can be done about it (about the false positives, not the programmer's sadness) is still not clear.

Probably, to be on the safe side, you should initialize m_array with nullptr. But programming style is a discussion that goes beyond the scope of a small article like this. In practice, it often doesn't matter much if not all class members are initialized in the constructor. The code can work quite correctly without initializing some of them. I gave a simplified example here; there are far more complicated cases.
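
For reference, the "safe side" variant could look roughly like this. It is only a sketch: the Size() and Data() accessors are added purely for illustration and are not part of the original class. With C++11 default member initializers, both fields get a defined value no matter which constructor runs.

```cpp
#include <cstddef>

// Sketch of the container with every member explicitly initialized.
class MyVector
{
  std::size_t m_count = 0;
  float *m_array = nullptr;  // no storage is allocated for an empty container

public:
  MyVector() = default;

  // Hypothetical accessors, added only to make the state observable.
  std::size_t Size() const { return m_count; }
  const float *Data() const { return m_array; }
};
```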

And now a couple of words about the duality of our world. Have a look at an abstract code fragment:

class X
{
  ....
  char x[n];
  X() { x[0] = 0; }
  ....
};

Is it an error that only 1 element is initialized in the X class? It's impossible to answer. Everything depends on what kind of class X is. And the analyzer can't figure this out; only a human being can.

If this is some string class, then there is no error.

class MyString
{
  ....
  char m_str[100];
  MyString() { m_str[0] = 0; }
  ....
};

We write a terminating null at the beginning of the string. By doing this, the programmer shows that the string is empty. All the other array elements can stay uninitialized, and the code is correct.

If this is a color class, then there is an error here.

class Color
{
  ....
  char m_rgba[4];
  Color() { m_rgba[0] = 0; }
  ....
};

Here only one array element is initialized, while all of them should have been. By the way, in this case the analyzer will consider the class fully initialized and won't issue a warning (a false negative). We have to make it keep silent here; otherwise it would generate too much noise.
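
The difference between the two classes can be sketched as follows. The Length() and Channel() accessors are hypothetical and exist only to make the state observable. For the string, strlen() stops at the very first byte, so the remaining 99 bytes are never read. For the color, all four channels are meaningful, so one possible fix is to value-initialize the whole array.

```cpp
#include <cstddef>
#include <cstring>

class MyString
{
  char m_str[100];

public:
  MyString() { m_str[0] = 0; }  // an empty string: only the first byte matters

  // strlen() stops at m_str[0], the rest of the array is not touched.
  std::size_t Length() const { return std::strlen(m_str); }
};

class Color
{
  char m_rgba[4] = {};  // all four elements are set to 0

public:
  Color() = default;

  int Channel(int i) const { return m_rgba[i]; }
};
```

The two constructors in the original fragments are textually almost identical, which is exactly why a tool cannot tell the correct one from the incorrect one.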

So, you see how ambiguous it is? It's very hard to tell where there is an error and where there isn't. We had to do a lot of empirical testing, trying to guess whether the code is correct or not. Of course, the analyzer will get it wrong sometimes, and we'd like to apologize for that in advance. But now I hope it has become clearer why it is so difficult to search for uninitialized class members, and why it is so important to be lenient with PVS-Studio.