Andrey Karpov

The further we go, the more exotic the errors become

When we were just starting to develop PVS-Studio, I could determine the cause of a false positive or an error in the analyzer almost instantly, and I could immediately isolate the subsystem responsible. But time goes by, and the system has matured. Then the inevitable happened: a user reported a bug in PVS-Studio, and for the first time ever, finding it took us not an hour or a day, but almost a week. This is sad, but unavoidable. The larger a software project becomes, the more complicated the interconnections it contains, and the harder it becomes to reproduce errors.

In PVS-Studio development, a significant share of the difficulty comes from the huge number of possible combinations of input data. What we see in our own or third-party code is one thing. What we can actually encounter in libraries, or what macro-heavy constructs can generate, is a totally different one.

Let me explain about macros. Their heavy use can generate code so unnatural that no developer would ever write it by hand. For example, we had a case where a macro produced a line 2 701 375 characters long in the preprocessed file. As we had never expected such a trick, one of our diagnostic rules assumed that it had run into an infinite loop and threw an exception. In fact, the error was inside the very mechanism that was supposed to prevent such errors from occurring :)
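The guard itself is not shown here, so the following is only a sketch of how a safeguard of this kind can misfire. It assumes a watchdog with a fixed step budget; the names countSemicolons, kFixedBudget and fixedBudgetMisfires, as well as the budget value, are made up for illustration and are not PVS-Studio code.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical scan loop with a watchdog. `budget` is the maximum number of
// steps allowed before the loop is declared "stuck".
std::size_t countSemicolons(const std::string& line, std::size_t budget) {
    std::size_t steps = 0;
    std::size_t count = 0;
    for (char c : line) {
        if (++steps > budget)
            throw std::runtime_error("watchdog: loop looks infinite");
        if (c == ';') ++count;
    }
    return count;
}

// Assumed value for illustration: a fixed cap that seems generous for
// hand-written code.
const std::size_t kFixedBudget = 1000000;

// Returns true if the fixed cap raises a false alarm for a line of `length`
// characters.
bool fixedBudgetMisfires(std::size_t length) {
    const std::string line(length, 'x');
    try {
        countSemicolons(line, kFixedBudget);
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}

void demoWatchdog() {
    assert(!fixedBudgetMisfires(200));     // an ordinary line stays under the cap
    assert(fixedBudgetMisfires(2701375));  // the macro-generated length trips it
    // A budget derived from the input length does not raise the false alarm.
    const std::string line(2701375, ';');
    assert(countSemicolons(line, line.size()) == line.size());
}
```

The loop in this sketch always terminates, so the exception on the long line is a false alarm caused purely by the cap being unrelated to the input size.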

But this time we faced a new and rare situation. Header files from the Qt library contain the following code:

inline QModelIndex QAbstractItemModel::createIndex(
  int arow, int acolumn, int aid) const
#pragma warning( push ) 
#pragma warning( disable : 4312 )
{ 
  return QModelIndex(arow, acolumn, 
                     reinterpret_cast<void*>(aid), this);
}

Please note that the two #pragma directives sit between the function's declaration and its body. This is allowed, since #pragma can appear anywhere, although in practice it is quite rare.

In June 2011, PVS-Studio was modified to process such code correctly, that is, without losing the function body. That was exactly when the error was introduced, the one we later spent several days hunting for.

The error itself is quite an ordinary one. Under certain conditions, a pointer is stored in the wrong variable, and the correct pointer remains null. Later, in another part of the program, the null pointer is used, with obvious consequences. So this is really just a common typo.

By the way, as you can see, I have enough courage to talk about my blunder openly. I wrote this code. For some reason, others quite often refrain from mentioning such situations. For instance, see my article "Myths about static analysis. The second myth - expert developers do not make silly mistakes". And here I am, frankly admitting it: I made a primitive and stupid mistake, and we had to debug it for several days. I am not perfect and I admit it. But if a static analyzer such as PVS-Studio can detect at least 25% of such errors, then that is just great! Unfortunately, in this particular case it was unable to uncover my cunning games with pointers. Nonetheless, it quite often helps us by pointing our noses at errors in freshly written code. I think it has already saved us a significant amount of time that would otherwise have been wasted on debugging.

This error stayed in the code for over a year before a user ran into it and informed us. Several conditions had to be met for it to reveal itself. The analyzer had to encounter a function containing #pragma directives, as in the example above. It had to be not just any function, but a member function of a class. And, most importantly, the file had to be marked as excluded from analysis.

In PVS-Studio, you can specify folders whose contents should not be analyzed. By default, this setting contains values such as "libpng", "libjpeg", etc. First, this suppresses unnecessary diagnostic warnings for the source code of third-party libraries. Second, if a *.h header file is located inside such an excluded folder, we can skip the bodies of inline functions altogether, which speeds up the analysis a bit.
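As a rough illustration of such a folder filter, here is a minimal sketch. It is not the analyzer's real implementation: the function name isExcludedPath and the matching rule (a directory component must equal an excluded name exactly) are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns true if any directory component of `path`
// equals one of the excluded folder names. Both '/' and '\' are accepted as
// separators. The last component is treated as a file name and is not matched.
bool isExcludedPath(const std::string& path,
                    const std::vector<std::string>& excluded) {
    std::size_t start = 0;
    for (;;) {
        const std::size_t end = path.find_first_of("/\\", start);
        if (end == std::string::npos)
            return false;  // only the file name is left
        const std::string component = path.substr(start, end - start);
        for (const std::string& name : excluded) {
            // An empty name would match the empty component in "a//b".
            assert(!name.empty() && "excluded folder names must be non-empty");
            if (component == name) return true;
        }
        start = end + 1;
    }
}

void demoExclusions() {
    const std::vector<std::string> excluded = {"libpng", "libjpeg"};
    assert(isExcludedPath("third_party/libpng/png.h", excluded));
    assert(isExcludedPath("C:\\src\\libjpeg\\jpeglib.h", excluded));
    assert(!isExcludedPath("src/libpng_wrapper/io.h", excluded));  // partial name
    assert(!isExcludedPath("src/libpng", excluded));  // last component is a file
}
```

With a check like this, whether inline function bodies in a header get skipped depends only on where the header lives, which is why the same file can behave differently on two machines with different exclusion settings.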

And this is where our troubles came from. The analyzer decided to skip the body of a function, but encountered #pragma instead. In theory, this situation should have been handled correctly. But the typo resulted in a null pointer.

Of course, now it all looks clear and simple. But back then it was quite hard to reproduce. The thing is, we could not reproduce the error right away because we had not added the folder containing this file to the exclusions. Nevertheless, I think most developers understand how something like that can happen...

Conclusions for myself

In the future, I will try to think more carefully about tests for newly written code. There actually were tests that verified the function-skipping mechanism. There were also tests verifying the processing of #pragma directives between a function's declaration and its body. But there was no combined test for the case where both situations occur together. Because there was no such test, the issue went unnoticed for more than a year. And, almost exactly according to McConnell, the time it took us to resolve the issue grew 20-fold (see this table). Had this test been created right away, the error would have been found almost immediately as well.
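The kind of combined test that was missing can be sketched on a toy model. Everything below is hypothetical and far simpler than a real analyzer: handleFunction, makeInput and runCombinedTests are invented names, and a "function" is modeled as a signature line, optional preprocessor lines, and a brace-delimited body. The point is only that the test loops over both conditions at once instead of checking each one separately.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Result {
    bool bodyFound = false;  // was the opening brace located?
    int linesAnalyzed = 0;   // body lines inspected (0 when the body is skipped)
    std::size_t next = 0;    // index of the first line after the body
};

bool isDirective(const std::string& line) {
    const std::size_t i = line.find_first_not_of(" \t");
    return i != std::string::npos && line[i] == '#';
}

// Handles the function whose signature is lines[0].
// skipBody models a header located in an excluded folder.
Result handleFunction(const std::vector<std::string>& lines, bool skipBody) {
    Result r;
    std::size_t i = 1;
    while (i < lines.size() && isDirective(lines[i])) ++i;  // step over #pragma
    if (i >= lines.size() || lines[i].find('{') == std::string::npos) return r;
    r.bodyFound = true;
    int depth = 0;
    for (; i < lines.size(); ++i) {
        for (char c : lines[i]) {
            if (c == '{') ++depth;
            if (c == '}') --depth;
        }
        if (!skipBody) ++r.linesAnalyzed;
        if (depth == 0) { r.next = i + 1; break; }
    }
    return r;
}

// Builds the toy input with or without #pragma lines before the body.
std::vector<std::string> makeInput(bool withPragma) {
    std::vector<std::string> lines = {"inline int Widget::id(int a) const"};
    if (withPragma) {
        lines.push_back("#pragma warning( push )");
        lines.push_back("#pragma warning( disable : 4312 )");
    }
    lines.push_back("{");
    lines.push_back("  return a;");
    lines.push_back("}");
    lines.push_back("int after;");
    return lines;
}

// Checks all four combinations of the two conditions.
bool runCombinedTests() {
    for (bool withPragma : {false, true}) {
        for (bool skipBody : {false, true}) {
            const std::vector<std::string> lines = makeInput(withPragma);
            const Result r = handleFunction(lines, skipBody);
            if (!r.bodyFound) return false;
            if (r.next >= lines.size() || lines[r.next] != "int after;")
                return false;
            if (r.linesAnalyzed != (skipBody ? 0 : 3)) return false;
        }
    }
    return true;
}

void demoCombinedTests() { assert(runCombinedTests()); }
```

Two separate tests would cover only three of the four cells in this matrix. The fourth one, #pragma present and body skipped, is exactly the combination that went untested for a year.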