Andrey Karpov

How Do Programs Run with All Those Bugs At All?

Our team analyzes lots of open-source projects to demonstrate the diagnostic capabilities of the PVS-Studio analyzer. After reading our articles, people will often ask: "How does the program run with all those bugs at all?" In this article, I will try to answer this question.

Introduction

To begin with, let me say a few words to those readers who are not yet familiar with our tool. We develop the PVS-Studio analyzer, which is designed to find bugs in C/C++ source code. The best way to demonstrate its capabilities is to analyze open-source projects and find whatever bugs we can. All the issues we find are gathered in a special database. When we find errors that we consider interesting, we discuss them in articles. For more articles, check our up-to-date articles list.

So, after reading our articles, some readers wonder how programs can work at all with all those bugs in them. I don't think programmers are really surprised by this, since every one of them has at some point had to fix a bug that had happily lived in the code for a couple of years. So this article is meant mostly for readers who are not closely connected with the programming world and for novice programmers who are only just finding their way into it. Nevertheless, I believe experienced programmers may find some useful and interesting ideas here too.

The most important thing

Different code fragments are executed with a different frequency. Some are executed at every run, others just every now and then, and there are also fragments executed on extremely rare occasions.

The more rarely a code fragment is executed, the easier it is for a bug to hide in it. I'll try to explain this with a number of examples.

If a bug causes a failure when drawing a large button in the program's main window, it will be spotted right away and probably fixed by the programmer themselves before it gets to the testing department.

If a bug depends on input data, it has a slightly easier time hiding. Suppose a programmer is developing a graphics editor. He or she has tested the program on images with resolutions of 100x100 and 300x400, and although the code is poorly written, it still runs smoothly. Fortunately, the company has testers, who have discovered that the program can't work with stretched-out images with a resolution of 100x10000. As you can see, this bug has survived a bit longer.

Even more difficult to find are the bugs that require some special conditions to occur. Getting back to our graphics editor, suppose it is perfect at handling medium- and large-sized images, but it can't work with 1x1 images, which must be processed in a special way, and neither the programmer nor the testers have thought of checking this mode. The result is a program crash when some user accidentally creates a 1-pixel-sized image. This time, the bug has made it through to the end user.
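Here is a minimal sketch of how such a special case can slip through. Everything in it is hypothetical: the imaginary editor, the function names sampleStepBuggy() and sampleStepFixed(), and the idea that the editor spreads 'span' units evenly over the pixels of a row. It only illustrates the kind of code that behaves well on every image a programmer is likely to try, yet breaks on a width of 1.

```cpp
#include <cassert>

// Hypothetical helper of an imaginary graphics editor: the distance between
// neighbouring sample points when 'span' units are spread evenly over
// 'width' pixels. Correct for every width except 1.
int sampleStepBuggy(int span, int width)
{
    return span / (width - 1); // width == 1 divides by zero: undefined behavior
}

// Same helper with the 1-pixel case handled explicitly.
int sampleStepFixed(int span, int width)
{
    if (width <= 1)
        return 0; // a single pixel has no neighbours, so there is no step
    return span / (width - 1);
}

// Usage example: on "ordinary" widths both variants agree, which is why
// tests on 100-pixel or 301-pixel images would never reveal the bug.
void compareSampleSteps()
{
    assert(sampleStepBuggy(990, 100) == sampleStepFixed(990, 100));
    assert(sampleStepBuggy(1200, 301) == sampleStepFixed(1200, 301));
    assert(sampleStepFixed(990, 1) == 0); // only the fixed variant is safe here
}
```

The buggy variant is deliberately never called with a width of 1 in the usage example, because integer division by zero is undefined behavior and typically crashes the program.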

But there are bugs which are even harder to detect. You can find them in code branches responsible for handling various non-standard situations - for example, a failure to save a file. A bug like that can live inside the program for years, unknown to anyone. It shows up only when the hard disk runs out of space: instead of warning about this and offering to save the file to a different location as intended, the program just crashes because of that bug, and the user loses some important data.

Answering the question

Most bugs we detect when analyzing open-source projects with PVS-Studio don't affect the normal everyday work of the program for its thousands of users. These bugs sit in code fragments which execute very rarely, so hardly anybody ever has to deal with them.

And it can't be any other way. Take the Qt library, for instance. This library is thoroughly debugged and tested and is used by a vast number of developers in their work. It simply cannot contain bugs that lie on the surface. That's why PVS-Studio only finds bugs which rarely reveal themselves. As an example, take a look at the following function:

QV4::ReturnedValue
QQuickJSContext2DPrototype::method_getImageData(....)
{
  ....
  qreal x = ctx->callData->args[0].toNumber();
  qreal y = ctx->callData->args[1].toNumber();
  qreal w = ctx->callData->args[2].toNumber();
  qreal h = ctx->callData->args[3].toNumber();
  if (!qIsFinite(x) || !qIsFinite(y) ||
      !qIsFinite(w) || !qIsFinite(w))
  ....
}

It contains an error - the 'h' variable is not checked. Instead, the 'w' variable is checked twice. Is it a bug? Yes, it is. But it is very unlikely to show up soon.

First, not every application that uses the Qt library calls the method_getImageData() function. I believe hardly anyone uses it at all, actually. Second, the error will emerge only if the last argument, 'h', is converted to a value that is not finite (the other arguments are checked properly). Third, for that to happen, this last argument must be passed incorrectly in the first place. Therefore, this bug is very unlikely to ever show up.
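To make this reasoning concrete, here is a small self-contained sketch. It is not Qt code: rectIsFiniteBuggy() and rectIsFiniteFixed() are hypothetical stand-ins for the check, and std::isfinite from the standard library is used in place of qIsFinite. The sketch shows that the typo changes the result only when 'h' alone is not finite.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical stand-ins for the argument check in method_getImageData().
// Each returns true when all four values are considered finite.

// Same typo as in the fragment above: 'w' is tested twice, 'h' never.
bool rectIsFiniteBuggy(double x, double y, double w, double h)
{
    (void)h; // 'h' is deliberately unused, exactly as in the typo
    return std::isfinite(x) && std::isfinite(y) &&
           std::isfinite(w) && std::isfinite(w);
}

// Intended check: every argument is tested once.
bool rectIsFiniteFixed(double x, double y, double w, double h)
{
    return std::isfinite(x) && std::isfinite(y) &&
           std::isfinite(w) && std::isfinite(h);
}

// Usage example: the two variants disagree only on a non-finite 'h'.
void compareFiniteChecks()
{
    const double notANumber = std::numeric_limits<double>::quiet_NaN();
    assert(rectIsFiniteBuggy(0, 0, 10, 10) == rectIsFiniteFixed(0, 0, 10, 10));
    assert(rectIsFiniteBuggy(0, 0, notANumber, 10) ==
           rectIsFiniteFixed(0, 0, notANumber, 10));
    assert(rectIsFiniteBuggy(0, 0, 10, notANumber) !=
           rectIsFiniteFixed(0, 0, 10, notANumber));
}
```

For every valid rectangle, and for every invalid one whose problem is not in 'h', the buggy check gives the right answer. That is the whole reason a bug like this can stay unnoticed for so long.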

This is why it had survived in the Qt library's code for a long time before it was caught by PVS-Studio. By the way, if you want to learn more about that check, here's the article "Checking the Qt 5 Framework".

Now let's sum it up.

When developing and maintaining programs, developers use a number of methods to test them (unit tests, regression tests, manual tests, and so on). Obvious errors are quickly eliminated, as both the developers themselves and the product's users discover them very soon.

That's why when we run the PVS-Studio analyzer on some well-known and reliable project such as Chromium, we can only hope to find bugs that almost never reveal themselves.

I mean, bugs are still there - plenty of them (see checks N1, N2, N3, N4). But you are very unlikely to run into them when working in Chromium. It will take an effort, probably a great one, to get to the code branch where the bug lives.

No need to use PVS-Studio then?

Having read this far, one may rush to draw an incorrect conclusion: "There's no need to use PVS-Studio since it detects insignificant bugs that almost never show up".

Usually when I feel that someone is about to draw this conclusion, I refer them to the article "Leo Tolstoy and static code analysis". But this time, I will try to put the answer once again into different words.

You see, our articles and single-time checks of open-source projects have very little to do with the normal use of the static code analysis methodology. Single-time checks are good for advertising the product, but no more than that. They are of very little use, really. You can only benefit from static analysis when using it regularly.

The worst thing you can do is run your code through an analyzer shortly before the release. It's just pointless. By that time, a great many of the bugs the analyzer could have caught for you have already been found and fixed at the cost of sweat and blood. It's not an efficient way of using the tool (here's, for example, a post about how you can waste 50 hours of working time). After a number of sleepless nights spent working in the debugger and exchanging emails with testers, developers run the analyzer just to get a couple of useful messages. What's the use of it when they have already spent huge amounts of time and effort to find and fix all the most horrible bugs themselves? Isn't it an epic fail?

I'm not exaggerating, mind you. Many developers don't really understand how to use code analyzers properly. We receive plenty of emails reading something like this: "A cool thing. Gonna use it before releasing". That's sad. This is why I'm continuously striving to bring light into the dark land of bugs and programmers unwilling to think.

I'm not claiming the analyzer can find all the possible bugs. It can only find some of them. But it will do it at once - as soon as the programmer has finished writing a new block of code.

The idea behind static analysis is that many errors and typos can be caught at the earliest development stage. By using it regularly, you can greatly reduce the time you would otherwise spend on seeking and eliminating defects in your code.

PVS-Studio is needed!

To convince you once and for all, let me cite one real-life example. Recently we were analyzing and fixing bugs in the Unreal Engine game engine. Most of the bugs were non-critical, which is natural because otherwise no one would use Unreal Engine to develop their software.

At some point, we got the analyzer's output down to zero messages. But the developers naturally continued to modify the code, so the analyzer started to output new warnings on fresh code. We could see literally in real time how new bugs were getting into the code. They would probably have been eliminated in a few days, but why go to all that trouble when the analyzer can catch them right away? Here's one of those bugs:

static void GetArrayOfSpeakers(....)
{
  Speakers.Reset();
  uint32 ChanCount = 0;
  // Build a flag field of the speaker outputs of this device
  for (uint32 SpeakerTypeIndex = 0;
       SpeakerTypeIndex < ESpeaker::SPEAKER_TYPE_COUNT,    // <=
       ChanCount < NumChannels; ++SpeakerTypeIndex)
  {
    ....
  }
  check(ChanCount == NumChannels);
}

Instead of the && operator, the programmer accidentally wrote a comma. Since the comma operator discards the result of its left operand, the check against SPEAKER_TYPE_COUNT never affects the loop. You can't call this bug insignificant. It got into the version control system and, I'm sure, would have caused a lot of trouble had PVS-Studio not been on the alert and caught it.
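The effect of the comma is easy to reproduce in isolation. The following sketch is not Unreal Engine code: kSpeakerTypeCount is a hypothetical stand-in for ESpeaker::SPEAKER_TYPE_COUNT, and speakerPresent() encodes an assumption made only for this demo (every even-numbered speaker type is present). Each function returns the number of iterations its loop performs. Note that some compilers warn that the left operand of the comma has no effect.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for ESpeaker::SPEAKER_TYPE_COUNT.
constexpr std::uint32_t kSpeakerTypeCount = 4;

// Assumption for this sketch only: every even-numbered speaker type is
// present. It is defined for any index, so the demo itself stays safe.
bool speakerPresent(std::uint32_t index) { return index % 2 == 0; }

// Loop condition written with a comma, as in the fragment above.
std::uint32_t iterationsWithComma(std::uint32_t numChannels)
{
    std::uint32_t chanCount = 0;
    std::uint32_t index = 0;
    // The comma operator discards 'index < kSpeakerTypeCount';
    // only 'chanCount < numChannels' decides whether the loop continues.
    for (; index < kSpeakerTypeCount, chanCount < numChannels; ++index)
        if (speakerPresent(index))
            ++chanCount;
    return index;
}

// Loop condition written with &&, as presumably intended.
std::uint32_t iterationsWithAnd(std::uint32_t numChannels)
{
    std::uint32_t chanCount = 0;
    std::uint32_t index = 0;
    for (; index < kSpeakerTypeCount && chanCount < numChannels; ++index)
        if (speakerPresent(index))
            ++chanCount;
    return index;
}

// Usage example: the variants differ only when the requested channels
// cannot all be found among the known speaker types.
void compareLoopConditions()
{
    assert(iterationsWithComma(2) == iterationsWithAnd(2));
    assert(iterationsWithAnd(3) == kSpeakerTypeCount);
    assert(iterationsWithComma(3) > kSpeakerTypeCount);
}
```

With two channels both loops stop at the same point. With three channels the && version stops at the end of the speaker type range, while the comma version keeps going past it, which in real code would mean indexing beyond the valid speaker types.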

To find out more about our experience of working with the Epic Games company, please see the article "How the PVS-Studio Team Improved Unreal Engine's Code".

Conclusion

I suggest that you go and try our PVS-Studio code analyzer on your project right now. You can download it from here. The tool's interface is pretty simple, but I recommend that you check the article "PVS-Studio for Visual C++" for useful tips on how to use the analyzer. For example, few users know how to easily and quickly exclude third-party libraries from analysis.

If you use a makefile or your own build system to build your projects, you can use the PVS-Studio Standalone application to analyze them. This tool uses a mechanism for monitoring compiler launches.

If you are scared of the huge number of diagnostic messages output at the first runs, try our new message marking mode, which makes use of a special database. To put it briefly, the idea is that all the existing messages are marked as irrelevant and are not displayed in the message output window; you will only see messages on fresh code. I'm sure you will learn very soon how convenient and useful static code analysis is when used regularly. To learn more, see the article "Integrating Static Analysis into a Project".

Good luck!