Jul 07 2021

Static code analyzer issues no warnings (false-negative results)


High-Level Errors
No Such Diagnostic in the Analyzer
Implementation Drawbacks

Tools that perform static code analysis may not detect errors for the following main reasons:

The error is a high-level one that cannot be detected at the source code level.
The analyzer's developers do not know about an error pattern or have not yet implemented a search for it.
The analyzer's algorithms have flaws or limitations and miss some errors.

Let's look into these reasons in a little more detail. Interested to learn how static code analyzers work in general? Be sure to check out the article: "Technologies used in the PVS-Studio code analyzer for finding bugs and potential vulnerabilities".

High-Level Errors

Let's say a technical requirement states that a specific value must be calculated by the formula "cos(x) / 2". While implementing the algorithm, the developer made a typo and wrote sin instead of cos in the program code:

y = sin(x) / 2;

An error like this cannot be found from the code alone. The expression is perfectly valid, and only knowledge of what the program is supposed to compute reveals the mistake. Static analysis is of no help in this case.
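A minimal sketch of why the requirement has to live outside the code for this mistake to become visible. The function names required_value, implemented_value, and meets_requirement are hypothetical and introduced only for illustration. Both functions are equally valid C++, so only a check that encodes the requirement itself can tell them apart:

```cpp
#include <cmath>

// Hypothetical requirement taken from the specification: y = cos(x) / 2.
double required_value(double x) {
    return std::cos(x) / 2;
}

// Implementation containing the typo from the text: sin instead of cos.
// Nothing in this function looks suspicious on its own.
double implemented_value(double x) {
    return std::sin(x) / 2;
}

// A check that encodes the requirement. This knowledge comes from the
// specification, not from the source code being analyzed.
bool meets_requirement(double x, double tolerance = 1e-9) {
    return std::fabs(implemented_value(x) - required_value(x)) <= tolerance;
}
```

Calling meets_requirement(0.0) returns false, because the implementation yields 0 where the requirement expects 0.5. A test written against the specification catches what a tool looking only at the source cannot.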

No Such Diagnostic in the Analyzer

There are countless ways to make a mistake. Still, the situation is not as desperate as it seems. Programmers' mistakes typically fit patterns that can be noticed and described. Each such pattern serves as the basis for a diagnostic that detects many errors of the same kind. Examples of well-known patterns include null pointer dereference, buffer overflow, and resource leak.
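To illustrate what a recognizable pattern looks like, here is a small sketch of one common form of null pointer dereference: a pointer is used first and compared with nullptr only afterwards. The Buffer type and both function names are hypothetical. This is an illustration of the pattern, not a description of how any particular analyzer implements its check:

```cpp
#include <cstddef>

// Hypothetical type used only for this illustration.
struct Buffer {
    std::size_t size;
};

// The pattern: the pointer is dereferenced before it is checked.
// If buffer is null, the dereference on the first line is already
// undefined behavior, so the check below comes too late.
std::size_t size_checked_too_late(const Buffer* buffer) {
    std::size_t size = buffer->size;
    if (buffer == nullptr) {
        return 0;
    }
    return size;
}

// The corrected order: check first, dereference afterwards.
std::size_t size_checked_first(const Buffer* buffer) {
    if (buffer == nullptr) {
        return 0;
    }
    return buffer->size;
}
```

The first function works for every valid pointer, which is why such a mistake can go unnoticed in testing. The order of the two operations is what makes the code suspicious, and that order can be recognized without running the program.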

There will always be exotic bugs that no existing code analyzer can uncover. There's no need to worry about this. After all, analyzer developers should focus primarily on identifying common error patterns and only then learn to find slightly less frequent ones. Thus, developing a static analyzer is an endless process of approaching an unattainable ideal. Some analyzers have advanced further, some less.

The PVS-Studio team draws inspiration to implement new diagnostics from the following sources:

Our own experience.
Books, articles, talks.
User feedback. By the way, if you have an idea for a new diagnostic, we'll be happy if you share it with us.
Chats on websites, like Stack Overflow.
And others. Interesting example.

Machine learning is one way to avoid writing diagnostic rules by hand. The main idea is to teach the analyzer to find errors on its own by training it on a large body of open source code. Our team, however, is quite skeptical about this approach. We set out our view in the article "Machine Learning in Static Analysis of Program Source Code".

Implementation Drawbacks

The implementation of a diagnostic may miss special cases. It is always possible to write code in a way that hides an error from the analyzer.

Besides, diagnostics are tied to other analyzer mechanisms. These mechanisms, such as data flow analysis, have limitations too. For example, we're constantly improving data flow analysis in PVS-Studio, but it's an endless process. What's more, some cases are algorithmically intractable.
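The following sketch shows the kind of situation where data flow analysis has less to work with. All names are hypothetical. Whether a given analyzer reports the unchecked access depends on how far its data flow analysis reaches, so treat this as an assumption-laden illustration and not as a statement about any specific tool:

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical helper: the index comes from text known only at run time.
std::size_t parse_index(const std::string& text) {
    return static_cast<std::size_t>(std::stoul(text));
}

// The index is a constant, so its value is known at the point of access.
int read_constant_index() {
    std::array<int, 4> table{10, 20, 30, 40};
    return table[3];
}

// The index depends on external input. The access is out of bounds
// whenever the parsed number is 4 or greater, but the set of possible
// values cannot be derived from this code alone.
int read_unchecked(const std::string& text) {
    std::array<int, 4> table{10, 20, 30, 40};
    std::size_t index = parse_index(text);
    return table[index];
}

// An explicit range check makes the access safe and also gives
// the analysis a known bound for the index.
int read_checked(const std::string& text, int fallback) {
    std::array<int, 4> table{10, 20, 30, 40};
    std::size_t index = parse_index(text);
    return index < table.size() ? table[index] : fallback;
}
```

For well-formed numeric input, read_unchecked("2") and read_checked("2", -1) return the same element, while read_checked("9", -1) falls back to -1 instead of reading past the end of the array.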
