Feelings confirmed by numbers

30 Aug 2012

For a long time I have been bothered by articles on the Internet whose authors try to judge the usefulness of static code analyzers based on the analysis of small projects.

In many of the articles I've read, the authors assume a linear dependence: if static analysis detects 2 errors in a project of N lines, then it will detect only 200 errors in a project of N*100 lines. From this they conclude that static analysis is certainly good but not great: it finds too few errors, and it's better to develop other methods of bug detection.
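The linear reasoning described above can be written down as a short calculation. The sketch below only illustrates that extrapolation; the function name `linear_estimate` and the sample value N = 1000 are assumptions made for the example and do not come from the articles in question.

```python
def linear_estimate(errors_found: int, small_lines: int, large_lines: int) -> float:
    """Extrapolate an error count linearly from a small project to a large one.

    This is the naive model: it assumes the error density is the same
    at every project size.
    """
    return errors_found * large_lines / small_lines


# Hypothetical N = 1000 lines; the large project has N * 100 lines.
n = 1000
print(linear_estimate(2, n, n * 100))  # 200.0, the "200 errors" figure
```

The whole model rests on one hidden assumption: that errors per line stay constant as the project grows.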

There are two reasons why people try code analyzers on small projects. First, a large project is not that easy to check: you have to set some options, define certain data where necessary, exclude some libraries from analysis, and so on. One naturally feels reluctant to do all this. You want to check something quickly, not bother with the settings. Second, a large project will produce a huge number of diagnostic messages. Again, nobody wants to spend much time reviewing them. It's much easier to take a smaller project for analysis.

Consequently, people leave the large project they are currently working on untouched and take something small instead, for example an old term project or a small open-source project from GitHub.

They check it and resort to linear extrapolation to estimate how many errors could be found in their large project. Then they write an article about this research.

At first sight, such studies look right and useful. But I was sure they weren't.

The first defect of all these studies is obvious. People forget that they are taking an already fine-tuned version of a project that works well. Many of the errors that static analysis could have found were already hunted down slowly and painfully, during testing or after users' complaints. That is, people forget that static analysis is a tool to be used regularly, not occasionally. Programmers study the warnings they get from their compiler every time they build, not once a year, don't they?

The second defect of these studies is a bit more complicated and interesting. I had a clear feeling that small projects and large projects cannot be evaluated in the same way. Suppose a student has spent 5 days writing a good term project of 1000 lines of code. I'm sure he or she won't be able to write a good commercial application of 100 000 lines of code in 500 days. The growing complexity will slow the work down. As an application gets larger, it becomes harder to add new functionality to it, and you need more time to test it and deal with the errors that appear.

So, I had that feeling but didn't know how to put it into words. Then one of our co-workers unexpectedly helped me. While studying the book "Code Complete" by Steve McConnell, he noticed an interesting table there which had completely slipped my memory. This table puts everything in its place!

Of course, it is incorrect to estimate the number of errors in large projects when you deal with small ones! They have different error densities!

The larger a project, the more errors it contains per 1000 code lines. Look at this wonderful table:


Table 1. Project size and typical error density. The book refers to the following sources: "Program Quality and Programmer Productivity" (Jones, 1977), "Estimating Software Costs" (Jones, 1998).

To make the figures clearer, let's draw a diagram.


Diagram 1. Typical error density in a project. Blue indicates the maximum number of errors; red - the medium number of errors; green - the minimum number of errors.

Now that you can study the diagram, you see that the dependency is not linear, don't you? The larger a project, the more opportunities you have to make a mistake in the code.
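To see how much a size-dependent error density changes the estimate, it can be compared with the linear extrapolation. The following is a minimal sketch: the two density values and the project sizes are invented placeholders chosen only to show the effect, not figures from Table 1, and the function name `estimated_errors` is hypothetical.

```python
def estimated_errors(lines: int, errors_per_kloc: float) -> float:
    """Return the expected number of errors for a given size and density."""
    return lines / 1000 * errors_per_kloc


# Placeholder densities (errors per 1000 lines); NOT taken from Table 1.
SMALL_DENSITY = 1.0
LARGE_DENSITY = 4.0

# Placeholder size of the large project, in lines of code.
large_lines = 200_000

# Linear extrapolation: reuse the small project's density for the large one.
linear = estimated_errors(large_lines, SMALL_DENSITY)
# Size-aware estimate: use the density assumed for the large project.
size_aware = estimated_errors(large_lines, LARGE_DENSITY)

print(f"linear: {linear:.0f}, size-aware: {size_aware:.0f}")
print(f"underestimate factor: {size_aware / linear:.1f}x")
```

With these made-up inputs the linear model underestimates by exactly the ratio of the two densities, so the error of the extrapolation grows as the gap in density between small and large projects widens.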

Of course, static analyzers cannot catch all errors. But an analyzer becomes more effective as the project grows. And to make it even more effective, you should use it regularly.

By the way, you may not find any errors at all in a small project. Or there will be just a couple of them. Conclusions you may draw in such a case can be absolutely wrong. That's why I strongly recommend that you try different error detection tools on real working projects.

Yes, it is a harder task, but you will get a proper view of the tool's capabilities. For instance, as one of PVS-Studio's authors, I promise you that we try to help everyone who contacts us. If you have any trouble while trying PVS-Studio, please write to us. Many issues can be solved by configuring the tool properly.


I invite you to follow me on Twitter: @Code_Analysis. I regularly post links to interesting articles on the following subjects there: C/C++, static code analysis, optimization and other interesting subjects related to programming.
