Statistical code analysis

Jun 06 2024

When someone mentions "statistical code analysis" in comments or in conversation, they have probably misspoken and mean "static code analysis." However, statistical analysis does exist: it is one of the error detection methods that static code analyzers apply.

The analyzer gathers statistics about certain code artifacts and uses them to detect anomalies. Let's see what this might look like in real cases.

C and C++ programmers sometimes forget that numeric literals beginning with 0 are octal. A programmer may accidentally (or to align the code) write 0020 instead of 20, overlooking that 0020 is an octal constant equal to 16 in decimal.

To prevent such errors, some coding standards forbid octal constants. Examples are MISRA C:2012 (the MISRA-C-7.1 rule) and MISRA C++:2008 (the MISRA-CPP-2.13.2 rule). If you follow the MISRA standard, the PVS-Studio analyzer can detect octal constants with the V2501 diagnostic rule.

However, this rule is too strict for everyday programming practice. Using an octal constant is not an error in itself, and sometimes it's handy. After all, octal constants didn't appear in the language for no reason.

So it would be a bad idea for the analyzer to warn about every octal constant in the code. On the other hand, developers do sometimes make mistakes when using these constants. That's where statistics can help.

If a developer uses a single octal constant in a code block, it might be an error. If there are multiple octal constants, they are probably intentional. This lets the analyzer limit its warnings to the constants that are most likely to be mistakes.
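The "single octal constant in a block" idea can be sketched as follows. This is only an illustration of the principle, not PVS-Studio's actual implementation: the function names `isOctalLiteral` and `findSuspiciousOctal` are hypothetical, and the sketch assumes that the literal spellings of one code block have already been collected into a list.

```cpp
#include <algorithm>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// True for spellings such as "0721": a leading zero followed only by
// octal digits. A lone "0", hex ("0x1F"), and binary ("0b101") literals
// are not counted. Suffixes such as "u" are not handled in this sketch.
bool isOctalLiteral(std::string_view text) {
  if (text.size() < 2 || text[0] != '0') return false;
  return std::all_of(text.begin() + 1, text.end(),
                     [](char c) { return c >= '0' && c <= '7'; });
}

// Hypothetical heuristic: report an octal literal only when it is the
// single octal literal among all literals of the block.
std::optional<std::string> findSuspiciousOctal(
    const std::vector<std::string>& literals) {
  std::optional<std::string> candidate;
  for (const std::string& literal : literals) {
    if (!isOctalLiteral(literal)) continue;
    if (candidate) return std::nullopt;  // second octal: likely intentional
    candidate = literal;
  }
  return candidate;
}
```

For the literals `2125`, `7154`, `0721`, `10000` the sketch reports `0721`; for a block of file modes such as `0644`, `0755`, `0600` it reports nothing.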

These are the principles behind the V536 diagnostic rule in PVS-Studio. Here's an example of an error from the Chromium (C++) project:

// Coefficients used to convert from RGB to monochrome.
const uint32 kRedCoefficient = 2125;
const uint32 kGreenCoefficient = 7154;
const uint32 kBlueCoefficient = 0721;
const uint32 kColorCoefficientDenominator = 10000;

Pay attention to the 0721 constant (465 in decimal). The leading zero is clearly redundant: the red, green, and blue coefficients should add up to 10,000. Because an octal number is used by accident, the value of the blue coefficient is incorrect.
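The arithmetic can be checked at compile time. In the sketch below the constant names are shortened for illustration, and `kBlueIntended = 721` is an assumption derived from the requirement that the coefficients sum to the denominator.

```cpp
#include <cstdint>

constexpr std::uint32_t kRed = 2125;
constexpr std::uint32_t kGreen = 7154;
constexpr std::uint32_t kBlueAsWritten = 0721;  // octal: 7*64 + 2*8 + 1
constexpr std::uint32_t kBlueIntended = 721;    // assumed intended value
constexpr std::uint32_t kDenominator = 10000;

// The octal spelling yields 465, so the coefficients no longer sum to 10000.
static_assert(kBlueAsWritten == 465, "0721 is an octal literal");
static_assert(kRed + kGreen + kBlueAsWritten == 9744, "sum falls short");
static_assert(kRed + kGreen + kBlueIntended == kDenominator, "sum matches");
```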

Another diagnostic rule that uses the statistical method is V1071. It warns when the return value of a function is ignored at some call site even though the result is used in most other calls to that function.

Here's a synthetic example (C++):

int foo();
....
auto res = foo();
....
if (foo() == 42) { .... }
....
while (foo() != 42) { .... }
....
return foo();
....
foo();
....

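Here, the result of the foo function is used four times and ignored once. If the result is used in more than 90% of all calls, the analyzer issues a warning for each call where the result is not used. The snippet is abbreviated, so the five calls shown only illustrate the pattern: on their own they give 80%, which is below the threshold.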
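The 90% threshold can be sketched as a simple check over per-function call counts. This is an illustration under stated assumptions, not the analyzer's real code: the `CallStats` structure and the `shouldWarnAboutIgnoredResult` function are hypothetical names, and the counts are assumed to have been gathered beforehand.

```cpp
#include <cstddef>

// Hypothetical call statistics for one function across a code base.
struct CallStats {
  std::size_t resultUsed = 0;     // calls whose return value is used
  std::size_t resultIgnored = 0;  // calls whose return value is discarded
};

// True when at least one call ignores the result while the result is
// used in more than 90% of all calls. Integer arithmetic avoids
// floating-point rounding at the threshold.
bool shouldWarnAboutIgnoredResult(const CallStats& stats) {
  const std::size_t total = stats.resultUsed + stats.resultIgnored;
  if (total == 0 || stats.resultIgnored == 0) return false;
  return stats.resultUsed * 10 > total * 9;
}
```

With four used and one ignored call the share is 80%, so this sketch stays silent. With 19 used and one ignored call (95%) it reports the ignored call.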

The analyzer finds it strange that the result is used pretty much everywhere, but in some cases, it is not.
