When someone mentions "statistical code analysis" in comments or in conversation, they have probably misspoken and mean "static code analysis." Statistical analysis does exist, though: it is one of the error detection methods that static code analyzers apply.
The analyzer gathers statistics about certain code artifacts and uses them to detect anomalies. Let's see what this looks like in real cases.
C and C++ programmers sometimes forget that numeric literals beginning with 0 are octal. A programmer may accidentally (or to align the code) write 0020 instead of 20, without considering that 0020 is an octal constant equal to 16 in decimal notation.
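The claim is easy to check with the compiler. Below is a minimal sketch; the table `kAlignedTable` and the function `tableSum` are hypothetical names made up for illustration and assume a programmer who padded numbers with zeros to align a column.

```cpp
// A leading zero turns a decimal-looking literal into an octal one.
static_assert(0020 == 16, "0020 is octal: 2 * 8 + 0 = 16");
static_assert(0020 != 20, "the leading zero changes the value");

// Hypothetical table where zeros were added only to align the column.
constexpr int kAlignedTable[] = {
    1000,
    0020,  // intended 20, but the value is 16
    0300,  // intended 300, but the value is 192 (3 * 64)
};

// Sums the table; the result differs from the intended 1000 + 20 + 300.
constexpr int tableSum() {
    return kAlignedTable[0] + kAlignedTable[1] + kAlignedTable[2];
}
```

The code compiles without any diagnostics from the language itself, which is why such mistakes tend to go unnoticed.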
To prevent such errors, some coding standards forbid octal constants altogether. Examples include MISRA C:2012 (the MISRA-C-7.1 rule) and MISRA C++:2008 (the MISRA-CPP-2.13.2 rule). If you follow a MISRA standard, the PVS-Studio analyzer can detect octal constants with the V2501 diagnostic rule.
However, this rule is too strict for everyday programming practice. Using an octal constant is not an error in itself, and sometimes it's handy. After all, octal constants were not added to the language for no reason.
So it would be a bad idea for the analyzer to warn about every octal constant in the code. On the other hand, developers do sometimes make mistakes when using these constants. That's where statistics can help.
If a developer uses a single octal constant in a code block, it might be an error. If there are multiple octal constants, they are probably intentional. This lets the analyzer warn only about the constants that are likely to be mistakes.
This is the principle behind the V536 diagnostic rule in PVS-Studio. Here's an example of an error from the Chromium (C++) project:
// Coefficients used to convert from RGB to monochrome.
const uint32 kRedCoefficient = 2125;
const uint32 kGreenCoefficient = 7154;
const uint32 kBlueCoefficient = 0721;
const uint32 kColorCoefficientDenominator = 10000;
Pay attention to the 0721 constant (465 in decimal notation). The leading zero is clearly redundant: the red, green, and blue coefficients should add up to 10,000. Because an octal number was used by accident, the blue coefficient is wrong, and the sum is 9,744.
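The "single octal constant in a block" idea can be sketched in a few lines. This is not how V536 is implemented; the functions `isOctalLiteral` and `suspiciousOctals` are hypothetical, and the sketch assumes the literals of one code block have already been extracted as strings without suffixes or digit separators.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// The arithmetic behind the Chromium example.
static_assert(0721 == 465, "7 * 64 + 2 * 8 + 1 = 465");
static_assert(2125 + 7154 + 0721 == 9744, "the sum misses 10000");

// True if the spelling is an octal literal: a leading '0' followed by
// one or more octal digits. Plain "0" and hex "0x.." are excluded.
bool isOctalLiteral(const std::string& text) {
    if (text.size() < 2 || text[0] != '0') return false;
    for (std::size_t i = 1; i < text.size(); ++i) {
        if (text[i] < '0' || text[i] > '7') return false;
    }
    return true;
}

// Returns indices of literals that look like accidental octal constants.
// Assumed heuristic: an octal literal is suspicious only when it is the
// single octal literal among the literals of the block.
std::vector<std::size_t> suspiciousOctals(
        const std::vector<std::string>& literals) {
    std::vector<std::size_t> octalIndices;
    for (std::size_t i = 0; i < literals.size(); ++i) {
        if (isOctalLiteral(literals[i])) octalIndices.push_back(i);
    }
    if (octalIndices.size() == 1) return octalIndices;
    return {};
}
```

For the literals `{"2125", "7154", "0721", "10000"}` the sketch reports index 2, while a block of file permissions such as `{"0644", "0755", "0777"}` produces no reports, because several octal constants together look intentional.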
Another diagnostic rule where PVS-Studio uses the statistical method is V1071. It warns that the return value of a function is ignored at some call, even though in most other calls the result is used in some way.
Here's a synthetic example (C++):
int foo();
....
auto res = foo();
....
if (foo() == 42) { .... }
....
while (foo() != 42) { .... }
....
return foo();
....
foo();
....
Here, the result of the foo function is used four times and ignored once. The example is shortened for readability: the analyzer issues a warning for the call that discards the result only if the result is used in more than 90% of all calls.
The analyzer finds it strange that the result is used pretty much everywhere, but in some cases, it is not.
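The threshold check itself is simple arithmetic. Here is a minimal sketch, assuming the analyzer has already counted the calls; the function `shouldWarnAboutDiscardedResult` and its parameters are hypothetical, and only the "more than 90%" threshold comes from the text above.

```cpp
#include <cstddef>

// Decides whether calls that discard the result look anomalous.
// totalCalls: all calls to the function found in the code base.
// usedCalls:  calls whose return value is consumed in some way.
bool shouldWarnAboutDiscardedResult(std::size_t totalCalls,
                                    std::size_t usedCalls) {
    // Nothing to report if there are no calls or no result is discarded.
    if (totalCalls == 0 || usedCalls >= totalCalls) return false;
    // Integer form of usedCalls / totalCalls > 0.9, avoiding floating point.
    return usedCalls * 10 > totalCalls * 9;
}
```

With 19 of 20 calls using the result (95%), the sketch returns true. With only the five calls shown in the synthetic example (4 of 5, or 80%), it returns false, so a real code base would need more calls that use the result before the warning appears.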