
Andrey Karpov

How PVS-Studio prevents rash code changes, example N5

The PVS-Studio static analyzer includes a symbolic execution mechanism. Today we have a great opportunity to demonstrate how this feature helps find errors.

Our system regularly monitors the Blender project and emails me a daily report about potential errors in new or changed code. I don't write a note for each error the system detects: that many notes would flood our blog. Today's case, however, is different.

The PVS-Studio static analyzer uses many technologies to find bugs and potential vulnerabilities.

Symbolic execution enables the analyzer to evaluate expressions even when the values of the variables involved are unknown. Sounds mysterious, doesn't it? Don't worry: below we'll examine a practical example, and everything will become clear. Let's take a look at this commit in the Blender project.

The analyzer reports a problem on line 868:

memset(&path->ptr[i], 0, sizeof(path->ptr[i]) * (path->len - i));

The analyzer finds it suspicious that the memset function does not fill the memory:

[CWE-628] V575: The 'memset' function processes '0' elements. Inspect the third argument.

Let's figure out how the analyzer came to this conclusion.

The analyzer does not know which numeric values can be stored in the path->len variable. However, the analyzer can work with this variable in another way, which I'll explain a bit later.

There's a bit more information about the i variable.

for (int i = 0; i < path->len; i++) {
  ....
  if (i != 0) {
    ....
    memset(&path->ptr[i], 0, sizeof(path->ptr[i]) * (path->len - i));

From the code above, the analyzer can get the following information:

  • The i variable is less than path->len. This data comes from loop analysis.
  • The i variable is greater than 0. The analyzer concludes this because the loop initializes i to 0 and then increments it, and the code checks that i is not zero before reaching the memset call.

Consequently, the possible values of the i variable lie within the range from 1 to path->len - 1.
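The two facts above can be checked empirically with a small sketch. The Range type and the observe_inner_range function are hypothetical helpers written for this illustration, not part of Blender or PVS-Studio. The function walks a loop of the same shape and records which values of i reach the inner branch.

```c
#include <assert.h>

/* Hypothetical helper type: the smallest and largest i seen inside
 * the `if (i != 0)` branch, plus how many times the branch ran. */
typedef struct Range {
    int min;
    int max;
    int hits;
} Range;

/* Walks the same loop shape as the Blender fragment, with a plain
 * `len` parameter standing in for path->len. */
Range observe_inner_range(int len)
{
    Range r = { 0, 0, 0 };
    for (int i = 0; i < len; i++) {
        if (i != 0) {
            /* The two facts the analyzer derives: i > 0 and i < len. */
            assert(i >= 1 && i <= len - 1);
            if (r.hits == 0) {
                r.min = i;
            }
            r.max = i;
            r.hits++;
        }
    }
    return r;
}
```

For a length of 5, the branch is reached for i from 1 to 4, which matches the range described above.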

However, this information is still insufficient to make any conclusions. That's when the symbolic execution mechanism comes to the rescue.

The analyzer sees that, before the memset function call, the path->len variable value changes in the following way:

path->len = i;
if (i != 0) {
  memset(&path->ptr[i], 0, sizeof(path->ptr[i]) * (path->len - i));

After this assignment, the value of path->len equals i. Symbolic execution lets the analyzer use this relationship to evaluate expressions without knowing the ranges of possible variable values. When working with such expressions, the analyzer makes a substitution:

sizeof(path->ptr[i]) * (i - i)

And gets zero as the function's third argument:

sizeof(path->ptr[i]) * 0
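The substitution step can be sketched in a few lines of C. This is only a toy model under stated assumptions, not how PVS-Studio is implemented: the LinExpr type and the lin_* functions are hypothetical names invented for this illustration. Expressions are stored as linear combinations of the symbols i and len, and an assignment such as path->len = i is modeled by replacing one symbol with another expression.

```c
#include <assert.h>
#include <stdbool.h>

/* Symbols whose numeric values are unknown to the toy "analyzer". */
enum { SYM_I, SYM_LEN, SYM_COUNT };

/* Linear expression: constant + sum of coef[s] * symbol s. */
typedef struct LinExpr {
    long constant;
    long coef[SYM_COUNT];
} LinExpr;

/* Expression consisting of a single symbol, e.g. `i`. */
LinExpr lin_symbol(int sym)
{
    LinExpr e = { 0, { 0 } };
    assert(sym >= 0 && sym < SYM_COUNT);
    e.coef[sym] = 1;
    return e;
}

/* Symbolic subtraction: a - b. */
LinExpr lin_sub(LinExpr a, LinExpr b)
{
    a.constant -= b.constant;
    for (int s = 0; s < SYM_COUNT; s++) {
        a.coef[s] -= b.coef[s];
    }
    return a;
}

/* Models the assignment `sym = value`: every occurrence of sym
 * in e is replaced with the expression value. */
LinExpr lin_substitute(LinExpr e, int sym, LinExpr value)
{
    long k = e.coef[sym];
    e.coef[sym] = 0;
    e.constant += k * value.constant;
    for (int s = 0; s < SYM_COUNT; s++) {
        e.coef[s] += k * value.coef[s];
    }
    return e;
}

/* True if no symbols remain; the constant is then stored in *out. */
bool lin_is_constant(LinExpr e, long *out)
{
    for (int s = 0; s < SYM_COUNT; s++) {
        if (e.coef[s] != 0) {
            return false;
        }
    }
    *out = e.constant;
    return true;
}
```

Building len - i with lin_sub and then applying lin_substitute for SYM_LEN with lin_symbol(SYM_I) leaves an expression with no symbols and a constant of 0, even though neither i nor len was ever given a number.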

This is obviously an anomaly, and PVS-Studio reports this problem to the developers. What we see here is some kind of error that someone made when editing the code. It's pretty cool that developers who use a static analysis tool can notice such issues quickly and fix them right then and there.
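To see the consequence at run time, here is a minimal reproduction of the pattern. The Item and Path types are hypothetical stand-ins, not Blender's real structures. The truncate_fixed function shows one possible way to avoid the problem (remembering the old length before overwriting it); it is an assumption made for illustration and not necessarily the fix applied in Blender.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the real Blender structures. */
typedef struct Item {
    int id;
    void *data;
} Item;

typedef struct Path {
    Item ptr[8];
    int len;
} Path;

/* Mirrors the reported pattern: len is overwritten first, so
 * (path->len - i) is always 0. Returns the number of bytes cleared. */
size_t truncate_buggy(Path *path, int i)
{
    assert(i >= 0 && i <= path->len);
    path->len = i;
    if (i != 0) {
        size_t bytes = sizeof(path->ptr[i]) * (size_t)(path->len - i);
        memset(&path->ptr[i], 0, bytes); /* processes 0 bytes */
        return bytes;
    }
    return 0;
}

/* One possible fix (an assumption): keep the old length so the
 * tail [i, old_len) is really cleared. */
size_t truncate_fixed(Path *path, int i)
{
    assert(i >= 0 && i <= path->len);
    const int old_len = path->len;
    path->len = i;
    if (i != 0) {
        size_t bytes = sizeof(path->ptr[i]) * (size_t)(old_len - i);
        memset(&path->ptr[i], 0, bytes);
        return bytes;
    }
    return 0;
}
```

With a path of length 5 truncated at i equal to 2, truncate_buggy clears nothing and leaves stale items in the tail, while truncate_fixed zeroes the three items that were cut off.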

Note. Since this article shows only a small code fragment, the path->len = i assignment may seem very strange: it would mean that the loop always ends after the first iteration. However, in the project, the fragment we're discussing is placed under conditions, so the code makes sense. Here you can examine the loop's entire code.
