Pattern-based analysis

Jul 29 2024

This method refers to the detection of pattern-based errors. For example, when the variable is assigned to itself:

acx->window_size = acx->window_size;

Although these errors are obvious and the diagnostic rules that detect them are usually simple, a static analyzer is still needed to find them. The example above is not synthetic: the PVS-Studio analyzer found this bug in the Linux kernel. Here is the full fragment:

int wl12xx_acx_config_hangover(struct wl1271 *wl)
{
  ....
  acx->recover_time = cpu_to_le32(conf->recover_time);
  acx->hangover_period = conf->hangover_period;
  acx->dynamic_mode = conf->dynamic_mode;
  acx->early_termination_mode = conf->early_termination_mode;
  acx->max_period = conf->max_period;
  acx->min_period = conf->min_period;
  acx->increase_delta = conf->increase_delta;
  acx->decrease_delta = conf->decrease_delta;
  acx->quiet_time = conf->quiet_time;
  acx->increase_time = conf->increase_time;
  acx->window_size = acx->window_size;         // <=
  ....
}

The developer didn't notice the error while monotonously writing similar lines of code. Such errors also slip through code review because people quickly lose focus. Tireless static code analysis comes to the rescue!

Such simple errors aren't as rare as they may seem, and can lie in the code of well-known projects. You can see some other examples in our bug collection.

As mentioned above, pattern-based analysis is fairly simple. However, it is rarely used on its own: diagnostic rules usually need additional information, for example, about types, which helps reduce the number of false positives.

Regular expressions can detect some errors, but the analysis quality will be poor. There will be many false negatives (real errors that go undetected) and many false positives (warnings about code that contains no error).
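To make this concrete, here is a minimal sketch of a text-only check for the self-assignment pattern. The function name looksLikeSelfAssignment and the regular expression are our own illustration, not a PVS-Studio rule. The sketch assumes one statement per line and simple names joined by the -> or . operators.

```cpp
#include <regex>
#include <string>

// Hypothetical text-only check (an illustration, not a real analyzer rule).
// It flags "X = X;" when both sides are the same character sequence.
// It knows nothing about scopes, types, comments, or equivalent spellings.
bool looksLikeSelfAssignment(const std::string& line) {
    // Group 1: an identifier optionally followed by ->member or .member parts.
    // \1 then requires the identical text on the right-hand side.
    static const std::regex pattern(
        R"(((?:[A-Za-z_]\w*)(?:(?:->|\.)[A-Za-z_]\w*)*)\s*=\s*\1\s*;)");
    return std::regex_search(line, pattern);
}
```

The check does find the kernel bug shown earlier. However, it also flags the harmless statement this->size = size; because the match can start in the middle of the left-hand side, and it flags a statement inside a comment. At the same time, it misses p->n = (*p).n; where the same object is spelled in two different ways.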

Consider the search for repeated conditions in if-else-if chains:

if (A < B) { .... }
else if (B > A) { .... }

The second condition is always false when it is evaluated because B > A is equivalent to A < B, so the else-if branch is unreachable. This is a classic typo pattern that also occurs in real applications.

If we use regular expressions to search for repeated conditions, the task becomes very challenging because we have to consider the many forms this error pattern may take:

if (A < B) ....
else if (A < B) ....

if (A & B == 0) ....
else if (0 == B & A) ....

if (A < B) ....
else if (x == y) .... // mid-checks
else if (A < B) ....

As a result, only a small fraction of such typos is detected.
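For comparison, here is a rough sketch of how the same search could look on a syntax tree. All names in it (Expr, leaf, binary, canonicalKey, firstRepeatedCondition) are hypothetical, and the sketch assumes built-in operators and conditions without side effects. Each condition is reduced to a canonical key: > and >= are rewritten as < and <= with swapped operands, and the operands of commutative operators are sorted.

```cpp
#include <cstddef>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal expression node: a leaf (name or literal)
// or a binary operator with two operands.
struct Expr {
    std::string text;                // operator token or leaf text
    std::shared_ptr<Expr> lhs, rhs;  // both null for a leaf
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr leaf(std::string text) {
    return std::make_shared<Expr>(Expr{std::move(text), nullptr, nullptr});
}

ExprPtr binary(std::string op, ExprPtr l, ExprPtr r) {
    return std::make_shared<Expr>(Expr{std::move(op), std::move(l), std::move(r)});
}

// Builds a key that is identical for equivalent spellings of a condition.
// Assumption: operators are built-in, so "B > A" means "A < B".
std::string canonicalKey(const ExprPtr& e) {
    if (!e->lhs) return e->text;  // leaf
    std::string op = e->text;
    std::string l = canonicalKey(e->lhs);
    std::string r = canonicalKey(e->rhs);
    if (op == ">")  { op = "<";  std::swap(l, r); }
    if (op == ">=") { op = "<="; std::swap(l, r); }
    const bool commutative = (op == "==" || op == "!=" || op == "&" || op == "|");
    if (commutative && r < l) std::swap(l, r);  // fixed operand order
    return "(" + l + " " + op + " " + r + ")";
}

// Returns the index of the first condition in an if-else-if chain
// that repeats an earlier one, or -1 if there is no repeat.
int firstRepeatedCondition(const std::vector<ExprPtr>& chain) {
    std::set<std::string> seen;
    for (std::size_t i = 0; i < chain.size(); ++i) {
        if (!seen.insert(canonicalKey(chain[i])).second) return static_cast<int>(i);
    }
    return -1;
}
```

With this representation, the three variants above are handled by the same few lines: the operand order, the spacing, and the intermediate checks no longer matter because the comparison is performed on the structure of the conditions, not on their text.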

It's even more complicated than that. The static analyzer should take extra data into account. For example, it should check that operators aren't overloaded and that the condition doesn't change a variable:

if ((A = get()) < B) ....
else if ((A = get()) < B) .... // no need to issue a warning
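A simplified sketch of such a check is given below. It is only an illustration under our own assumptions: the node kinds and the names make, sameTree, mayChangeState, and shouldWarnRepeated are hypothetical, and every assignment or function call is conservatively treated as something that may change the program state.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node kinds for a tiny expression tree.
enum class Kind { Name, Literal, Binary, Assign, Call };

struct Node {
    Kind kind;
    std::string text;                         // identifier, literal, operator, or callee
    std::vector<std::shared_ptr<Node>> kids;  // operands or call arguments
};
using NodePtr = std::shared_ptr<Node>;

NodePtr make(Kind kind, std::string text, std::vector<NodePtr> kids = {}) {
    return std::make_shared<Node>(Node{kind, std::move(text), std::move(kids)});
}

// Structural comparison of two subtrees.
bool sameTree(const NodePtr& a, const NodePtr& b) {
    if (a->kind != b->kind || a->text != b->text || a->kids.size() != b->kids.size())
        return false;
    for (std::size_t i = 0; i < a->kids.size(); ++i)
        if (!sameTree(a->kids[i], b->kids[i])) return false;
    return true;
}

// Conservative assumption: any assignment or call may change state,
// so evaluating the same expression twice may give different results.
bool mayChangeState(const NodePtr& n) {
    if (n->kind == Kind::Assign || n->kind == Kind::Call) return true;
    for (const auto& kid : n->kids)
        if (mayChangeState(kid)) return true;
    return false;
}

// Warn only if the conditions match and re-evaluation cannot differ.
bool shouldWarnRepeated(const NodePtr& first, const NodePtr& second) {
    return sameTree(first, second) && !mayChangeState(first);
}
```

For the plain pair A < B and A < B, the function returns true. For the pair with (A = get()) < B, it returns false, so no warning is issued. A real analyzer would be less blunt here, for example, by recognizing functions that are known to have no side effects.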

So, regular expressions aren't always the best approach, even in seemingly simple cases. They are used for some static analysis tasks, but their scope is extremely limited.

Let's look at the PVS-Studio code analyzer, or rather, its core for analyzing C and C++ code. It uses regular expressions in only 5 out of about 700 diagnostic rules (at the time of writing).

Pattern-based analysis in PVS-Studio and other modern analyzers is built on detecting patterns and regularities while traversing the syntax tree. Data about control flow, data flow, and so on is needed to improve the analysis quality.
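To give a feel for the idea, here is a toy sketch of a tree traversal that looks for the self-assignment pattern from the beginning of the article. It does not reflect how PVS-Studio is implemented. The Node structure, the string node kinds ("block", "assign", "member", "name"), and the function names are assumptions made for this example, and the sketch ignores side effects, volatile objects, and overloaded operators.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical syntax tree node used only for this illustration.
struct Node {
    std::string kind;  // "block", "assign", "member", or "name"
    std::string text;  // identifier or member name; may be empty
    int line;          // source line of the node
    std::vector<std::shared_ptr<Node>> kids;
};
using NodePtr = std::shared_ptr<Node>;

NodePtr node(std::string kind, std::string text, int line,
             std::vector<NodePtr> kids = {}) {
    return std::make_shared<Node>(
        Node{std::move(kind), std::move(text), line, std::move(kids)});
}

// Compares structure and names; source positions are ignored.
bool sameSubtree(const NodePtr& a, const NodePtr& b) {
    if (a->kind != b->kind || a->text != b->text || a->kids.size() != b->kids.size())
        return false;
    for (std::size_t i = 0; i < a->kids.size(); ++i)
        if (!sameSubtree(a->kids[i], b->kids[i])) return false;
    return true;
}

// Depth-first traversal: records the line of every assignment
// whose left and right operands are the same subtree.
void findSelfAssignments(const NodePtr& n, std::vector<int>& lines) {
    if (n->kind == "assign" && n->kids.size() == 2 &&
        sameSubtree(n->kids[0], n->kids[1])) {
        lines.push_back(n->line);
    }
    for (const auto& kid : n->kids) findSelfAssignments(kid, lines);
}
```

In this model, acx->window_size = acx->window_size; is reported because both operands are the same member access on the same object. The statement this->size = size; is not reported because the left operand is a member access and the right one is a plain name, so the subtrees differ.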
