Data flow analysis is a technique for obtaining information about the possible values that variables and expressions can take at different points in a computer program.
Static analyzers use data flow analysis for in-depth search of various problems in code. Such problems include: SQL injection, array index out of bounds, unreachable code, potential null reference access, and many others.
To better illustrate the capabilities of data flow analysis, we provide the code examples below.
The PVS-Studio analyzer found the following error in the ScreenToGif project code:
protected override void OnPreviewKeyDown(KeyEventArgs e)
{
  ....
  if (Text.Length > 8) // <=
  {
    e.Handled = true;
    return;
  }
  if (Text.Length == 1){....}
  else if (Text.Length == 4) {....}
  else if (Text.Length == 7) {....}
  else if (Text.Length == 10) // <=
    Text = Text.Substring(0, 9) + Text.Substring(6, 1).PadLeft(3, '0');
  ....
}
The PVS-Studio analyzer warning:
V3022. Expression 'Text.Length == 10' is always false.
This is how the data flow analysis mechanism helped detect an always false condition.
The if (Text.Length > 8) statement with return in its body limits the possible values of Text.Length after the if block to the interval [int.MinValue, 8]. At the Text.Length == 10 check, the information obtained earlier shows that the condition is false because the value 10 lies outside this range. The preceding Text.Length checks in the else if chain also exclude the values 1, 4, and 7 by the time this check is reached.
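The narrowing step can be sketched in a few lines of C#. This is only a minimal illustration, not PVS-Studio's internal representation: the Interval type and its AssumeNotGreaterThan and MayEqual methods are hypothetical names introduced here. The sketch assumes that a guard of the form "if (x > 8) return;" leaves only values no greater than 8 for the code that follows.

```csharp
using System;

// Hypothetical closed integer interval [Min, Max], for illustration only.
public readonly struct Interval
{
    public int Min { get; }
    public int Max { get; }

    public Interval(int min, int max)
    {
        Min = min;
        Max = max;
    }

    // Values that remain when the condition "value > bound" is known to be false.
    public Interval AssumeNotGreaterThan(int bound) =>
        new Interval(Min, Math.Min(Max, bound));

    // True when "value == candidate" can still hold for some value in the interval.
    public bool MayEqual(int candidate) => Min <= candidate && candidate <= Max;
}

public static class IntervalDemo
{
    public static void Main()
    {
        // Before any check, nothing is known about Text.Length.
        var length = new Interval(int.MinValue, int.MaxValue);

        // The "if (Text.Length > 8) return;" branch was not taken.
        var afterGuard = length.AssumeNotGreaterThan(8);

        Console.WriteLine(afterGuard.MayEqual(10)); // False: the check can never succeed
        Console.WriteLine(afterGuard.MayEqual(7));  // True: 7 is still inside the interval
    }
}
```

A real analyzer tracks richer facts than one interval (for example, the values excluded by each else if branch), but the principle is the same: every condition that has been passed narrows the set of values considered later.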
An example with a potential null dereference:
int ExampleOfNullDereference(bool flag)
{
  string potentiallyNullStr = flag ? "not null" : null;
  return potentiallyNullStr.GetHashCode();
}
First, data flow analysis obtains information about the possible values of the potentiallyNullStr variable from the flag ? "not null" : null expression. Since null is among these values, a null reference may be dereferenced during the GetHashCode() call, causing a NullReferenceException. Data flow analysis gives analyzers the information needed to detect this problem.
The PVS-Studio analyzer warning:
V3080. Possible null dereference. Consider inspecting 'potentiallyNullStr'.
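The reasoning behind this warning can be sketched as a union of abstract states. The code below is a simplified illustration under stated assumptions, not the analyzer's actual implementation: the NullState enum and the Join and MayBeNull helpers are hypothetical names. It assumes that, when the condition of a conditional expression is unknown, the result may come from either operand.

```csharp
using System;

// Hypothetical abstract states of a reference; names are for illustration only.
[Flags]
public enum NullState
{
    None = 0,
    NotNull = 1,
    Null = 2
}

public static class NullStateDemo
{
    // State of "condition ? whenTrue : whenFalse" when the condition is unknown:
    // either operand may be the result, so the two states are united.
    public static NullState Join(NullState whenTrue, NullState whenFalse) =>
        whenTrue | whenFalse;

    // A member access is suspicious if null is among the possible states.
    public static bool MayBeNull(NullState state) =>
        (state & NullState.Null) != 0;

    public static void Main()
    {
        // Models: flag ? "not null" : null
        NullState potentiallyNullStr = Join(NullState.NotNull, NullState.Null);

        Console.WriteLine(MayBeNull(potentiallyNullStr)); // True: a possible null dereference
        Console.WriteLine(MayBeNull(NullState.NotNull));  // False: a string literal is never null
    }
}
```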
An example of data flow analysis with a bool value:
void ExampleOfAlwaysFalseExpression(bool flag)
{
  if (flag)
  {
    Console.WriteLine("Positive");
    if (!flag)
    {
      Console.WriteLine("Negative");
    }
  }
}
Given the if statement, data flow analysis determines that the condition is true in the then block. Thus, inside the then block of the if (flag) statement, the flag variable can only be true. Using this information, the static analyzer determines that the !flag expression inside the then block is always false.
The PVS-Studio warning for this code:
V3022. Expression '!flag' is always false.
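The same idea can be sketched for bool values by tracking the set of values a variable may hold. This is a minimal sketch rather than the analyzer's real data structures: the BoolValues enum and the AssumeTrue, Not, and IsAlwaysFalse helpers are hypothetical names chosen for illustration.

```csharp
using System;

// Hypothetical set of values a bool variable may hold at a program point.
[Flags]
public enum BoolValues
{
    None = 0,
    False = 1,
    True = 2,
    Any = False | True
}

public static class BoolFlowDemo
{
    // Values left inside the then block of "if (value)".
    public static BoolValues AssumeTrue(BoolValues value) => value & BoolValues.True;

    // Possible results of "!value": each possible input maps to its negation.
    public static BoolValues Not(BoolValues value)
    {
        var result = BoolValues.None;
        if ((value & BoolValues.True) != 0) result |= BoolValues.False;
        if ((value & BoolValues.False) != 0) result |= BoolValues.True;
        return result;
    }

    // An expression is always false when false is its only possible result.
    public static bool IsAlwaysFalse(BoolValues value) => value == BoolValues.False;

    public static void Main()
    {
        // On entry to the method, flag may be either true or false.
        var flag = BoolValues.Any;

        // Inside the then block of "if (flag)", only true remains.
        var flagInThen = AssumeTrue(flag);

        Console.WriteLine(IsAlwaysFalse(Not(flagInThen))); // True: "!flag" is always false here
        Console.WriteLine(IsAlwaysFalse(Not(flag)));       // False: outside the block nothing is known
    }
}
```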
The above examples are simple use cases of data flow analysis. Combining this approach with other techniques, such as intermodular analysis, enables the static analyzer to track complex chains of data changes and use them to find potential vulnerabilities and errors.