Nikita Lipilin

Apr 28 2022

Tags:

#CSharp #StaticAnalysis

PVS-Studio evolution: data flow analysis for related variables

Apr 28 2022

Author: Nikita Lipilin

What is this article about?
A little about data flow analysis
- Integer and boolean types
- Null-state analysis
Related variables
You're facing the wrong way!
PVS-Studio and related variables
Conclusion

Related variables are one of the main problems of static analysis. This article covers this topic and describes how PVS-Studio developers are fighting false positives caused by different relationships between variables.

0942_PVS_evolution_Data-Flow_analysis_related_variables/image1.png

What is this article about?

The PVS-Studio development team is striving to improve analysis quality and our analyzer. Recently we added an enhancement and can't wait to tell you about it! So, today we're talking about relationships between variables, how they can confuse a static analyzer and how the PVS-Studio C# analyzer tries to deal with them. Enjoy reading!

A little about data flow analysis

Let's start from one of the most important mechanisms in the PVS-Studio C# analyzer — data flow analysis. In short, data flow analysis is a technology that allows the analyzer to track possible values of variables. In PVS-Studio, data flow analysis closely interacts with other technologies.

Integer and boolean types

Let's consider data flow analysis features by the example of integer and boolean variables:

int a = 5;
int b = 3;
bool flag = a > b;

if (flag) // always true
{
  ....
}

Data flow analysis allows PVS-Studio to calculate the exact value of flag and report that the check is useless, because a is always greater than b.

In many cases expressions and variables can have any value from the set. For example:

void MyFunc(bool flag)
{
  int a = flag ? 1 : 10;
  bool greater = a > 5;

  if (greater)
    Console.WriteLine("a > 5");

  if (a == 5) 
    Console.WriteLine("a = 5");
}

The a variable can be equal to 1 or 10 — it depends on the value passed to the flag parameter. Therefore, the greater variable can be either true or false. So, the analyzer will not consider the greater value check useless.

On the other hand, PVS-Studio knows for sure that a is never equal to 5. That's why the analyzer will issue a warning:

V3022 Expression 'a == 5' is always false.

In some cases, "extra" checks appear due to typos or logical errors. For example, if a programmer checks a value of a wrong variable.

Null-state analysis

The mechanism of working with reference type variables is different. The analyzer monitors whether a variable is null — that is, it performs null-state analysis. PVS-Studio considers that each reference type variable can be in one of 4 states:

Unknown — if there's no information about whether a variable can be null or not. This is the default state for all reference type variables;
Null — if a variable is definitely null;
NotNull — if a variable is definitely not null;
PotentialNull — if in some cases a variable is definitely null.

Here's an example:

void TestReferences(bool flag)
{
  string potentialNullStr = flag ? "not null" : null;
    
  _ = potentialNullStr.GetHashCode();
}

When GetHashCode is called, the potentialNullStr variable can or cannot be null. If a reference that potentially has a null value is dereferenced, this can cause an exception — so, the analyzer generates an appropriate warning:

V3080 Possible null dereference. Consider inspecting 'potentialNullStr'.

What must be done? The easiest thing is to check that the variable is not null:

void TestReferences(bool flag)
{
  string potentialNullStr = flag ? "not null" : null;
    
  if (potentialNullStr != null)
    _ = potentialNullStr.GetHashCode();
}

The analyzer can easily find out that the potentialNullStr variable in the body of the if statement, is definitely not null. This means that the GetHashCode call will not cause an exception.

Related variables

Sometimes developers use more sophisticated methods to perform null checks in real code. We're not talking about null-conditional operator — it's not so difficult to support this operator. In the simplest case, it's enough not to issue a warning, if "?." is used to access a member. What is really difficult for the analyzer is to handle the check for null with the help of a related variable.

To better understand the topic, let's get back to the example given earlier:

public void TestReferences(bool flag)
{
  string potentialNull = flag ? "not null" : null;

  if (potentialNull != null)
    _ = potentialNull.GetHashCode();
}

The potentialNull variable can contain null. However, there's a check before the dereferencing, and data flow analysis takes this into account. But what if the check for null is done implicitly?

public void TestReferences(bool flag)
{
  string potentialNull = flag ? "not null" : null;

  if (flag)
    _ = potentialNull.GetHashCode();
}

The static analyzer considers the flag value as unknown. This means that potentialNull can contain null. Further check does not give any information about potentialNull, because this variable is not even used in the condition. Thus, the analyzer will warn us that the null reference can be potentially dereferenced.

In fact, if flag = true, then potentialNull contains a string. There's no check for null, but null can't be dereferenced here.

Relationships between variables can be built in many ways. Earlier we considered an example with variables of logical and reference types. However, any variable can depend on any variable. For example, here's the relationship between two reference type variables:

public void RelatedVariables2(string param)
{
  string? potentialNull = param != null ? "not null" : null;

  if (param != null)
  {
    _ = potentialNull.GetHashCode();
  }
}

The potentialNull variable is null only if param is null. In other words, either both variables are null, or both variables are not null. So, the GetHashCode call here will never cause an exception.

Well, enough for reference type variables. Let's consider another example:

public void RelatedVariables3(int a, int[] array)
{
  int b = 0;
  int index = -1;

  if (a == 0)
  {
    b = 10;
    index = 1;
  }

  if (b > 0)
  {
    _ = array[index];
  }
}

Take a look at this code and think — can there be an attempt to access an element with index -1?

0942_PVS_evolution_Data-Flow_analysis_related_variables/image2.png

Even a developer can get confused by such an example. The index variable cannot be equal to -1 if b > 0. b > 0 only if a = 0, and if a = 0, then index = 1. Hope you are not confused :).

The given examples are synthetic. This rarely appears in real code. Nevertheless, our users sometimes inform us about false positive caused by related variables. For example, recently a user notified us about an issue with the code of the following type:

public void Test()
{
  var a = GetPotentialNull();
  bool z = a != null;

  if (z)
  {
    _ = a.GetHashCode(); // <=
  }
}

Alas, the analyzer used to shamelessly lie about potential null reference dereferencing!

0942_PVS_evolution_Data-Flow_analysis_related_variables/image3.png

But this is not a disaster. False positives are unavoidable, but the analyzer provides various opportunities to deal with them. The easiest thing to do is to mark the warning as false so that it does not irritate a developer. You can read more about this here.

Nevertheless, PVS-Studio and false positives have an endless fight. We are trying to reduce the number of them so that our users don't waste their time investigating false positives. By the way, the following article covers this topic in detail: "The way static analyzers fight against false positives, and why they do it". Do take a look if you haven't already :).

You're facing the wrong way!

You may think that I shouldn't have told you all that. Strange that I'm talking about static analysis disadvantages! Looks like I'm playing for the wrong team :).

0942_PVS_evolution_Data-Flow_analysis_related_variables/image4.png

But that's not true. Such articles are primarily devoted to the analyzer development and enhancement that we added to make our product better. Any development starts with identifying the problem. Does the analyzer have flaws? Yes. Sometimes the analyzer doesn't issue a warning where it must be, and sometimes it issues false positives. These things happen. But we always try to solve such issues. Our clients write us about their issues — we do everything to make PVS-Studio better.

And such articles help us to tell the world about our achievements :). Speaking of which...

PVS-Studio and related variables

The variety of possible relationships between variables is fascinating, and it's not an easy task to support them. However, to deal with false positives we decided to gradually cover the most common relationships between variables.

Before we begin, let me tell you some facts.

Many code fragments in this article are synthetic. You can find them strange and wonder: "who would write something like that?" — believe me, all the examples are based on real code. The examples are elementary, but at the same time they help to reproduce the analyzer behavior.

As PVS-Studio developers, we want to thank our users for telling us about their issues (including false positives). And we are even more pleased when our users send us clear-cut code examples that we use to easily reproduce incorrect behavior. This incredibly speeds up the process of making necessary fixes :).

Heuristic algorithm

As for the first solution to deal with false positives, we've chosen a special algorithm. It helped to get rid of many false positives. It partially eliminates warnings caused by implicit relationships between different values and reference types variables.

Investigating false positives, we noticed an interesting pattern. If the dereferencing happens in the body of a conditional construction, null-state of the corresponding variable most likely relates to the expression in the condition. In other words, the analyzer considered the dereferencing performed under a condition as safe, because the corresponding reference was implicitly checked with the help of a related variable.

Take a look at an example:

void Test(bool condition)
{
  object a;
  if (condition)
    a = new object();
  else
    a = null;

  ....

  if (condition)
    _ = a.ToString();
}

Since the a variable is dereferenced in the body of the conditional construction, the analyzer seems to assume that there's a connection between a and the condition. Due to this, PVS-Studio will not issue a warning. In this case, the warning issued for the ToString call would indeed be false, because if condition = true, then a is not null.

In such form, the algorithm cut off lots of good warnings, so we started to think how to enhance the algorithm. We achieved the best results by adding an additional exception condition: the null must be set in the same method where the dereferencing happens. Usually in such cases null-state relates to a condition.

Here is an example of null obtained from another method:

bool _flag;

object GetPotentialNull() => _flag ? "not null" : null;

void Test(bool condition)
{
  object a = GetPotentialNull();

  if (condition)
    _ = a.ToString();
}

The a variable is indeed dereferenced under a condition, but there is no relationships between it and condition. Such heuristic allowed to "save" many good warnings, although it added a few false ones.

For a long time, this algorithm has been the main method to deal with related variables. It already helps to remove a significant part of false positives on the code of real projects. And yet the results of such an exception are not perfect: sometimes the analyzer cuts off good warnings, and sometimes it "skips" false ones. The loss of some good warnings is not such a critical problem, but we still must do something with false warnings.

Not such a meaningless assignment

Usually, our clients do not ask us to "support related variables". This even sounds very abstract! It's not so important for our users to know how the analyzer works from the inside — they just need a high-quality output by PVS-Studio. That's why our clients inform us of specific false positives issued by the analyzer. And we are trying to find out what the problem is, and how to solve it.

Once a user reported of a warning issued for a code fragment of the following type:

static void Foo()
{
  Holder h = new Holder();
  Parameter p = h.GetParam();

  p.Text = "ABC"; // <=
  h.f();
  p.Text = "XYZ"; // <=
  h.f();
}

V3008 The 'p.Text' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 35, 33.

The warning says that the first assignment is useless — the "ABC" value is never used. Something is wrong here; the code must be investigated and fixed...

Actually, no! The assignment is certainly not useless. But why? The first thought that may arise is to take a look at the Text property. Perhaps the assignment to this property affects something? Nothing of the sort:

class Parameter
{
  internal string Text { get; set; }
}

A usual automatic property. When the value is assigned to this property nothing unusual happens. So, there's no point in assigning a value twice... This seems a bit strange. However, the warning is still false.

To finally understand what's happening here, take a look at the Holder class:

class Holder
{
  private Parameter param;
  internal Parameter GetParam() 
  {
    return param;
  }
  
  internal Holder() 
  {
    param = new Parameter();
    param.Text = "";
  }
  
  internal void f()
  {
    Console.WriteLine("Holder: {0}", param.Text);
  }
}

It turns out that the f method uses the value of the param.Text property. Now, let's get back to the original example:

static void Foo()
{
  Holder h = new Holder();
  Parameter p = h.GetParam();

  p.Text = "ABC";
  h.f();
  p.Text = "XYZ";
  h.f();
}

In fact, a reference to the param field of the h object is written to the p variable. When the f method is called, this field is used — more precisely, its Text property is used. When f is called first time, "ABC" is written in Text. When f is called second time, "XYZ" is written. Thus, each assignment has played its role, and there is no error here.

In this case, quite an unusual relationship between the p.Text property and the h variable caused a false positive. The h.f() call uses the value written in p.Text. The diagnostic has to take this into account.

To solve this issue, we decided to adjust one of diagnostic's exceptions. The exception suggests that an object is used between two assignments. Thus, the analyzer does not issue a warning. For example:

void Test()
{
  int a, x;
  a = 10;
  x = a; // a is used
  a = 20;
}

Such code does not trigger the analyzer because the a variable is used between assignments. Unlike the previous case, the a variable is used explicitly, so it is easy to exclude the warning here. But what to do when the assigned value is used implicitly while the method is called? Let's figure it out.

static void Foo()
{
  Holder h = new Holder();
  Parameter p = h.GetParam();

  p.Text = "ABC";
  h.f();        // p.Text is used here
  p.Text = "XYZ";
  h.f();        // and here
}

To solve this issue, we decided to adjust the V3008 rule. Now, when the diagnostic checks the code, it saves the pairs of potentially related variables. If one of them is used, the analyzer considers the other as used too. The analyzer considers p to be potentially related to h because its value is obtained when h.GetParam() is called. At the same time, the h.f() call suggests that not only h is used. p related to h is also potentially used, as well as p's properties. Thus, the analyzer does not issue a warning for an "extra assignment" of p.Text.

A real example of relationships between variables

Synthetics is good but that's not interesting. Well, great that the analyzer works better now. But we discussed only synthetic examples. What's the point if no one writes code that demonstrates the enhancement? By the way, there's quite a striking note describing the evaluation of analyzers' work with synthetic examples. It's about C++, but the general idea is the same.

And we're talking about a completely different case. Firstly, we made an enhancement at the client's request. We helped them to get rid of false positives in the project. Secondly, the analyzer's enhancements are useful for other real projects. For example, take a look at the code from RavenDB — we use it to test PVS-Studio:

[Fact]
public void CRUD_Operations_With_Array_In_Object_2()
{
  ....
  var family = new Family()
  {
    Names = new[] { "Hibernating Rhinos", "RavenDB" }
  };
  newSession.Store(family, "family/1");
  newSession.SaveChanges();

  var newFamily = newSession.Load<Family>("family/1");

  newFamily.Names = new[] {"Hibernating Rhinos", "RavenDB"};   // <=
  Assert.Equal(newSession.Advanced.WhatChanged().Count, 0);

  newFamily.Names = new[] { "RavenDB", "Hibernating Rhinos" }; // <=
  Assert.Equal(newSession.Advanced.WhatChanged().Count, 1);

  newSession.SaveChanges();
  ....
}

V3008 The 'newFamily.Names' variable is assigned values twice successively. Perhaps this is a mistake.

So, the analyzer reported that a value is assigned twice in newFamily.Names; the first value is never used. And the code indeed suggests that the first value is never used explicitly. But let's take a better look.

An object of the Family class is saved to the session. At this point it contains "Hibernating Rhinos" and "RavenDB" names. Then, the same object (or at least an object containing same values) is loaded from the session. After that, the same names are written into it. And then the call happens:

Assert.Equal(newSession.Advanced.WhatChanged().Count, 0);

Obviously, this check takes into account the previously recorded value. This test checks that there are no changes — after all, the names are the same. A little lower in the code the names are swapped, and a similar check happens. The changes expected there. The connection between newSession.Advanced.WhatChanged() calls and newFamily.Names is obvious.

It turns out that here the analyzer must not issue a warning about "useless" assignment. And you know what? Now PVS-Studio doesn't do this :). And developers don't waste time investigating unnecessary warnings.

By the way, we noticed that some other false positives have disappeared. However, they're similar to the examples we've discussed earlier. So, let's move on to the next section.

The as operator to convert the results

While we were rejoicing in our victory over false positives informing about "unnecessary" assignments, another client sent us a new example:

void Test(object obj)
{
  if (obj != null)
    Console.WriteLine("obj is not null");

  string str = obj as string;

  if (str != null)
    Console.WriteLine(obj.GetHashCode()); // <=
}

V3125 The 'obj' object was used after it was verified against null.

Well, let's dig into this.

At first, the obj parameter is checked for null equality. The method assumes that obj can receive a null reference. Then, with the help of the as operator, obj is converted to the String type. The result is written to the str variable.

And the most interesting part comes next. If str is not null, the GetHashCode method is accessed. However, GetHashCode is called for the obj object, not for str! It turns out that the wrong variable is checked. Even if str is not null, obj can still potentially contain a null value.

At least it may seem so. In fact, if str!= null, then obj != null. Why?

Let's say obj is indeed null. Then the first check gives false — well, that's fine. After that, a value for str is calculated. Since the obj variable is null, str is definitely null. So, we can conclude the following: if str does not contain null, then obj does not contain null too.

It's cool that we figured this out, but we should also explain this to the analyzer. Data flow analysis used in PVS-Studio helps us with this. PVS-Studio creates special objects for suitable expressions from the analyzed code. These objects store information about possible values. We call such objects virtual values. They also contain auxiliary data widely used by diagnostics. For example, data flow tracks whether the variable's value is:

the result of the FirstOrDefault call;
potentially tainted (click here for more information);
the result of converting with the help of the as operator;
etc.

To understand how the analyzer began to take into account conversions through the as operator, let's get back to the example:

void Test(object obj)
{
  if (obj != null)
    Console.WriteLine("obj is not null");

  string str = obj as string;

  if (str != null)
    Console.WriteLine(obj.GetHashCode());
}

The str variable receives the result of the obj casting with the help of the as operator. Data flow writes this information to the corresponding virtual value. We've already implemented this functionality in the analyzer, and some analyzer's rules widely use it. V3149 is one of such rules.

When str != null is processing, the analyzer calculates the following: if this expression is true, str is definitely not null. At the same time, the analyzer already knows that the str value is obtained from casting obj with the help of the as operator. It turns out that the analyzer can quite rightly consider the obj value as not equal to null.

Real examples of conversions with the help of the as operator

To be honest, we didn't even expect such a result, but a whole bunch of false positives just disappeared. Who would have thought that such a check for null with the help of the as operator is so common?

Issue 1

As a first example, consider a code fragment from the SpaceEngineers project:

void Toolbar_ItemChanged(MyToolbar self, MyToolbar.IndexArgs index)
{
  Debug.Assert(self == Toolbar);
    
  var tItem = ToolbarItem.FromItem(self.GetItemAtIndex(index.ItemIndex));
  ....
}

V3080 Possible null dereference of method return value when it is passed to method as its 1st argument.

So, the warning stated that the ToolbalItem.FromItem method can receive null — and this will result in throwing an exception. Is that true?

First, we should take a look at the GetItemAtIndex method:

public MyToolbarItem GetItemAtIndex(int index)
{
  if (!IsValidIndex(index)) 
    return null;

  return this[index];
}

Data flow analysis helped the analyzer to find out that in some cases this method returns null. But will it cause any problems? Now let's move on to the definition of the FromItem method:

public static ToolbarItem FromItem(MyToolbarItem item)
{
  var tItem = new ToolbarItem();
  tItem.EntityID = 0;
  var terminalItem = item as MyToolbarItemTerminalBlock;
  if (terminalItem != null)
  {
    var block = item.GetObjectBuilder() as ....; // <=
    ....
  }
  ....
  return tItem;
}

Earlier we found out that the item parameter can contain null. Here, the dereference happens, but before that item is not checked. However, terminalItem is checked! And if terminalItem is not null, then item is definitely not null.

Issue 2

We found a similar example in the SharpDevelop project:

DocumentScript GetScript(string fileName)
{
  ....
  var formattingOptions
       = CSharpFormattingPolicies.Instance
                                 .GetProjectOptions(compilation.GetProject());
  ....
}

V3080 Possible null dereference of 'compilation.GetProject()' method return value at 'project.FileName' when it is passed to method as its 1st argument.

So, the analyzer warned about the possible null reference dereferencing inside the GetProjectOptions method. The reason for this is passing compilation.GetProject() as the first argument. Let's figure it out.

Interprocedural analysis helped us to find out that GetProject does sometimes return null. But what about GetProjectOptions? Let's take a look:

public CSharpFormattingPolicy GetProjectOptions(IProject project)
{
  if (!initialized)
    return GlobalOptions;

  var csproject = project as CSharpProject;
  if (csproject != null) {
    string key = project.FileName;            // <=
    ....
  }

  return SolutionOptions ?? GlobalOptions;
}

Indeed, the first argument's property is accessed here. However, only if it is not null! Here the result of converting using the as operator is checked, not the project.

Issue 3

We also got rid of another false positive issued for the ILSpy project's code:

protected override Expression DoResolve (ResolveContext ec)
{
  var res = expr.Resolve(ec);
  var constant = res as Constant;

  if (constant != null && constant.IsLiteral)
  {
    return Constant.CreateConstantFromValue(res.Type,           // <=
                                            constant.GetValue(),
                                            expr.Location);
  }

  return res;
}

V3080 Possible null dereference. Consider inspecting 'res'.

res gets its value from the expr.Resolve(ec) call. In some cases, it does return null. But when the Type property is accessed, the variable is definitely no longer null. As in previous cases, the check is performed implicitly. If constant != null, then res != null too.

The support of the as operator helped to get rid of many other false positives. But all of them are similar to those that we've already discussed. If you want to see for yourself how PVS-Studio analyzes such cases, follow the link to download the analyzer. And have fun!

Typical related variables

Earlier, we discussed types of relationships between variables that we don't meet that often. Tests proved that enhancements produced a tangible result. However, much more often we encountered relationships between logical and reference type variables.

Earlier, we discussed an example demonstrating such relationships:

public void Test()
{
  var a = GetPotentialNull();
  bool flag = a != null;

  if (flag)
  {
    _ = a.GetHashCode(); // <=
  }
}

V3080 Possible null dereference. Consider inspecting 'a'.

If flag = true, then the a variable cannot be null. Thus, the implicit check protects the code against problems.

To teach the analyzer to take into account such connections, we decided to enhance our data flow analysis again. However, this case was a bit more complicated.

Unlike the case with the as operator, here we needed to add a new type of information about the variable. In particular, data about the relationship with another variable. Processing the flag declaration, the analyzer calculates the possible values of variables in the following cases:

if the expression (and therefore flag) is true;
if the expression is false.

After processing the flag declaration, the analyzer added 2 rules to the corresponding virtual value:

if flag == true, then a != null;
if flag == false, then a == null.

Now flag has the necessary data. The only thing to do is to use this information when processing the if (flag) condition. Here data flow calculates the possible values of variables in the then-branch. So, flag is always true, and a related to this variable is definitely not null.

We've been suffering from such false positives for quite a long time. Finally, we decided to deal with them. And it seems that we succeeded :). Now the analyzer tracks this kind of relationships and takes them into account during code analysis.

The analyzer greatly deals with synthetic examples, but let's see how it works with real projects.

Typical relationships between variables in real code

Here the result is even better than with the as operator. Curiously, the enhancement allowed not only to get rid of false positives but also to add a few "true" ones.

Issue 1

To begin with, let's consider a fairly simple false positive found in the BouncyCastle project.

public static Stream ReplaceSigners(....)
{
  ....

  CmsTypedStream signedContent = parser.GetSignedContent();
  bool encapsulate = (signedContent != null);
  Stream contentOut = gen.Open(outStr,
                               parser.SignedContentType.Id,
                               encapsulate);
  if (encapsulate)
  {
    Streams.PipeAll(signedContent.ContentStream, contentOut); // <=
  }

  ....
}

V3080 Possible null dereference. Consider inspecting 'signedContent'.

The former false positive indicated a possible null reference dereferencing. If signedContent is null, then ContentStream access will cause an exception.

However, pay attention to the encapsulate value check. It implicitly prevents null reference dereferencing, because encapsulate = true only when signedContent != null. Our recent enhancements taught PVS-Studio to take into account such relationships — so, the false positive disappeared.

Issue 2

The following example is taken from the ccnet project:

public bool Authenticate(LoginRequest credentials)
{
  // Check that both the user name and the password match
  string userName = GetUserName(credentials);
  string password = NameValuePair.FindNamedValue(....);
  
  bool isValid =    !string.IsNullOrEmpty(userName)
                 && !string.IsNullOrEmpty(password);

  if (isValid)
  {
    isValid =    SecurityHelpers.IsWildCardMatch(userName,     // <=
                                                 this.userName)
              && ....;
  }

  return isValid;
}

V3080 Possible null dereference inside method at 'wildCard.Replace'. Consider inspecting the 1st argument: userName.

This warning indicated that the IsWildCardMatch method potentially receives a null reference as the first argument. It also suggests that its dereferencing could occur inside. Thus, a NullReferenceException can be thrown. But is that really so?

The value of the first argument — userName — comes from the GetUserName call. And it indeed can pass null — that's exactly what the analyzer detected. The IsWildCardMatch method contains the dereferencing of the first argument:

public static bool IsWildCardMatch(string wildCard, string value)
{
  Regex wildCardRegex = new Regex(wildCard.Replace("*",
                                                   "[a-zA-Z0-9_.@-]*"),
                                  RegexOptions.IgnoreCase);

  return wildCardRegex.IsMatch(value);
}

But null is not passed there! You must have already noticed the isValid check here:

bool isValid =    !string.IsNullOrEmpty(userName)
               && !string.IsNullOrEmpty(password);

if (isValid)
{
  isValid =    SecurityHelpers.IsWildCardMatch(userName,
                                               this.userName)
            && ....;
}

If isValid = true, then userName cannot be null. Now the analyzer is aware of this thanks to the support of such relationships.

Issue 3

We also encountered another false positive issued for the FlashDevelop project's code:

public void HandleEvent(Object sender, NotifyEvent e, HandlingPriority priority)
{
  ....
  features = enabledLanguages.ContainsKey(ext) ? enabledLanguages[ext] : null;
  
  if (completion == null)
    completion = new Completion(config, settingObject);

  completion.OnFileChanged(features);                      // <=

  if (features != null && features.Syntax != null)
    ....
  ....
}

V3080 Possible null dereference inside method at 'features.Mode'. Consider inspecting the 1st argument: features.

The warning suggests that the OnFileChanged method receives the features variable containing a potential null value. This can lead to null reference dereferencing.

The code clearly shows that in some cases features receives null, and the corresponding condition is below. However, the variable is not checked before the OnFIleChanged method receives it — there is no even implicit check with the help of related variables.

So, why did this warning disappear after PVS-Studio supported the related variables? The OnFileChanged method can answer this question:

internal void OnFileChanged(CssFeatures features)
{
  if (features == this.features) return;
  this.features = features;
  enabled = features != null;               // <=

  if (enabled)
  {
    wordChars = lang.characterclass.Characters;
    if (features.Mode != "CSS") wordChars += features.Trigger;
    InitBlockLevel();
  }
}

And here are the related variables! Features are dereferenced only if enabled = true, and this is possible only if features != null. Thus, the warning was indeed false.

Issue 4

As I mentioned earlier, the analyzer's ability to track such relationships helps to get rid of false positives and also to generate correct warnings.

For example, consider the following code fragment from Roslyn:

public override object GetFunctionExtender(string name,
                                           SyntaxNode node,
                                           ISymbol symbol)
{
  ....
  
  var methodSymbol = (IMethodSymbol)symbol;
  isDeclaration = methodSymbol.PartialDefinitionPart == null;
  hasOtherPart = isDeclaration
                    ? methodSymbol.PartialImplementationPart != null
                    : methodSymbol.PartialDefinitionPart != null;    // <=
    
  ....
}

V3022 Expression 'methodSymbol.PartialDefinitionPart != null' is always true.

So, having learned how to track relationships of the appropriate type, PVS-Studio generated a warning about logical expression that always returns true. Why did the analyzer decide so?

As in previous cases, the logic here is simple. isDeclaration will be true only if MethodSymbol.PartialDefinitionPart is null. On the other hand, if isDeclaration is false, then methodSymbol.PartialDefinitionPart is definitely not null.

Thus, the last expression of the ternary operator will always have the true value. In some cases always-true expressions are harmless redundant code, but in other cases they denote errors. Sometimes developers write such code to improve readability. It's difficult to say what case we have here.

If there's no error here, then the code could be simplified:

hasOtherPart =    !isDeclaration
               || methodSymbol.PartialImplementationPart != null;

On the other hand, this is just my opinion, and someone can find the original code more readable.

Conclusion

Variables can be related in a huge number of ways — it's quite problematic to support them all. I'm not sure if it's even possible. Such relationships are not so common, but sometimes they lead to false positives. PVS-Studio developers are constantly working on improving the analyzer. And we're also interested in supporting related variables. Of course, our client's wishes are the most important thing for us. Nevertheless, we welcome any feedback. Therefore, why don't you, my dear readers, try the static analyzer on your projects for free. I'm sure you won't be disappointed :).

What cases of related variables have you encountered? Share your experience in the comments — let's see how many cases we can collect.

See you soon!

#CSharp #StaticAnalysis

Tags:

#CSharp #StaticAnalysis

Fill out the form in 2 simple steps below:

Your contact information:

Desired license type:

PVS-Studio evolution: data flow analysis for related variables

What is this article about?

A little about data flow analysis

Integer and boolean types

Null-state analysis

Related variables

You're facing the wrong way!

PVS-Studio and related variables

Heuristic algorithm

Not such a meaningless assignment

A real example of relationships between variables

The as operator to convert the results

Real examples of conversions with the help of the as operator

Typical related variables

Typical relationships between variables in real code

Conclusion

Comments (0)

Want to try PVS‑Studio for free?

Achievements

PVS-Studio

Licensing

Company