>
>
>
Microsoft makes mistakes too. Let's che…

Nikita Priadko
Articles: 2

Microsoft makes mistakes too. Let's check MSBuild

MSBuild is an automated application building platform by Microsoft that is used to compile, package, and deploy applications. The project is very popular among developers, we also actively use it when working on our analyzer. In this article, we'll talk about code in Microsoft's product that can potentially cause errors.

Introduction

This is the third time we've checked the MSBuild project, the first in 2016 and the second in 2022.

The project code state at the time of the analysis matches this commit.

Over the time, Microsoft has been actively developing its product, providing a lot of new material for the analysis. We re-checked it not only due to the growing codebase, but also because our analyzer is actively evolving. For example, the PVS-Studio team has added over 40 diagnostic rules for C# projects in less than 2.5 years. We've also refined many existing diagnostic rules and analysis mechanisms.

Let's take a look at the most interesting warnings issued by the analyzer.

Issue 1: incorrect format string

private int QueryConsoleBufferWidth()
{
  int consoleBufferWidth = -1;
  try
  {
    consoleBufferWidth = Console.BufferWidth;
  }
  catch (Exception ex)
  {
    CommunicationsUtilities.Trace("MSBuild client warning:" +
            "problem during querying console buffer width.", ex);
  }

  return consoleBufferWidth;
}

The PVS-Studio message: V3025. The 1st argument '"MSBuild client warning: problem during querying console buffer width."' is used as incorrect format string inside method. A different number of format items is expected while calling 'Trace' function. Arguments not used: 1st. MSBuildClient.cs 394

The analyzer warns about an incorrect format string passed to the Trace method.

For better clarity, let's look at the Trace method:

internal static void Trace<T>(string format, T arg0)
{
    Trace(nodeId: -1, format, arg0);
}

internal static void Trace<T>(int nodeId, string format, T arg0)
{
    if (s_trace)
    {
        TraceCore(nodeId, string.Format(format, arg0));
    }
}

A string is passed to the Trace method and used as the format string for string.Format(format, arg0). This string must contain placeholders for arg0. However, in the above code, the format string passed to Trace lacks them, causing the loss of an exception message.

Issue 2: an incorrect unsubscription from an event

private static void SubscribeImmutablePathsInitialized()
{
  NotifyOnScopingReadiness?.Invoke();

  FileClassifier.Shared.OnImmutablePathsInitialized -= () =>
    NotifyOnScopingReadiness?.Invoke();
}

The PVS-Studio message: V3084. Anonymous function is used to unsubscribe from 'OnImmutablePathsInitialized' event. No handlers will be unsubscribed, as a separate delegate instance is created for each anonymous function declaration. CheckScopeClassifier.cs 67

In this case, unsubscribing from a delegate doesn't take effect because each time an anonymous function is declared, a new delegate instance is created. As a result, an attempt to unsubscribe from this instance won't take the expected effect. OnImmutablePathsInitialized is subscribed to delegate 1 but unsubscribes from delegate 2, which has no effect.

Issue 3: WriteLine instead of Write

public override void WriteLine(string format, params object?[] arg)
{
  lock (_lock)
  {
    _internalWriter.WriteLine(format, arg);                          // <=
  }
}

public override void Write(string format, object? arg0)
{
  lock (_lock)
  {
    _internalWriter.Write(format, arg0);
  }
}

public override void Write(string format, object? arg0, object? arg1)
{
  lock (_lock)
  {
    _internalWriter.Write(format, arg0, arg1);
  }
}

public override void Write(string format, object? arg0,
                            object? arg1, object? arg2)
{
  lock (_lock)
  {
    _internalWriter.Write(format, arg0, arg1, arg2);
  }
}

public override void Write(string format, params object?[] arg)
{
  lock (_lock)
  {
    _internalWriter.WriteLine(format, arg);                          // <=
  }
}

The PVS-Studio message: V3013. It is odd that the body of 'Write' function is fully equivalent to the body of 'WriteLine' function (594, line 738). OutOfProcServerNode.cs 594

The analyzer flagged it as unusual that the WriteLine and Write methods have identical bodies.

Looking at other Write overloads, we'll see that _internalWriter.Write is used there, while _internalWriter.WriteLine is used for void Write(string format, params object?[] arg). This is most likely either a typo or the result of a copy-paste error.

Issue 4: an empty check for null

private void ProcessParameters()
{
  if (Parameters == null)
  {
  }

  string[] parameters = Parameters.Split(s_semicolonChar);

  if (parameters.Length != 1)
  {
    ....
  }

  _logFile = parameters[0];
}

The PVS-Studio message: V3125. The 'Parameters' object was used after it was verified against null. Check lines: 91, 87. TaskUsageLogger.cs 91

The analyzer has detected that the Parameters property is used after checking for null.

Under mysterious circumstances, the body of the if statement ended up empty. This makes further use of Parameters unsafe. Moreover, searching for similar code in another file revealed a method with the same name and a similar body.

Similar code in the XmlFileLogger.cs file:

private void ProcessParameters()
{
  ....
  if (Parameters == null)
  {
    throw new LoggerException(invalidParamSpecificationMessage);
  }

  string[] parameters = Parameters.Split(';');

  if (parameters.Length != 1)
  {
    throw new LoggerException(invalidParamSpecificationMessage);
  }

  _logFile = parameters[0];
}

As we can see, Parameters is also checked for null here, but then an exception is thrown. The check in the first code block should probably be implemented the same way.

Issue 5: a suspicious check for null

private async Task<WorkUnitResult> InitializeAndExecuteTask(....)
{
  if (!_taskExecutionHost.InitializeForBatch(....))
  {
    ProjectErrorUtilities.ThrowInvalidProject(_targetChildInstance.Location,
                                              "TaskDeclarationOrUsageError",
                                              _taskNode.Name);
  }

  using var assemblyLoadsTracker = 
    AssemblyLoadsTracker.StartTracking(taskLoggingContext,
                                   AssemblyLoadingContext.TaskRun,
                                   _taskExecutionHost?.TaskInstance?.GetType());
}

The PVS-Studio message: V3095. The '_taskExecutionHost' object was used before it was verified against null. Check lines: 650, 655. TaskBuilder.cs 650

After calling the _taskExecutionHost.InitializeForBatch method, the if statement is checked for null as follows: _taskExecutionHost?.TaskInstance?.GetType. However, if the _taskExecutionHost value is null, this check won't prevent an exception because the field has already been used. To ensure safety, one should also add this check for the expression in the if statement.

Issue 6: an operator precedence error

public bool Equals(TaskItem other)
{
  ....
  int capacity = _itemDefinitions?.Count ?? 0 + _directMetadata?.Count ?? 0;
  var thisNames = new HashSet<string>(capacity,     
                                      MSBuildNameIgnoreCaseComparer.Default);
  ....
}

The PVS-Studio message: V3123. Perhaps the '??' operator works in a different way than it was expected. Its priority is lower than priority of other operators in its left part. ProjectItemInstance.cs 1628

The expression on the right side of the capacity assignment may not work the way the developer expects because of the ?? operator precedence.

The _itemDefinitions?.Count and _directMetadata?.Count values are never summed because, without parentheses, the operator precedence is as follows:

int capacity = _itemDefinitions?.Count ??((0 + _directMetadata?.Count)?? 0);

The developers may have intended for null to be replaced with 0 during summation but overlooked operator precedence.

In this case, the correct code should look like this:

int capacity = (_itemDefinitions?.Count ?? 0) + (_directMetadata?.Count ?? 0);

Issue 7: assignment duplication

Spotting errors in code like the example below is a challenge for a human, but a static analyzer can do it effortlessly.

internal ProjectInstance(string projectFile, 
                         ProjectInstance projectToInheritFrom,
                         IDictionary<string, string> globalProperties)
{
  _projectFileLocation = ElementLocation.Create(projectFile);
  _globalProperties = new PropertyDictionary<ProjectPropertyInstance>(
    globalProperties.Count);
  this.Toolset = projectToInheritFrom.Toolset;
  this.SubToolsetVersion = projectToInheritFrom.SubToolsetVersion;
  _explicitToolsVersionSpecified =
    projectToInheritFrom._explicitToolsVersionSpecified;
  _properties = new PropertyDictionary<ProjectPropertyInstance>(
    projectToInheritFrom._properties);
  _items = new ItemDictionary<ProjectItemInstance>();
  _actualTargets = new RetrievableEntryHashSet<ProjectTargetInstance>(
    StringComparer.OrdinalIgnoreCase);
  _targets = new ObjectModel.ReadOnlyDictionary<....>(_actualTargets);
  _environmentVariableProperties =
    projectToInheritFrom._environmentVariableProperties;
  _itemDefinitions = ....;
  _hostServices = projectToInheritFrom._hostServices;
  this.ProjectRootElementCache =
    projectToInheritFrom.ProjectRootElementCache;
  _explicitToolsVersionSpecified = 
    projectToInheritFrom._explicitToolsVersionSpecified;
  this.InitialTargets = new List<string>();
  this.DefaultTargets = new List<string>();
  this.DefaultTargets.Add("Build");
  this.TaskRegistry = projectToInheritFrom.TaskRegistry;
  _isImmutable = projectToInheritFrom._isImmutable;
  _importPaths = projectToInheritFrom._importPaths;
  ImportPaths = new ObjectModel.ReadOnlyCollection<string>(_importPaths);
  _importPathsIncludingDuplicates = 
    projectToInheritFrom._importPathsIncludingDuplicates;
  ImportPathsIncludingDuplicates = new
    ObjectModel.ReadOnlyCollection<string>(_importPathsIncludingDuplicates);
  this.EvaluatedItemElements = new List<ProjectItemElement>();
  IEvaluatorData<....> thisAsIEvaluatorData = this;
  thisAsIEvaluatorData.AfterTargets = new 
    Dictionary <string,List<TargetSpecification>>();
  thisAsIEvaluatorData.BeforeTargets = new 
    Dictionary<string, List<TargetSpecification>>();
  foreach (KeyValuePair<string, string> property in globalProperties)
  {
    _globalProperties[property.Key] = 
    ProjectPropertyInstance.Create(property.Key, property.Value,
                                   false /* may not be reserved */,
                                   _isImmutable);
  }
}

Let's get rid of the unnecessary stuff so it's easier to see:

internal ProjectInstance(....)
{
  ....
  _explicitToolsVersionSpecified = 
    projectToInheritFrom._explicitToolsVersionSpecified;            // <=
  _properties = ....
  _items = ....
  _actualTargets = ....
  _targets = ....
  _environmentVariableProperties = ....
  _itemDefinitions = ....
  _hostServices = ....
  _explicitToolsVersionSpecified = 
    projectToInheritFrom._explicitToolsVersionSpecified;             // <=
  ....
}

The PVS-Studio message: V3008. The '_explicitToolsVersionSpecified' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 547, 538. ProjectInstance.cs 547

In this code snippet, assignment is performed for multiple fields and properties. However, there's a duplicate among them. Sometimes devs mistakenly use one variable instead of another, which can cause issues.

Developers should pay special attention to such code fragments, because if they make a mistake, finding a missing or redundant code line among the others will be difficult.

Issue 8: using an array before a check for null

private async Task<BuildEngineResult> BuildProjectFilesInParallelAsync(....,
                                                 string[] toolsVersion, ....)
{
  if(.... && toolsVersion[0] == null)                               // <=
  {
    ....
  }
  else
  {
    ....    
  }

  IRequestBuilderCallback builderCallback = _requestEntry.Builder as 
                                             IRequestBuilderCallback;
  BuildResult[] results = await builderCallback.BuildProjects(
          projectFileNames,
          propertyDictionaries,
          toolsVersion ?? [],                                      // <=
          targetNames ?? [],
          waitForResults: true,
          skipNonexistentTargets: skipNonexistentTargets);
  ....
}

The PVS-Studio message: V3095. The 'toolsVersion' object was used before it was verified against null. Check lines: 1153, 1208. TaskHost.cs 1153

The toolsVersion array element is accessed in the code, and then ?? is applied to that array. This indicates a potential NullReferenceException error because toolsVersion is used before the null check.

Conclusion

Repeated project checks have uncovered numerous new errors, which highlights the importance of integrating static analysis into actively evolving projects. MSBuild is a Microsoft product, so they likely focus on detecting and fixing bugs. However, even then, we keep finding issues in their code.

I'd like to note that errors in code are an inevitable part of any project. Having an extensive codebase, MSBuild is a high-quality product with a low error rate.

We're constantly enhancing the PVS-Studio static analyzer. If you'd like to evaluate the capabilities of static analysis yourself, you're welcome to try PVS-Studio for free.

Thank you for reading :)