>
>
>
Azure SDK for .NET: story about a diffi…

Sergey Khrenov
Articles: 39

Azure SDK for .NET: story about a difficult error search

When we decided to search for errors in the Azure SDK for .NET project, we were pleasantly surprised by its size. "Three and a half million lines of code," we kept saying, studying the project's statistics. There might be so many findings. Alas and alack! The project turned out to be crafty. So what was the zest of the project and how it was checked - read in this article.

About the project

I'm writing this article following up my previous one, which was also about a project related to Microsoft Azure: Azure PowerShell: mostly harmless. So, this time I was betting on a solid number of diverse and interesting errors. After all, project size is a very important factor in terms of static analysis, in particular when checking a project for the first time. In fact, in practice, one-time checks application isn't the right approach. Nevertheless, if developers go for it, it only takes place at the stage of analyzer introduction. At the same time, no one works their butt off sorting out the enormous number of warnings and just dally them off as technical debt using mass warning suppressing mechanisms and storing them in special bases. Speaking of which, having a great number of warnings is fine when running the analyzer for the first time. As for us, we go for one-time checks for research purposes. For this reason, large projects are always more preferable for the following analysis as compared with small ones.

However, the Azure SDK for .NET project immediately proved to be an unviable test bed. Even its impressive size didn't help, but rather complicated working on it. The reason is given in the following project statistics:

  • .cs source files (not including tests): 16 500
  • Visual Studio Solutions (.sln): 163
  • Non-empty lines of code: 3 462 000
  • Of these auto-generated: about 3,300,000
  • The project repository is available on GitHub.

Approximately 95% of the code is generated automatically, and much of that code is repeated many times. Checking such projects with a static analyzer is usually time-consuming and useless, as there is a lot of workable, but illogical (at least at a first glance) and redundant code. This leads to a great number of false positives.

All that amount of code scattered all over 163 Visual Studio solutions became the "cherry on top". It took some efforts to check the remaining code (not auto-generated). What really helped was the fact that all auto-generated code was stored in solutions subdirectories by the relative path "<Solution directory>\src\Generated". Also each .cs file of such type contains a special comment in the tag <auto-generated>:

// <auto-generated>
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License. See License.txt in the project root for
// license information.
//
// Code generated by Microsoft (R) AutoRest Code Generator.
// Changes may cause incorrect behavior and will be lost if the code is
// regenerated.
// </auto-generated>

For the purity of the experiment, I patchily checked about ten randomly selected auto-generated solutions. I'll tell about the result later.

So, despite the small amount of remaining "honest" code, I still managed to find a number of errors from what remained. This time I'm not going to cite warnings in the order of PVS-Studio diagnostics' codes. Instead, I'll group the messages on the solutions in which they've been found.

Well, let's see what I managed to find in the Azure SDK for .NET code.

Microsoft.Azure.Management.Advisor

This is one of many solutions that contains auto-generated code. As I said earlier, I randomly checked about a dozen of such solutions. In each case, warnings were the same, and, as expected, useless. Here is a couple of examples.

V3022 Expression 'Credentials != null' is always true. AdvisorManagementClient.cs 204

// Code generated by Microsoft (R) AutoRest Code Generator.
....
public ServiceClientCredentials Credentials { get; private set; }
....
public AdvisorManagementClient(ServiceClientCredentials credentials,
  params DelegatingHandler[] handlers) : this(handlers)
{
  if (credentials == null)
  {
    throw new System.ArgumentNullException("credentials");
  }
  Credentials = credentials;
  if (Credentials != null)    // <=
  {
    Credentials.InitializeServiceClient(this);
  }
}

Obviously, this code is redundant and the Credentials != null check is pointless. Nevertheless, the code works. And is auto-generated. For this reason, no complaints here.

V3022 Expression '_queryParameters.Count > 0' is always false. ConfigurationsOperations.cs 871

// Code generated by Microsoft (R) AutoRest Code Generator.
....
public async Task<AzureOperationResponse<IPage<ConfigData>>>
  ListBySubscriptionNextWithHttpMessagesAsync(....)
{
  ....
  List<string> _queryParameters = new List<string>();
  if (_queryParameters.Count > 0)
  {
    ....
  }
  ....
}

Again, it seems like an illogical construction. For some reason, code authors check the size of the newly created empty list. In fact, it's all correct. At this point, the check makes no sense, but in case if developers add list generation, for example, based on another collection, the check will definitely be worth-while. Again - no claims to the code, of course, with regards to its origin.

Hundreds of similar warnings have been issued for each auto-generated solution. Given their futility, I think there is no point in further discussing such cases. Next, only real errors in the "normal" code will be considered.

Azure.Core

V3001 There are identical sub-expressions 'buffer.Length' to the left and to the right of the '<' operator. AzureBaseBuffersExtensions.cs 30

public static async Task WriteAsync(...., ReadOnlyMemory<byte> buffer, ....)
{
  byte[]? array = null;
  ....
  if (array == null || buffer.Length < buffer.Length)  // <=
  {
    if (array != null)
      ArrayPool<byte>.Shared.Return(array);
    array = ArrayPool<byte>.Shared.Rent(buffer.Length);
  }
  if (!buffer.TryCopyTo(array))
    throw new Exception("could not rent large enough buffer.");
  ....
}

The error in the condition was probably the result of copy-paste. According to the fact that buffer is copied in array, the check should look like:

if (array == null || array.Length < buffer.Length)

Anyway, as I always say, the code author should deal with fixing such errors.

V3083 Unsafe invocation of event '_onChange', NullReferenceException is possible. Consider assigning event to a local variable before invoking it. ClientOptionsMonitor.cs 44

private event Action<TOptions, string> _onChange;
....
private void InvokeChanged(....)
{
  ....
  if (_onChange != null)
  {
    _onChange.Invoke(options, name);
  }
}

Not critical, but an error is here. The consumer might unsubscribe from the event between checking the event for null and its invocation. Then the _onChange variable will be null and an exception will be thrown. This code has to be rewritten in a safer way. For example, as follows:

private void InvokeChanged(....)
{
  ....
  _onChange?.Invoke(options, name);
}

Azure.Messaging.EventHubs

V3080 Possible null dereference. Consider inspecting 'eventPropertyValue'. AmqpMessageConverter.cs 650

private static bool TryCreateEventPropertyForAmqpProperty(
  object amqpPropertyValue,
  out object eventPropertyValue)
{
  eventPropertyValue = null;
  ....
  switch (GetTypeIdentifier(amqpPropertyValue))
  {
    case AmqpProperty.Type.Byte:
    ....
    case AmqpProperty.Type.String:
      eventPropertyValue = amqpPropertyValue;
      return true;
    ....
  }
  ....
  switch (amqpPropertyValue)
  {
    case AmqpSymbol symbol:
      eventPropertyValue = ....;
      break;

    case byte[] array:
      eventPropertyValue = ....;
      break;

    case ArraySegment<byte> segment when segment.Count == segment.Array.Length:
      eventPropertyValue = ....;
      break;

    case ArraySegment<byte> segment:
      ....
      eventPropertyValue = ....;
      break;

    case DescribedType described when (described.Descriptor is AmqpSymbol):
      eventPropertyValue = ....;
      break;

    default:
      var exception = new SerializationException(
        string.Format(...., eventPropertyValue.GetType().FullName));  // <=
      ....
  }

  return (eventPropertyValue != null);
}

Let's see what happens with the eventPropertyValue variable value in the given code fragment. The variable is assigned null at the beginning of the method. Further, in one of the first switch conditions, the variable is initialized, after which the method exits. The second switch block contains many conditions, in each of which the variable also receives a new value. Whereas in the default block, the eventPropertyValue variable is used without any check, which is a mistake, as the variable is null at this moment.

V3066 Possible incorrect order of arguments passed to 'EventHubConsumer' constructor: 'partitionId' and 'consumerGroup'. TrackOneEventHubClient.cs 394

public override EventHubConsumer CreateConsumer(....)
{
  return new EventHubConsumer
  (
    new TrackOneEventHubConsumer(....),
    TrackOneClient.EventHubName,
    partitionId,                  // <= 3
    consumerGroup,                // <= 4
    eventPosition,
    consumerOptions,
    initialRetryPolicy
  );
}

The analyzer suspected confused order of the third and fourth arguments when calling the EventHubConsumer class constructor. So let's check this constructor declaration out:

internal EventHubConsumer(TransportEventHubConsumer transportConsumer,
                          string eventHubName,
                          string consumerGroup,         // <= 3
                          string partitionId,           // <= 4
                          EventPosition eventPosition,
                          EventHubConsumerOptions consumerOptions,
                          EventHubRetryPolicy retryPolicy)
{
  ....
}

Indeed, arguments are mixed up. I would venture to suggest how the error was made. Perhaps, incorrect code formatting is to blame here. Just take another look at the EventHubConsumer constructor declaration. Due to the fact that the first transportConsumer parameter is on the same line with the class name, it may seem that the partitionId parameter is at the third place, not the fourth (my comments with the parameter numbers are not available in the original code). That's just a guess, but I'd change the constructor code formatting to the following:

internal EventHubConsumer
(
  TransportEventHubConsumer transportConsumer,
  string eventHubName,
  string consumerGroup,
  string partitionId,
  EventPosition eventPosition,
  EventHubConsumerOptions consumerOptions,
  EventHubRetryPolicy retryPolicy)
{
  ....
}

Azure.Storage

V3112 An abnormality within similar comparisons. It is possible that a typo is present inside the expression 'ContentLanguage == other.ContentEncoding'. BlobSasBuilder.cs 410

public struct BlobSasBuilder : IEquatable<BlobSasBuilder>
{
  ....
  public bool Equals(BlobSasBuilder other) =>
    BlobName == other.BlobName &&
    CacheControl == other.CacheControl &&
    BlobContainerName == other.BlobContainerName &&
    ContentDisposition == other.ContentDisposition &&
    ContentEncoding == other.ContentEncoding &&         // <=
    ContentLanguage == other.ContentEncoding &&         // <=
    ContentType == other.ContentType &&
    ExpiryTime == other.ExpiryTime &&
    Identifier == other.Identifier &&
    IPRange == other.IPRange &&
    Permissions == other.Permissions &&
    Protocol == other.Protocol &&
    StartTime == other.StartTime &&
    Version == other.Version;
}

A mistake made by inattention. Finding such an error with code review is quite difficult. Here is the correct version of code:

    ....
    ContentEncoding == other.ContentEncoding &&
    ContentLanguage == other.ContentLanguage &&
    ....

V3112 An abnormality within similar comparisons. It is possible that a typo is present inside the expression 'ContentLanguage == other.ContentEncoding'. FileSasBuilder.cs 265

public struct FileSasBuilder : IEquatable<FileSasBuilder>
{
  ....
  public bool Equals(FileSasBuilder other)
    => CacheControl == other.CacheControl
    && ContentDisposition == other.ContentDisposition
    && ContentEncoding == other.ContentEncoding         // <=
    && ContentLanguage == other.ContentEncoding         // <=
    && ContentType == other.ContentType
    && ExpiryTime == other.ExpiryTime
    && FilePath == other.FilePath
    && Identifier == other.Identifier
    && IPRange == other.IPRange
    && Permissions == other.Permissions
    && Protocol == other.Protocol
    && ShareName == other.ShareName
    && StartTime == other.StartTime
    && Version == other.Version
    ;

There is exactly the same error in a very similar piece of code. The code might have been copied and partially changed. But the error remained.

Microsoft.Azure.Batch

V3053 An excessive expression. Examine the substrings 'IList' and 'List'. PropertyData.cs 157

V3053 An excessive expression. Examine the substrings 'List' and 'IReadOnlyList'. PropertyData.cs 158

public class PropertyData
{
  ....
  public bool IsTypeCollection => this.Type.Contains("IList") ||
                                  this.Type.Contains("IEnumerable") ||
                                  this.Type.Contains("List") ||        // <=
                                  this.Type.Contains("IReadOnlyList"); // <=
}

The analyzer issued two warnings about pointless or erroneous checks. In the first case, search for the "List" substring after searching for "IList" looks redundant. It's true, this condition:

this.Type.Contains("IList") || this.Type.Contains("List")

can be well changed for the following:

this.Type.Contains("List")

In the second case, search for the "IReadOnlyList" substring is pointless, as previously a shorter substring "List" is searched for.

There is also a chance that search substrings themselves have errors and there should be something else. Anyway, only the code author is to suggest the correct code version taking into account both comments.

V3095 The 'httpRequest.Content.Headers' object was used before it was verified against null. Check lines: 76, 79. BatchSharedKeyCredential.cs 76

public override Task ProcessHttpRequestAsync(
  HttpRequestMessage httpRequest, ....)
{
  ....
  signature.Append(httpRequest.Content != null
    && httpRequest.Content.Headers.Contains("Content-Language") ? .... :  
                                                                  ....;

  long? contentLength = httpRequest.Content?.Headers?.ContentLength;
  ....
}

The httpRequest.Content.Headers variable is first used without any checks, but later it is addressed using the conditional access operator.

V3125 The 'omPropertyData' object was used after it was verified against null. Check lines: 156, 148. CodeGenerationUtilities.cs 156

private static string GetProtocolCollectionToObjectModelCollectionString(
  ...., PropertyData omPropertyData, ....)
{
  if (IsMappedEnumPair(omPropertyData?.GenericTypeParameter, ....))
  {
    ....
  }

  if (IsTypeComplex(omPropertyData.GenericTypeParameter))
  ....
}

And here is a reverse situation. One code block contains safe access variant to the omPropertyData potentially null reference. Further in the code, this reference is handled without any checks.

V3146 Possible null dereference of 'value'. The 'FirstOrDefault' can return default null value. BatchSharedKeyCredential.cs 127

public override Task
  ProcessHttpRequestAsync(HttpRequestMessage httpRequest, ....)
{
  ....
  foreach (string canonicalHeader in customHeaders)
  {
    string value = httpRequest.Headers.
                   GetValues(canonicalHeader).FirstOrDefault();
    value = value.Replace('\n', ' ').Replace('\r', ' ').TrimStart();
    ....
  }
  ....
}

Due to the FirstOrDefault method, if the search fails, the default value will be returned, which is null for the string type. The value will be assigned to the value variable, which is then used in the code with the Replace method without any checks. The code should be made safer. For example, as follows:

foreach (string canonicalHeader in customHeaders)
{
  string value = httpRequest.Headers.
                 GetValues(canonicalHeader).FirstOrDefault();
  value = value?.Replace('\n', ' ').Replace('\r', ' ').TrimStart();
  ....
}

Microsoft.Azure.ServiceBus

V3121 An enumeration 'BlocksUsing' was declared with 'Flags' attribute, but does not set any initializers to override default values. Fx.cs 69

static class Fx
{
  ....
  public static class Tag
  {
    ....
    [Flags]
    public enum BlocksUsing
    {
      MonitorEnter,
      MonitorWait,
      ManualResetEvent,
      AutoResetEvent,
      AsyncResult,
      IAsyncResult,
      PInvoke,
      InputQueue,
      ThreadNeutralSemaphore,
      PrivatePrimitive,
      OtherInternalPrimitive,
      OtherFrameworkPrimitive,
      OtherInterop,
      Other,

      NonBlocking,
    }
    ....
  }
  ....
}

The enumeration is declared with the Flags attribute. At the same time, constant values are left by default (MonitorEnter = 0, MonitorWait = 1, ManualResetEvent = 2 and so on). This may result in the following case: when trying to use flags combination, for example, the second and the third constants MonitorWait (=1) | ManualResetEvent (=2), not a unique value will be received, but the constant with the value 3 by default (AutoResetEvent). This may come as a surprise for the caller code. If the BlocksUsing enumeration is really to be used for setting flags combinations (bit field), constants should be given values, equal to number that are powers of two.

[Flags]
public enum BlocksUsing
{
  MonitorEnter = 1,
  MonitorWait = 2,
  ManualResetEvent = 4,
  AutoResetEvent = 8,
  AsyncResult = 16,
  IAsyncResult = 32,
  PInvoke = 64,
  InputQueue = 128,
  ThreadNeutralSemaphore = 256,
  PrivatePrimitive = 512,
  OtherInternalPrimitive = 1024,
  OtherFrameworkPrimitive = 2048,
  OtherInterop = 4096,
  Other = 8192,

  NonBlocking = 16384,
}

V3125 The 'session' object was used after it was verified against null. Check lines: 69, 68. AmqpLinkCreator.cs 69

public async Task<Tuple<AmqpObject, DateTime>> CreateAndOpenAmqpLinkAsync()
{
  ....
  AmqpSession session = null;
  try
  {
    // Create Session
    ....
  }
  catch (Exception exception)
  {
    ....
    session?.Abort();
    throw AmqpExceptionHelper.GetClientException(exception, null,
      session.GetInnerException(), amqpConnection.IsClosing());
  }
  ....
}

Pay attention to the session variable handling in the catch block. The Abort method is called safely by the conditional access operator. But after the GetInnerException method is called unsafely. In doing so, NullReferenceException might be thrown instead of an exception of the expected type. This code has to be fixed. The AmqpExceptionHelper.GetClientException method supports passing the null value for the innerException parameter:

public static Exception GetClientException(
  Exception exception, 
  string referenceId = null, 
  Exception innerException = null, 
  bool connectionError = false)
{
  ....
}

Therefore, one can use only the conditional access operator when calling session.GetInnerException():

public async Task<Tuple<AmqpObject, DateTime>> CreateAndOpenAmqpLinkAsync()
{
  ....
  AmqpSession session = null;
  try
  {
    // Create Session
    ....
  }
  catch (Exception exception)
  {
    ....
    session?.Abort();
    throw AmqpExceptionHelper.GetClientException(exception, null,
      session?.GetInnerException(), amqpConnection.IsClosing());
  }
  ....
}

Conclusion

As you can see, a large project size doesn't always guarantee a lot of errors. However, we stay alert since we can always find something. Even in a project as structurally complex as Azure SDK for .NET. Finding some crucial defects requires additional efforts. But the more difficulties the more pleasant the result. On the other hand, to avoid undue efforts, we recommend using static analysis right on developers' computers when writing new code. This is the most effective approach. Download and try PVS-Studio in action. Good luck in fighting bugs!