Nikita Lipilin

Aug 17 2021

Tags:

#CSharp #Compiler

Is PHP compilable?! PVS-Studio searches for errors in PeachPie

Aug 17 2021

Author: Nikita Lipilin

WriteLine problems
Same cases in switch
Immutable DateTime
Try-methods with peculiarities
Default argument value
Copy-paste sends greetings
Just an extra check in the if statement
Could there be null?
Fatal warning and exceptions
Another extra check? An exception again!
Assignments and reassignments
These immutable strings...
or throw != or null
Strange 'while true'
data == null && throw NullReferenceException
Wrong exception
Don't blame String.Join!
Tiny assignment in a long method
null dereference in a boolean expression
That's it!

PHP is widely known as an interpreted programming language used mainly for website development. However, few people know that PHP also has a compiler to .NET – PeachPie. But how well is it made? Will the static analyzer be able to find actual bugs in this compiler? Let's find out!

It's been a while since we posted articles on the C# projects check using PVS-Studio... And we still have to make the 2021 Top list of bugs (by the way, 2020 Top 10 bugs, you can find here)! Well, we need to mend our ways. I am excited to show you a review of the PeachPie check results.

To begin with, let me tell you a little about the project. PeachPie is a modern, open-source PHP language compiler and runtime for the .NET Framework and .NET. It is built on top of the Microsoft Roslyn compiler platform and is based on the first-generation Phalanger project. In July 2017, the project became a member of the .NET Foundation. The source code is available in the GitHub repository.

By the way, our C# analyzer also makes extensive use of the Roslyn capabilities, so in a way, PeachPie and PVS-Studio have something in common :). We've worked with Roslyn before. Moreover, we wrote a whole article about the basics of working with this platform.

To check PeachPie, we had to install the analyzer, open the project in Visual Studio or Rider and run the analysis using the PVS-Studio plugin. For more details, see the documentation.

It was entertaining to check such a large and serious project. I hope you'll also enjoy my review of the bugs found in PeachPie. Have fun reading!

WriteLine problems

Well, let's start with an easy one :) Sometimes bugs can appear in the most unexpected and at the same time the simplest places. For example, an error may even appear in an easy WriteLine function call:

public static bool mail(....)
{
  // to and subject cannot contain newlines, replace with spaces
  to = (to != null) ? to.Replace("\r\n", " ").Replace('\n', ' ') : "";
  subject = (subject != null) ? subject.Replace("\r\n", " ").Replace('\n', ' ')
                              : "";

  Debug.WriteLine("MAILER",
                  "mail('{0}','{1}','{2}','{3}')",
                  to,
                  subject,
                  message, 
                  additional_headers);

  var config = ctx.Configuration.Core;
  
  ....
}

The V3025 warning: Incorrect format. A different number of format items is expected while calling 'WriteLine' function. Arguments not used: 1st, 2nd, 3rd, 4th, 5th. Mail.cs 25

You would think, what has gone wrong? Everything seems to be fine. Wait a minute, though! What argument should pass the format?

Well, let's take a look at the Debug.WriteLine declaration:

public static void WriteLine(string format, params object[] args);

The format string should be passed as the first argument, and the first argument in the code is "MAILER". Obviously, the developer mixed up the methods and passed the arguments incorrectly.

Same cases in switch

This section is devoted to warnings associated with performing the same actions in different case branches:

private static FlowAnalysisAnnotations DecodeFlowAnalysisAttributes(....)
{
  var result = FlowAnalysisAnnotations.None;

  foreach (var attr in attributes)
  {
    switch (attr.AttributeType.FullName)
    {
      case "System.Diagnostics.CodeAnalysis.AllowNullAttribute":
        result |= FlowAnalysisAnnotations.AllowNull;
        break;
      case "System.Diagnostics.CodeAnalysis.DisallowNullAttribute":
        result |= FlowAnalysisAnnotations.DisallowNull;
        break;
      case "System.Diagnostics.CodeAnalysis.MaybeNullAttribute":
        result |= FlowAnalysisAnnotations.MaybeNull;
        break;
      case "System.Diagnostics.CodeAnalysis.MaybeNullWhenAttribute":
        if (TryGetBoolArgument(attr, out bool maybeNullWhen))
        {
          result |= maybeNullWhen ? FlowAnalysisAnnotations.MaybeNullWhenTrue
                                  : FlowAnalysisAnnotations.MaybeNullWhenFalse;
        }
        break;
      case "System.Diagnostics.CodeAnalysis.NotNullAttribute":
        result |= FlowAnalysisAnnotations.AllowNull;
        break;
    }
  }
}

This fragment contains if not an error, then at least a strange thing. How quickly can you find it?

However, don't waste your time, the analyzer found everything for us:

private static FlowAnalysisAnnotations DecodeFlowAnalysisAttributes(....)
{
  var result = FlowAnalysisAnnotations.None;

  foreach (var attr in attributes)
  {
    switch (attr.AttributeType.FullName)
    {
      case "System.Diagnostics.CodeAnalysis.AllowNullAttribute":
        result |= FlowAnalysisAnnotations.AllowNull;
        break;
      ....
      case "System.Diagnostics.CodeAnalysis.NotNullAttribute":
        result |= FlowAnalysisAnnotations.AllowNull;              // <=
        break;
    }
  }
}

The V3139 warning: Two or more case-branches perform the same actions. ReflectionUtils.Nullability.cs 170

Isn't it strange that two different cases are handled the same way? In fact, no, this happens quite often. However, there are 2 peculiarities.

Firstly, it's worth noting that there is more graceful way to treat two different cases in the same way. You can rewrite the above fragment as follows:

switch (attr.AttributeType.FullName)
{
  case "System.Diagnostics.CodeAnalysis.AllowNullAttribute":
  case "System.Diagnostics.CodeAnalysis.NotNullAttribute":
    result |= FlowAnalysisAnnotations.AllowNull;
    break;
  ....
}

However, developers often neglect this convenient method and prefer copy-paste. Therefore, the presence of two identical branches doesn't seem so terrible. The fact that the FlowAnalysisAnnotations enumeration has, among others, the FlowAnalysisAnnotations.NotNull value is much more suspicious. This value seems to be used when the "System.Diagnostics.CodeAnalysis.NotNullAttribute" value is processed:

switch (attr.AttributeType.FullName)
{
  case "System.Diagnostics.CodeAnalysis.AllowNullAttribute":
    result |= FlowAnalysisAnnotations.AllowNull;
    break;
  ....
  case "System.Diagnostics.CodeAnalysis.NotNullAttribute":
    result |= FlowAnalysisAnnotations.NotNull;              // <=
    break;
}

Immutable DateTime

Developers often make mistakes because they don't understand how the features of the "modifying" methods work. Here is the bug found in PeachPie:

using System_DateTime = System.DateTime;

internal static System_DateTime MakeDateTime(....) { .... }

public static long mktime(....)
{
  var zone = PhpTimeZone.GetCurrentTimeZone(ctx);
  var local = MakeDateTime(hour, minute, second, month, day, year);

  switch (daylightSaving)
  {
    case -1:
      if (zone.IsDaylightSavingTime(local))
        local.AddHours(-1);                   // <=
      break;
    case 0:
      break;
    case 1:
      local.AddHours(-1);                     // <=
      break;
    default:
      PhpException.ArgumentValueNotSupported("daylightSaving", daylightSaving);
      break;
  }
  return DateTimeUtils.UtcToUnixTimeStamp(TimeZoneInfo.ConvertTime(local, 
                                                                   ....));
}

The PVS-Studio warnings:

V3010 The return value of function 'AddHours' is required to be utilized. DateTimeFunctions.cs 1232
V3010 The return value of function 'AddHours' is required to be utilized. DateTimeFunctions.cs 1239

The analyzer reports that the results of calls should be recorded somewhere – otherwise, they do not make any sense. The fact is, methods like AddHours don't change the original object – instead, a new object is returned, and it differs from the original one accordingly. It's difficult to say how critical this mistake is, but it's clear that the code fragment doesn't work correctly.

Try-methods with peculiarities

Try-methods are often super convenient for developing apps in C#. The best-known try-methods are int.TryParse, Dictionary.TryGetValue, etc. Usually, these methods return a flag that indicates the operation success. The result is written to the out parameter. The PeachPie developers decided to implement their try-methods which were supposed to work the same way. What came of it? Let's look at the following code:

internal static bool TryParseIso8601Duration(string str,
                                             out DateInfo result,
                                             out bool negative)
{
  ....
  if (pos >= length) goto InvalidFormat;

  if (s[pos++] != 'P') goto InvalidFormat;

  if (!Core.Convert.TryParseDigits(....))
    goto Error;
  
  if (pos >= length) goto InvalidFormat;

  if (s[pos] == 'Y')
  {
    ....

    if (!Core.Convert.TryParseDigits(....)) 
      goto Error;

    if (pos >= length) goto InvalidFormat;
  }
  ....
  InvalidFormat:
  Error:

    result = default;
    negative = default;
    return false;
}

This method is shortened for readability. You can find the full method by clicking the link. Core.Convert.TryParseDigits is called many times in the method. In cases when such a call returns false, the thread of execution jumps to the Error label, which is logical.

On the Error label, default values are assigned to out-parameters. Then, the method returns false. Everything looks logical – the TryParseIso8601Duration method behaves exactly like standard try-methods. Well... At least, it's what it looks like. In fact, it's not like that :(.

As I mentioned earlier if Core.Convert.TryParseDigits returns false, the code jumps to the Error label, where the bug/issue handling is performed. However, here's the trouble – the analyzer reports that TryParseDigits never returns false:

The V3022 warning: Expression '!Core.Convert.TryParseDigits(s, ref pos, false, out value, out numDigits)' is always false. DateTimeParsing.cs 1440

If the negation of the call result is always false, then the call always returns true. What a specific behavior for the try-method! Is the operation always successful? Let's finally look at TryParseDigits:

public static bool TryParseDigits(....)
{
  Debug.Assert(offset >= 0);

  int offsetStart = offset;
  result = 0;
  numDigits = 0;

  while (....)
  {
    var digit = s[offset] - '0';

    if (result > (int.MaxValue - digit) / 10)
    {
      if (!eatDigits)
      {
        // overflow
        //return false;
        throw new OverflowException();
      }

      ....

      return true;
    }

    result = result * 10 + digit;
    offset++;
  }

  numDigits = offset - offsetStart;
  return true;
}

The method does always return true. But the operation may fail – in this case, an exception of the OverflowException type is thrown. As for me, this is clearly not what you expect from a try-method :). By the way, there is a line with return false, but it is commented out.

Perhaps, the use of an exception here is somehow justified. But according to the code, it seems that something went wrong. TryParseDigits and TryParseIso8601Duration using it are supposed to work like the usual try-methods – return false in case of failure. Instead, they throw unexpected exceptions.

Default argument value

The following analyzer message is simpler, but it also points to a rather strange code fragment:

private static bool Put(Context context,
                        PhpResource ftp_stream,
                        string remote_file,
                        string local_file,
                        int mode,
                        bool append,
                        int startpos)
{ .... }

public static bool ftp_put(Context context,
                           PhpResource ftp_stream,
                           string remote_file,
                           string local_file,
                           int mode = FTP_IMAGE,
                           int startpos = 0)
{
    return Put(context,
               ftp_stream,
               remote_file,
               local_file,
               mode = FTP_IMAGE, // <=
               false,
               startpos);
}

The V3061 warning: Parameter 'mode' is always rewritten in method body before being used. Ftp.cs 306

The ftp_put method accepts a number of parameters as input, one of the parameters is mode. It has a default value, but when it's called, clearly, you can set another value. However, this doesn't affect anything – mode is always overwritten, and the Put method always receives the value of the FTP_IMAGE constant.

It's difficult to say why everything is written this way – the construct seems meaningless. An error is most likely to be here.

Copy-paste sends greetings

The following code fragment looks like a copy-paste victim:

public static PhpValue filter_var(....)
{
  ....
  if ((flags & (int)FilterFlag.NO_PRIV_RANGE) == (int)FilterFlag.NO_PRIV_RANGE)
  {
    throw new NotImplementedException();
  }

  if ((flags & (int)FilterFlag.NO_PRIV_RANGE) == (int)FilterFlag.NO_RES_RANGE)
  {
    throw new NotImplementedException();
  }
  ....
}

The V3127 warning: Two similar code fragments were found. Perhaps, this is a typo and 'NO_RES_RANGE' variable should be used instead of 'NO_PRIV_RANGE' Filter.cs 771

It seems, the second condition had to be written this way:

(flags & (int)FilterFlag.NO_RES_RANGE) == (int)FilterFlag.NO_RES_RANGE

Anyway, this option looks more logical and clear.

Just an extra check in the if statement

Let's diversify our article with usual redundant code:

internal static NumberInfo IsNumber(....)
{
  ....
  int num = AlphaNumericToDigit(c);

  // unexpected character:
  if (num <= 15)
  {
    if (l == -1)
    {
      if (   longValue < long.MaxValue / 16 
          || (   longValue == long.MaxValue / 16 
              && num <= long.MaxValue % 16))         // <=
      {
        ....
      }
      ....
    }
    ....
  }
  ....
}

The V3063 warning: A part of conditional expression is always true if it is evaluated: num <= long.MaxValue % 16. Conversions.cs 994

Firstly, I'd like to say that the function code is significantly shortened for readability. Click on the link to see the full IsNumber source code – but let me warn you – it's not easy to read. The function contains more than 300 code lines. It seems to go beyond accepted "one screen" :).

Let's move on to the warning. In the outer block the value of the num variable is checked – it must be less than or equal to 15. In the inner block num is checked – it must be less than or equal to long.MaxValue % 16. In doing so, the value of this expression is 15 – it's easy to check. The code turns out to check twice that num is less than or equal to 15.

This warning hardly indicates a real bug – someone just wrote an extra check. Maybe it was done on purpose – for example, to ease the reading of this exact code. Although the use of some variable or constant to store the comparison result seems to be an easier option. Anyway, the construct is redundant, and it's the static analyzer duty to report this.

Could there be null?

Developers often miss checks for null. The situation is particularly interesting when a variable was checked in one place of the function, and in another (where it can still be null) – they forgot or didn't find it necessary. And here we can only guess whether the check was redundant or there was a lack of it in some places. Null checks do not always involve the use of comparison operators – for example, the code fragment below shows that the developer used the null-conditional operator:

public static string get_parent_class(....)
{
  if (caller.Equals(default))
  {
    return null;
  }

  var tinfo = Type.GetTypeFromHandle(caller)?.GetPhpTypeInfo();
  return tinfo.BaseType?.Name;
}

The V3105 warning: The 'tinfo' variable was used after it was assigned through null-conditional operator. NullReferenceException is possible. Objects.cs 189

According to the developer, the Type.GetTypeFromHandle(caller) call can return null – that's why "?." was used to call GetPhpTypeInfo. The documentation proves that it's possible.

Yay, "?." saves from one exception. If the GetTypeFromHandle call returns null, then the tinfo variable is also assigned null. But when you try to access the BaseType property, another exception is thrown. Most likely, the last line misses another "?":

return tinfo?.BaseType?.Name;

Fatal warning and exceptions

Get ready, in this part you will find a real investigation...

Here we have another warning related to null check. The triggering turned out to be much more exciting than it looked at first glance. Take a look at the code fragment:

static HashPhpResource ValidateHashResource(HashContext context)
{
  if (context == null)
  {
    PhpException.ArgumentNull(nameof(context));
  }

  return context.HashAlgorithm;
}

The V3125 warning: The 'context' object was used after it was verified against null. Check lines: 3138, 3133. Hash.cs 3138

Yes, the variable is checked for null, and then the property access without any verification occurs. However, look what happens if the variable value is null:

PhpException.ArgumentNull(nameof(context));

It appears that if the context is equal to null, the execution thread doesn't get to the HashAlgorithm property access. Therefore, this code is safe. Is it a false positive?

Of course, the analyzer can make mistakes. However, I know that PVS-Studio can handle such situations – the analyzer should have known that at the time of accessing HashAlgorithm, the context variable cannot be equal to null.

But what exactly does the PhpException.ArgumentNull call do? Let's take a look:

public static void ArgumentNull(string argument)
{
  Throw(PhpError.Warning, ErrResources.argument_null, argument);
}

Hmm, looks like something is thrown. Pay attention to the first argument of the call — PhpError.Warning. Hmm, well, let's move on to the Throw method:

public static void Throw(PhpError error, string formatString, string arg0)
{
  Throw(error, string.Format(formatString, arg0));
}

Basically, there's nothing interesting here, take a look at another Throw overloading:

public static void Throw(PhpError error, string message)
{
  OnError?.Invoke(error, message);

  // throw PhpFatalErrorException
  // and terminate the script on fatal error
  if ((error & (PhpError)PhpErrorSets.Fatal) != 0)
  {
    throw new PhpFatalErrorException(message, innerException: null);
  }
}

And here is what we are looking for! It turns out that under the hood of this entire system there is PhpFatalErrorException. The exception seems to be thrown occasionally.

Firstly, it's worth looking at the places where the handlers of the OnError event are registered. They can throw exceptions too – that would be a little unexpected, but you never know. There are a few handlers, and they all are related to logging the corresponding messages. One handler is in the PhpHandlerMiddleware file:

PhpException.OnError += (error, message) =>
{
  switch (error)
  {
    case PhpError.Error:
      logger.LogError(message);
      break;

    case PhpError.Warning:
      logger.LogWarning(message);
      break;

    case PhpError.Notice:
    default:
      logger.LogInformation(message);
      break;
  }
};

Two another handlers are in the PhpException class:

// trace output
OnError += (error, message) =>
{
  Trace.WriteLine(message, $"PHP ({error})");
};

// LogEventSource
OnError += (error, message) =>
{
  if ((error & (PhpError)PhpErrorSets.Fatal) != 0)
  {
    LogEventSource.Log.HandleFatal(message);
  }
  else
  {
    LogEventSource.Log.HandleWarning(message);
  }
};

Thus, event handlers do not generate any exceptions. So, let's go back to the Throw method.

public static void Throw(PhpError error, string message)
{
  OnError?.Invoke(error, message);

  // throw PhpFatalErrorException
  // and terminate the script on fatal error
  if ((error & (PhpError)PhpErrorSets.Fatal) != 0)
  {
    throw new PhpFatalErrorException(message, innerException: null);
  }
}

As everything is clear with OnError, let's take a closer look at the condition:

(error & (PhpError)PhpErrorSets.Fatal) != 0

The error parameter stores the value of the PhpError enumeration. Earlier, we noticed that the error parameter receives PhpError.Warning. An exception is thrown if the result of applying "bitwise AND" to the error and PhpErrorSets.Fatal is non-zero.

The PhpErrorSets.Fatal value is a "union" of the PhpError enumeration elements created by the "bitwise OR" operation:

Fatal =   PhpError.E_ERROR | PhpError.E_COMPILE_ERROR
        | PhpError.E_CORE_ERROR | PhpError.E_USER_ERROR

Below you can see the values of all the enumeration elements mentioned earlier:

E_ERROR = 1,
E_WARNING = 2,
E_CORE_ERROR = 16,
E_COMPILE_ERROR = 64,
E_USER_ERROR = 256,
Warning = E_WARNING

The error & (PhpError)PhpErrorSets.Fatal operation returns a non-zero value only if the error parameter has one of the following values or a combination of them:

PhpError.E_ERROR,
PhpError.E_COMPILE_ERROR,
PhpError.E_CORE_ERROR,
PhpError.E_USER_ERROR

If the error parameter contains the PhpError.Warning value that equals PhpError.E_WARNING, the result of the "bitwise AND" operation is zero. Then the condition for throwing PhpFatalErrorException is not met.

Let's go back to the PhpException.ArgumentNull method:

public static void ArgumentNull(string argument)
{
  Throw(PhpError.Warning, ErrResources.argument_null, argument);
}

We found out that when the PhpError.Warning value is passed, there is no exception. Perhaps, the developer didn't want the exception to be thrown in cases when an unexpected null is passed. It's just...

static HashPhpResource ValidateHashResource(HashContext context)
{
  if (context == null)
  {
    PhpException.ArgumentNull(nameof(context)); // no exceptions
  }

  return context.HashAlgorithm; // context is potential null
}

If PhpException.ArgumentNull does not throw an exception (which is unexpected), then when we access the HashAlgorithm property, NullReferenceException occurs anyway!

You might ask: should an exception be thrown or not? If it should, then it makes more sense to use the same PhpFatalErrorException. If no one expects an exception here, then you need to correctly process the null value of the context parameter. For example, you can use "?.". Anyway, the analyzer dealt with this situation and even helped to understand the issue.

Another extra check? An exception again!

The last case proves that expecting an exception, you can get an unexpected null. The fragment below shows the opposite case:

public PhpValue offsetGet(PhpValue offset)
{
  var node = GetNodeAtIndex(offset);

  Debug.Assert(node != null);

  if (node != null)
    return node.Value;
  else
    return PhpValue.Null;
}

The V3022 warning: Expression 'node != null' is always true. Datastructures.cs 432

Well, there's no null here, then so be it! Why grumble? However, usually null is expected in cases when something is wrong. The code shows that this is exactly the case. But the analyzer insists that there couldn't be null.

You might think, it's all about the Debug.Assert call in this case. For better or for worse, this call doesn't affect the analyzer warnings.

If it's not about Debug.Assert, then what is it about? Why does the analyzer "thinks" that node is never equal to null? Let's take a look at the GetNodeAtIndex method, that returns the value written to node:

private LinkedListNode<PhpValue> GetNodeAtIndex(PhpValue index)
{
  return GetNodeAtIndex(GetValidIndex(index));
}

Well, we go deeper. Take a look at the GetNodeAtIndex method called here:

private LinkedListNode<PhpValue> GetNodeAtIndex(long index)
{
  var node = _baseList.First;
  while (index-- > 0 && node != null)
  {
    node = node.Next;
  }

  return node ?? throw new OutOfRangeException();
}

Look! It seems that the method could return null... No such luck! If the loop is terminated, and node is equal to null, an exception is thrown. This way, no null can be returned.

In case of an unexpected situation, the GetNodeAtIndex method doesn't return null, as expected in the offsetGet method code:

public PhpValue offsetGet(PhpValue offset)
{
  var node = GetNodeAtIndex(offset); // potential null expected

  Debug.Assert(node != null);

  if (node != null) // always true
    return node.Value;
  else
    return PhpValue.Null; // unreachable
}

When a developer reviews this method, they can easily get deceived. According to the code fragment, it seems that the correct value or PhpValue.Null is returned. In fact, this method can throw an exception.

The unexpected behavior of only one method in the call chain leads to unexpected behavior of all these methods – such a troublemaker! This example illustrates how useful static analysis is. It finds such problems automatically.

By the way, there is a similar problem in the offsetSet method from the same class:

public void offsetSet(PhpValue offset, PhpValue value)
{
  var node = GetNodeAtIndex(offset);

  Debug.Assert(node != null);

  if (node != null)
    node.Value = value;
}

The V3022 warning: Expression 'node != null' is always true. Datastructures.cs 444

Assignments and reassignments

Why don't we take a little break from all these investigations and have a cup of coffee?

While we're drinking coffee, let's take a look at a simple warning that indicates a weird code fragment:

internal StatStruct(Mono.Unix.Native.Stat stat)
{
  st_dev = (uint)stat.st_dev;
  st_ctime = stat.st_ctime_nsec;
  st_mtime = stat.st_mtime_nsec;
  st_atime = stat.st_atime_nsec;
  st_ctime = stat.st_ctime;
  st_atime = stat.st_atime;
  //stat.st_blocks;
  //stat.st_blksize;
  st_mtime = stat.st_mtime;
  st_rdev = (uint)stat.st_rdev;
  st_gid = (short)stat.st_gid;
  st_uid = (short)stat.st_uid;
  st_nlink = (short)stat.st_nlink;
  st_mode = (FileModeFlags)stat.st_mode;
  st_ino = (ushort)stat.st_ino;
  st_size = stat.st_size;
}

The PVS-Studio warnings:

V3008 The 'st_ctime' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 78, 75. StatStruct.cs 78
V3008 The 'st_atime' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 79, 77. StatStruct.cs 79

Looks like the developer got tangled in all these assignments and made a typo somewhere. Due to this, st_ctime and st_atime fields receive the values twice – and the second value is not the same as the first one.

It's an error, isn't it? But that's no fun! I suggest you practice your skills and search for deeper meaning. Then try to explain in the comments why everything is the way it is.

Meanwhile, let's keep going :)

These immutable strings...

At the very beginning of this article, when you were reading about the first warnings, we mentioned the immutability of DateTime structure instances. The following warnings remind us of a similar strings feature:

public TextElement Filter(IEncodingProvider enc,
                          TextElement input,
                          bool closing)
{
  string str = input.AsText(enc.StringEncoding);

  if (pending)
  {
    if (str.Length == 0) str = "\r";
    else if (str[0] != '\n') str.Insert(0, "\r"); // <=
  }

  str = str.Replace("\r\n", "\n");
  if (str.Length != 0)
  {
    pending = str[str.Length - 1] == '\r';

    if (!closing && pending) str.Remove(str.Length - 1, 1); // <=
  }

    
  return new TextElement(str);
}

The PVS-Studio warnings:

V3010 The return value of function 'Insert' is required to be utilized. Filters.cs 150
V3010 The return value of function 'Remove' is required to be utilized. Filters.cs 161

Everything is simple and clear – we wanted to modify the string, but something... went wrong :(.

or throw != or null

Recently, we analyzed a case when a developer expected the function to return null but got an exception instead. Here is something similar but simpler:

public static bool stream_wrapper_register(....)
{
  // check if the scheme is already registered:
  if (   string.IsNullOrEmpty(protocol)
      || StreamWrapper.GetWrapperInternal(ctx, protocol) == null)
  {
    // TODO: Warning?
    return false;
  }

  var wrapperClass = ctx.GetDeclaredTypeOrThrow(classname, true);
  if (wrapperClass == null) // <=
  {
    return false;
  }

  ....
}

The V3022 warning: Expression 'wrapperClass == null' is always false. Streams.cs 555

Of course, you can analyze it in detail, but... The method's name says it all! GetDeclaredTypeOrThrow sort of hints that it's going to throw an exception if something goes wrong. Again, here's the thing – this behavior is also passed to the stream_wrapper_register method. But the developer wanted this method to return false. No such luck, here is an exception!

In fact, we've already encountered deceptive names before. Do you remember when the PhpException.ArgumentNull method call didn't actually throw an exception? So, let's check whether GetDeclaredTypeOrThrow throws an exception:

PhpTypeInfo GetDeclaredTypeOrThrow(string name, bool autoload = false)
{
  return GetDeclaredType(name, autoload) ??
         throw PhpException.ClassNotFoundException(name);
}

Well, the PeachPie developers didn't try to trick you here – it is a real exception :).

Strange 'while true'

In some cases, developers use the true value as the while loop continuation condition. It seems to be a normal thing to do – to exit the loop, you can use break, return, or exceptions. Actually, the loop that has some expression (instead of the true keyword) as a condition looks far more than weird. The value of this expression always has the true value:

public static int stream_copy_to_stream(...., int offset = 0)
{
  ....
  if (offset > 0)
  {
    int haveskipped = 0;

    while (haveskipped != offset)  // <=
    {
      TextElement data;

      int toskip = offset - haveskipped;
      if (toskip > from.GetNextDataLength())
      {
        data = from.ReadMaximumData();
        if (data.IsNull) break;
      }
      else
      {
        data = from.ReadData(toskip, false);
        if (data.IsNull) break; // EOF or error.
        Debug.Assert(data.Length <= toskip);
      }

      Debug.Assert(haveskipped <= offset);
    }
  }
  ....
}

The V3022 warning: Expression 'haveskipped != offset' is always true. Streams.cs 769

The haveskipped variable is declared before the loop. It is initialized with the 0 value. This value remains with it... until its death. Sounds gloomy but it is what it is. In fact, haveskipped is a constant. The value of the offset parameter also remains the same during the loop performance. And it remains the same in any place of the function (you can check it here).

Did the developer plan to make the loop continuation condition always true? Theoretically, it's possible. But take a closer look at the loop. The following assignment looks strange:

int toskip = offset - haveskipped;

What's the point, if haveskipped is always equal to 0?

Something is wrong with the loop. Either a serious mistake is made here, or all these haveskipped weird things are the remains of some old unaccomplished ideas.

data == null && throw NullReferenceException

Often, the use of incorrect operators in conditions leads to bugs. There's a similar situation in the PHP compiler:

public string ReadStringContents(int maxLength)
{
  if (!CanRead) return null;
  var result = StringBuilderUtilities.Pool.Get();

  if (maxLength >= 0)
  {
    while (maxLength > 0 && !Eof)
    {
      string data = ReadString(maxLength);
      if (data == null && data.Length > 0) break; // EOF or error.
      maxLength -= data.Length;
      result.Append(data);
    }
  }
  ....
}

The V3080 warning: Possible null dereference. Consider inspecting 'data'. PhpStream.cs 1382

The value of the data variable is checked in the loop. If the variable equals null and its Length property has a positive value, then the loop exit occurs. Clearly, it's impossible. Moreover, we have an exception when accessing the Length variable that has the null value. Here, the access deliberately takes place when data = null.

Given the developer's comment, I would rewrite the condition something like this:

data == null || data.Length == 0

However, it doesn't mean that this is the correct handling option – to fix this issue, it's better to do deep code analysis.

Wrong exception

There are also bugs that do not look so terrible but still may cause problems. For example, in the following fragment, copy-paste hits again:

public bool addGlob(....)
{
  PhpException.FunctionNotSupported(nameof(addGlob));
  return false;
}

public bool addPattern(....)
{
  PhpException.FunctionNotSupported(nameof(addGlob));
  return false;
}

The V3013 warning: It is odd that the body of 'addGlob' function is fully equivalent to the body of 'addPattern' function (506, line 515). ZipArchive.cs 506

The addGlob function clearly isn't supported, so when the function is called, there is an exception indicating that the addGlob function isn't supported.

Believing me? Tricked you! There is no exception here. This is our old friend – PhpException:

public static class PhpException
{
  ....
  public static void FunctionNotSupported(string/*!*/function)
  {
    Debug.Assert(!string.IsNullOrEmpty(function));

    Throw(PhpError.Warning,
          ErrResources.notsupported_function_called,
          function);
  }
  ....
}

As we discussed earlier if the Throw method receives the PhpError.Warning value, there is no exception. But still, the appeared error is likely to be added to log or handled in some other way.

Let's go back to the original code fragment:

public bool addGlob(....)
{
  PhpException.FunctionNotSupported(nameof(addGlob));
  return false;
}

public bool addPattern(....)
{
  PhpException.FunctionNotSupported(nameof(addGlob));
  return false;
}

The addGlob function is not supported and when it's called, the corresponding message is handled somehow – let's assume that it is added to the log. The addPattern function isn't supported either, however, the corresponding message is still addressed to addGlob.

Clearly, it's a copy-paste error. It's easy to fix – you just need to report about addPattern, and not about addGlob in the addPattern method:

public bool addPattern(....)
{
  PhpException.FunctionNotSupported(nameof(addPattern));
  return false;
}

Don't blame String.Join!

Sometimes developers forget the features of some functions. That's why they check wrong values. As a result, the check turns out to be meaningless, and there is no check where it has to be. It seems that the same thing happened to the getallheaders function:

public static PhpArray getallheaders(Context ctx)
{
  var webctx = ctx.HttpPhpContext;
  if (webctx != null)
  {
    var headers = webctx.RequestHeaders;
    if (headers != null)
    {
      var result = new PhpArray(16);

      foreach (var h in headers)
      {
        result[h.Key] = string.Join(", ", h.Value) ?? string.Empty;
      }

      return result;
    }
  }

  return null;
}

The V3022 warning: Expression 'string.Join(", ", h.Value)' is always not null. The operator '??' is excessive. Web.cs 932

It's pointless to use the "??" operator here since the string.Join method never returns null. But it can always throw ArgumentNullException (You're welcome!).

string.Join throws an exception if the passed reference to the sequence equals null. Therefore, it's safer to write this line something like that:

result[h.Key] = h.Value != null ? string.Join(", ",h.Value) : string.Empty;

Actually, I want to know, whether it's possible for Value to be null at all? Maybe, we don't have to check anything here. To figure it out, firstly, we need to understand where the headers collection came from.

public static PhpArray getallheaders(Context ctx)
{
  var webctx = ctx.HttpPhpContext;
  if (webctx != null)
  {
    var headers = webctx.RequestHeaders;
    ....
  }

  return null;
}

The headers value is taken from webctx.requestHeaders, and the webctx value is taken from the HttpPhpContext property of the ctx object. And the HttpPhpContext property... Just take a look at this:

partial class Context : IEncodingProvider
{
  ....
  public virtual IHttpPhpContext? HttpPhpContext => null;
  ....
}

This, apparently, is something left for later. If you look at the getallheaders method again, you see that it never works at all and simply returns null.

Believing me again? But the property is virtual! Therefore, to understand what the getallheaders method can return, you need to analyze descendants. Personally, I decided to stop at this point – I still have to show other warnings.

Tiny assignment in a long method

Long and complex methods are likely to contain bugs. Over time, it's difficult for developers to navigate in a large chunk of code, while it's always terrifying to change it. Programmers add new code, the old one remains the same. Somehow this incredible construct works, thankfully. So, no surprise, there is some weirdness in such code. For example, take a look at the inflate_fast method:

internal int inflate_fast(....)
{
  ....
  int r;
  ....
  if (c > e)
  {
    // if source crosses,
    c -= e; // wrapped copy
    if (q - r > 0 && e > (q - r))
    {
      do
      {
        s.window[q++] = s.window[r++];
      }
      while (--e != 0);
    }
    else
    {
      Array.Copy(s.window, r, s.window, q, e);
      q += e; r += e; e = 0;                     // <=
    }
    r = 0;                                       // <=
  }
  ....
}

The V3008 warning: The 'r' variable is assigned values twice successfully. Perhaps this is a mistake. Check lines: 621, 619. InfCodes.cs 621

To begin with, here is a link to full code. The method has more than two hundred lines of code with a bunch of nested constructs. It seems that it would be difficult to puzzle it out.

The warning is unambiguous – first, a new value is assigned to the r variable in the block, and then it is definitely overwritten with zero. It's hard to say what exactly is wrong here. Either the nullifying works somehow wrong, or the r += e construction is superfluous here.

null dereference in a boolean expression

Earlier, we discussed the case when an incorrectly constructed logical expression leads to an exception. Here is another example of such a warning:

public static bool IsAutoloadDeprecated(Version langVersion)
{
  // >= 7.2
  return    langVersion != null && langVersion.Major > 7 
         || (langVersion.Major == 7 && langVersion.Minor >= 2);
}

The V3080 warning: Possible null dereference. Consider inspecting 'langVersion'. AnalysisFacts.cs 20

The code checks that the passed langVersion parameter doesn't equal null. So, the developer assumed that null could be passed during the call. Does the check save you from an exception?

Unfortunately, if the langVersion variable equals null, the value of the first part of the expression is false. When the second part is calculated, an exception is thrown.

Generally, to improve readability we need to additionally format the code fragments to post in an article. This case isn't an exception – the expression considered earlier, in fact, was written as one line:

Given the comment, you can easily understand that either the operator precedence is mixed up here, or the bracket is misplaced. The method is most likely to look as follows:

public static bool IsAutoloadDeprecated(Version langVersion)
{
  // >= 7.2
  return    langVersion != null 
         && (   langVersion.Major > 7 
             || langVersion.Major == 7 && langVersion.Minor >= 2);
}

That's it!

Actually, no. The analyzer issued about 5 hundred warnings for the entire project, and there are many curious ones left waiting for the investigation. Therefore, I still suggest you try PVS-Studio and see what else it may find in this or other projects. Who knows, maybe you'll manage to find some bugs that are even more exciting than all warnings that I've sorted out here :). Don't forget to mention the found warnings in the comments. The bugs you found may get into the 2021 Top 10!

Wish you good luck!

#CSharp #Compiler