It has become a "good tradition" for Microsoft to make their products open-source: CoreFX, .Net Compiler Platform (Roslyn), Code Contracts, MSBuild, and other projects. For us, the developers of PVS-Studio analyzer, it's an opportunity to check well-known projects, tell people (including the project authors themselves) about the bugs we find, and additionally test our analyzer. Today we are going to talk about the errors found in another project by Microsoft, PowerShell.
PowerShell is a cross-platform project by Microsoft consisting of a command-line shell and associated scripting language built on the Microsoft .NET Framework and integrated with it. PowerShell also provides convenient access to COM, WMI, and ADSI, and enables administrators to perform various tasks in a single environment both on local and remote Windows systems by running regular command-line commands.
The project code can be downloaded from the GitHub repository.
According to the statistics of the project repository, 93% of the code is written in C#.
The project was analyzed with PVS-Studio static code analyzer. The version we were using is currently in the development process, so it is newer than PVS-Studio 6.08 but it's not PVS-Studio 6.09 either. This approach enables us to put the new version through more extensive testing and fix possible defects. It does not replace the multilevel system of tests, of course (read about the seven testing techniques in the article discussing the development of the Linux-version), but rather is another way of testing the tool.
The up-to-date version of the analyzer can be downloaded here.
I updated the analyzer and downloaded the project's code, so everything was ready to go. Well, sometimes things get complicated as early as at the stage of preparing a project for analysis, i.e. at the building stage. It is recommended that you build projects before analyzing them. Why does it matter? The analyzer will have access to more information that way, so it will be able to provide a deeper analysis.
The most common (and convenient) way of using PVS-Studio is to run it from the Visual Studio IDE. It's quick, easy, and convenient. For PowerShell, however, it's a problem.
It turned out that the authors themselves didn't recommend using Visual Studio to build the project. They say it straightforward on GitHub: "We do not recommend building the PowerShell solution from Visual Studio."
Well, I couldn't resist the temptation to build and check it in Visual Studio, so I gave it a try anyway. This is what I got:
Figure 1. Project compilation errors (click to enlarge) when analyzing PowerShell from Visual Studio.
Well, that's sad. What did it mean in my situation? That I wouldn't be able to test all the features of the analyzer on this project. Then you have two scenarios.
Scenario 1. Check the project without building it.
A project would't build? OK, let's check it as it is.
What are the pros of this approach? You don't have to waste time figuring out the problem and trying various tricks to get the project built. It does help you save time; moreover, it is not guaranteed that your tricks will work out after all.
The cons of this approach are also clear. First, the analysis will be incomplete; some bugs will slip from the analyzer. You may also get a certain number of false positives. Second, it makes the estimate of the false/genuine warnings ratio pointless, as it may vary greatly for the built version.
However, even this scenario allows you to find a decent number of errors and write an article.
Scenario 2. Figure it all out and get the project built.
The pros and cons of this approach are opposite to those of the previous one. Yes, you'll have to spend more time on building, but it's not guaranteed that it will work out. If you do succeed, however, you'll be able to analyze the code more thoroughly and maybe find some interesting bugs.
There's no definite suggestion about what way to choose; everyone decides for themselves.
I struggled with the project for a while, trying to build it, and finally decided to go "as is". This approach was good enough for my goal to write an article.
Note. While it can't be built from Visual Studio, the project can be easily built by using the script (build.sh) located in the root directory.
Note 2. One of the developers (many thanks to him) told me that the *.sln-file was intended to make it more comfortable to work with the project, but it wasn't meant to be used for building, which is just another argument for choosing the first approach.
Duplicate subexpressions
Projects that trigger no V3001 warnings do deserve a medal. PowerShell, unfortunately, wouldn't get it, and here's why:
internal Version BaseMinimumVersion { get; set; }
internal Version BaseMaximumVersion { get; set; }
protected override void ProcessRecord()
{
if (BaseMaximumVersion != null &&
BaseMaximumVersion != null &&
BaseMaximumVersion < BaseMinimumVersion)
{
string message = StringUtil.Format(
Modules.MinimumVersionAndMaximumVersionInvalidRange,
BaseMinimumVersion,
BaseMaximumVersion);
throw new PSArgumentOutOfRangeException(message);
}
....
}
PVS-Studio warning: V3001 There are identical sub-expressions 'BaseMaximumVersion != null' to the left and to the right of the '&&' operator. System.Management.Automation ImportModuleCommand.cs 1663
The BaseMaximumVersion reference is tested for null twice, but it's obviously the BaseMinimumVersion reference that should be checked in the second case. If you are lucky, the program may run for a long time without this error ever showing up, but when it does occur, the information about BaseMinimumVersion will never be included in the error message formed when the exception is thrown, as the BaseMinimumVersion reference will be null. As a result, some portion of useful information will be lost.
Note that I fixed the code formatting in this example to make the error easier to notice. In the original code, however, the whole condition is written in one line, which is another example of why good code formatting is so important: not only does it make the code easier to read and understand, but it also makes errors easier to see.
internal static class RemoteDataNameStrings
{
....
internal const string MinRunspaces = "MinRunspaces";
internal const string MaxRunspaces = "MaxRunspaces";
....
}
internal void ExecuteConnect(....)
{
....
if
(
connectRunspacePoolObject.Data
.Properties[RemoteDataNameStrings.MinRunspaces] != null
&&
connectRunspacePoolObject.Data
.Properties[RemoteDataNameStrings.MinRunspaces] != null
)
{
try
{
clientRequestedMinRunspaces = RemotingDecoder.GetMinRunspaces(
connectRunspacePoolObject.Data);
clientRequestedMaxRunspaces = RemotingDecoder.GetMaxRunspaces(
connectRunspacePoolObject.Data);
clientRequestedRunspaceCount = true;
}
....
}
....
}
PVS-Studio warning: V3001 There are identical sub-expressions to the left and to the right of the '&&' operator. System.Management.Automation serverremotesession.cs 633
Again, there is a typo that causes one check to execute twice. What should be checked in the second case is most likely the constant field MaxRunspaces of the static class RemoteDataNameStrings.
Unused return value
There are errors that have to do with unused method return values. The reasons, as well as implications, vary a lot. Sometimes programmers forget that objects of type String are immutable and that string- modifying methods return a new string rather than changing the existing one. In the same way, using LINQ yields a new collection. Errors of this type were also found in PowerShell.
private CatchClauseAst CatchBlockRule(....
ref List<TypeConstraintAst> errorAsts)
{
....
if (errorAsts == null)
{
errorAsts = exceptionTypes;
}
else
{
errorAsts.Concat(exceptionTypes); // <=
}
....
}
PVS-Studio warning: V3010 The return value of function 'Concat' is required to be utilized. System.Management.Automation Parser.cs 4973
Note that the errorAsts parameter is used with the ref keyword, which implies that the reference gets changed in the method body. The logic of this code is simple: if the errorAsts reference is null, then it is assigned with a reference to another collection; otherwise, the elements of the exceptionTypes collection are added to the existing one. However, the second part doesn't work properly. The Concat method returns a new collection without modifying the existing one, so the errorAsts collection will remain unchanged, while the new one (containing the elements errorAsts and exceptionTypes) will be ignored.
There are two ways to fix this defect:
Checking a wrong reference after using the 'as' operator
The gold medal goes to the V3019 diagnostic rule! I'm not sure about all projects, but almost every C#-project that I checked and discussed in my articles had this bug. Our long-time readers must have learned this rule by heart: when casting a reference to another type by using the as operator, always make sure that you test the resulting reference, not the original one, for null.
internal List<Job> GetJobsForComputer(String computerName)
{
....
foreach (Job j in ChildJobs)
{
PSRemotingChildJob child = j as PSRemotingChildJob;
if (j == null) continue;
if (String.Equals(child.Runspace
.ConnectionInfo
.ComputerName,
computerName,
StringComparison.OrdinalIgnoreCase))
{
returnJobList.Add(child);
}
}
return returnJobList;
}
PVS-Studio warning: V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'j', 'child'. System.Management.Automation Job.cs 1876
The result of casting j to the PSRemotingChildJob type is written to the child reference, which means that this reference may be assigned with the null value (if the original reference is null or if the cast failed). The programmer, however, checks the original reference, j, and then attempts to access the Runspace property of the child object. So, if j != null and child == null, the j == null check won't help and you'll get a NullReferenceException when accessing the instance members of the resulting reference.
Two more defects of this type:
Incorrect operation order
private void CopyFileFromRemoteSession(....)
{
....
ArrayList remoteFileStreams =
GetRemoteSourceAlternateStreams(ps, sourceFileFullName);
if ((remoteFileStreams.Count > 0) && (remoteFileStreams != null))
....
}
PVS-Studio warning: V3027 The variable 'remoteFileStreams' was utilized in the logical expression before it was verified against null in the same logical expression. System.Management.Automation FileSystemProvider.cs 4126
If you are lucky, the code will execute successfully; if not, you'll get a NullReferenceException when attempting to dereference a null reference. The remoteFileStreams != null subexpression doesn't actually do anything, nor does it protect the code from the exception. Obviously, you need to swap the subexpressions to make the code work properly.
Well, we are all humans, and we all make mistakes, and static analyzers are the tools whose purpose is to catch our mistakes.
Potential null dereference
internal bool SafeForExport()
{
return DisplayEntry.SafeForExport() &&
ItemSelectionCondition == null
|| ItemSelectionCondition.SafeForExport();
}
PVS-Studio warning: V3080 Possible null dereference. Consider inspecting 'ItemSelectionCondition'. System.Management.Automation displayDescriptionData_List.cs 352
There is a risk of getting a NullReferenceException when executing this code. The ItemSelectionCondition.SafeForExport() subexpression will be evaluated only if the first subexpression evaluates to false. Therefore, if DisplayEntry.SafeForExport() returns false and ItemSelectionCondition == null, the second subexpression, ItemSelectionCondition.SafeForExport(), will be evaluated, and that's where the null dereference will occur (and raise the exception).
I found another similar code fragment in this project. The corresponding message: V3080 Possible null dereference. Consider inspecting 'EntrySelectedBy'. System.Management.Automation displayDescriptionData_Wide.cs 247
Another example.
internal Collection<ProviderInfo> GetProvider(
PSSnapinQualifiedName providerName)
{
....
if (providerName == null)
{
ProviderNotFoundException e =
new ProviderNotFoundException(
providerName.ToString(),
SessionStateCategory.CmdletProvider,
"ProviderNotFound",
SessionStateStrings.ProviderNotFound);
throw e;
}
....
}
PVS-Studio warning: V3080 Possible null dereference. Consider inspecting 'providerName'. System.Management.Automation SessionStateProviderAPIs.cs 1004
Every now and then, you stumble upon code like that. The programmer intended an exception to be of one type, but it ended up being of another type. Why does it happen? In our example, the programmer tests the providerName reference for null, but later, when forming an exception object, they call to the instance method ToString of the same reference. It will result in forming a NullReferenceException instead of the intended ProviderNotFoundException.
There was another similar fragment: V3080 Possible null dereference. Consider inspecting 'job'. System.Management.Automation PowerShellETWTracer.cs 1088
Using a reference before testing it for null
internal ComplexViewEntry GenerateView(....)
{
_complexSpecificParameters =
(ComplexSpecificParameters)inputParameters.shapeParameters;
int maxDepth = _complexSpecificParameters.maxDepth;
....
if (inputParameters != null)
mshParameterList = inputParameters.mshParameterList;
....
}
PVS-Studio warning: V3095 The 'inputParameters' object was used before it was verified against null. Check lines: 430, 436. System.Management.Automation FormatViewGenerator_Complex.cs 430
The inputParameters != null check implies that the reference being checked may be null. The programmer wanted to play safe to make sure they wouldn't get a NullReferenceException when accessing the mshParameterList field. This is a right decision, except that they already accessed another instance field of the same object, shapeParameters, earlier. Since inputParameters doesn't change between these two operations, the null check won't help if the reference has been null from the beginning.
Another similar case:
public CommandMetadata(CommandMetadata other)
{
....
_parameters = new Dictionary<string, ParameterMetadata>(
other.Parameters.Count, StringComparer.OrdinalIgnoreCase);
// deep copy
if (other.Parameters != null)
....
}
PVS-Studio warning: V3095 The 'other.Parameters' object was used before it was verified against null. Check lines: 189, 192. System.Management.Automation CommandMetadata.cs 189
The programmer is testing the Parameters property of the other object for null, but they already accessed the instance property Count a couple of lines earlier. Something is obviously wrong here.
Unused constructor parameter
It's nice to see new diagnostic rules show real results right after we add them to the tool. V3117 is one such diagnostic.
private void PopulateProperties(
Exception exception,
object targetObject,
string fullyQualifiedErrorId,
ErrorCategory errorCategory,
string errorCategory_Activity,
string errorCategory_Reason,
string errorCategory_TargetName,
string errorCategory_TargetType,
string errorCategory_Message,
string errorDetails_Message,
string errorDetails_RecommendedAction,
string errorDetails_ScriptStackTrace)
{ .... }
internal ErrorRecord(
Exception exception,
object targetObject,
string fullyQualifiedErrorId,
ErrorCategory errorCategory,
string errorCategory_Activity,
string errorCategory_Reason,
string errorCategory_TargetName,
string errorCategory_TargetType,
string errorCategory_Message,
string errorDetails_Message,
string errorDetails_RecommendedAction)
{
PopulateProperties(
exception, targetObject, fullyQualifiedErrorId,
errorCategory, errorCategory_Activity,
errorCategory_Reason, errorCategory_TargetName,
errorCategory_TargetType, errorDetails_Message,
errorDetails_Message, errorDetails_RecommendedAction,
null);
}
PVS-Studio warning: V3117 Constructor parameter 'errorCategory_Message' is not used. System.Management.Automation ErrorPackage.cs 1125
Method PopulateProperties is called in the ErrorRecord constructor to initialize the fields and perform some other operations. The analyzer warns us that one of the constructor's parameters, errorCategory_Message, is not used. Indeed, the errorDetails_Message argument is passed twice when calling to the PopulateProperties method, while errorCategory_Message is not passed at all. Checking out the parameter list of PopulateProperties confirms that we are dealing with an error.
An always false condition
One of PVS-Studio's features that help us implement complex diagnostic rules and find complicated bugs is the so called virtual values, which allow the analyzer to track the possible ranges of values that a variable can take at a particular time of execution. For more information on that feature, see the article Searching for errors by means of virtual values evaluation. This mechanism underlies such diagnostics as V3022 and V3063, which often help us discover interesting errors. One such error was found in this project too:
public enum RunspacePoolState
{
BeforeOpen = 0,
Opening = 1,
Opened = 2,
Closed = 3,
Closing = 4,
Broken = 5,
Disconnecting = 6,
Disconnected = 7,
Connecting = 8,
}
internal virtual int GetAvailableRunspaces()
{
....
if (stateInfo.State == RunspacePoolState.Opened)
{
....
return (pool.Count + unUsedCapacity);
}
else if (stateInfo.State != RunspacePoolState.BeforeOpen &&
stateInfo.State != RunspacePoolState.Opening)
{
throw new InvalidOperationException(
HostInterfaceExceptionsStrings.RunspacePoolNotOpened);
}
else if (stateInfo.State == RunspacePoolState.Disconnected)
{
throw new InvalidOperationException(
RunspacePoolStrings.CannotWhileDisconnected);
}
else
{
return maxPoolSz;
}
....
}
PVS-Studio warning: V3022 Expression 'stateInfo.State == RunspacePoolState.Disconnected' is always false. System.Management.Automation RunspacePoolInternal.cs 581
The analyzer insists that the stateInfo.State == RunspacePoolState.Disconnected expression is always false. Is it really so? Sure! I wouldn't cite this example if it were otherwise.
The programmer made a mistake in the preceding condition: if stateInfo.State == RunspacePoolState.Disconnected, then the previous if statement will execute all the time. To fix the error, you just need to swap the last two if (else if) statements.
Yes, there are lots of other suspicious fragments. Our regular readers know that we don't usually discuss all of the errors found. As for this project, there are probably not so many bugs left to make this article as big as the one about the check of Mono, but there is still some material that could be included. It's the project authors who ought to be most interested in a complete list of warnings; to all the rest, I just show the most interesting errors and defects.
Oddly enough, people still ask us this question from time to time. We always inform the developers about the bugs we find, but this time I decided to go a little further.
I talked to one of the developers (Sergey, hi!) personally via Gitter. The advantages of such a solution are obvious - we may discuss the bugs found, get feedback on the analyzer, there might be something to correct in the article. It's great when people understand the usefulness of static analysis. The developers told us that the detected code fragments are bugs indeed, thanked a lot and said that they would fix the bugs over the time. In turn, I decided to help them by giving links to these code fragments in the repository. We also had a talk about the use of the analyzer. It's great, when people understand that static analysis should be used regularly. I hope it will be so, and the analyzer will embedded into the development process.
It was a nice mutually beneficial cooperation.
As I'd expected, the analyzer managed to find quite a lot of suspicious fragments in PowerShell. The point of this article, however, is not about people writing incorrect code or lacking skill (it does happen at times, of course, but obviously not in this case); it's just that it is the human error which is to blame. It's the essence of human - everyone makes mistakes. Static analysis tools are designed to make up for this flaw of ours by catching errors in program code. That's why regular use of such tools is the path to better code. A picture is worth a thousand words, so welcome to try PVS-Studio with your own code.