Welcome all fans of clean code! Today we analyze the PascalABC.NET project. In 2017, we already found errors in this project. We used two static analysis tools (more precisely, plugins for SonarQube): SonarC# and PVS-Studio. Today, we analyze this project with the latest version of the PVS-Studio analyzer for C#. Let's see what errors we can find today, especially when our analyzer has become more advanced and got new features: it can find more exquisite errors and potential vulnerabilities.
I have an interesting story about PascalABC.NET. Right after we published "Analysis of PascalABC.NET using SonarQube plugins: SonarC# and PVS-Studio", we accidentally crossed paths with the developers at one conference. It looked like we did it on purpose: wrote an article about errors found in their project and went to the conference to discuss those errors with the developers. Of course, we never planned that, it was a coincidence. But it was funny. After that I was considering the idea of re-checking the project, but I didn't have time for that. Now the time has come.
PascalABC.NET is a modern implementation of the Pascal language on .NET. You can visit the project's website to read the description and see that the project is developing. The latest version 3.8.1 was released in August 2021. Good news — it's no point in re-checking the "abandoned" project. This was an additional motivation to write this article. A developing project means that old errors are fixed, and the new ones appear.
For analysis, I took the source code from GitHub from 10.12.2021. Note that while I was writing the article the code may have changed. Please take this fact into account if you're going to check the source of PascalABC.NET yourself. By the way, you can easily request the PVS-Studio trial version. Don't forget about our new feature "Best Warnings" which immediately shows the most interesting errors. This is important when you work with such large projects.
Unfortunately, many errors found in 2017 were never fixed. After publishing an article, we always send bug reports to the developers. However, only developers can fix those errors. This was an additional problem, since I had to exclude old errors from the report. Despite this, when re-checking the project, we managed to find some new and interesting errors. You can see them below.
Let's start with classic — copy-paste errors. Unbelievable, but developers make such errors over and over again. This means PVS-Studio will definitely have a job to do. Besides, such errors show an important advantage of static analysis tools: constant attention to detail. People don't always have it due to fatigue and other reasons.
V3001 There are identical sub-expressions to the left and to the right of the '||' operator. NETGenerator.cs 461
public class CompilerOptions
{
public enum PlatformTarget { x64, x86, AnyCPU,
dotnet5win, dotnet5linux, dotnet5macos };
....
}
....
bool IsDotnet5()
{
return
comp_opt.platformtarget ==
CompilerOptions.PlatformTarget.dotnet5win ||
comp_opt.platformtarget ==
CompilerOptions.PlatformTarget.dotnet5linux ||
comp_opt.platformtarget ==
CompilerOptions.PlatformTarget.dotnet5linux;
}
In this code fragment the developer re-compares the IsDotnet5() method with the value of enumeration CompilerOptions.PlatformTarget.dotnet5linux. If we look at the declaration of the PlatformTarget enumeration, we can assume that the code should look like this:
bool IsDotnet5()
{
return
comp_opt.platformtarget ==
CompilerOptions.PlatformTarget.dotnet5win ||
comp_opt.platformtarget ==
CompilerOptions.PlatformTarget.dotnet5linux ||
comp_opt.platformtarget ==
CompilerOptions.PlatformTarget.dotnet5macos;
}
Note that the code was formatted for readability. In the original version the entire return expression is written in one line.
V3001 There are identical sub-expressions 'ctn2.compiled_type == TypeFactory.ObjectType' to the left and to the right of the '||' operator. NETGenerator.cs 8518
private void AssignToDereferenceNode(....)
{
....
if (.... && (ctn2.compiled_type == TypeFactory.ObjectType ||
(ctn2.compiled_type == TypeFactory.ObjectType ||
ctn2.compiled_type.IsInterface)))
....
}
Here the developer compares the same value with the TypeFactory.ObjectType value. The code was formatted once again. In the original version the if expression was written in one line. I think it's quite difficult for a person to notice problems in such code. It's hard to say how to fix this error, since the TypeFactory class has a lot of fields.
V3001 There are identical sub-expressions 'SK == SymKind.field' to the left and to the right of the '||' operator. LightScopeHelperClasses.cs 30
public enum SymKind { var, field, param, procname, funcname,
classname, recordname, interfacename };
....
public class SymInfoSyntax
{
public override string ToString()
{
....
if (SK == SymKind.var ||
SK == SymKind.field ||
SK == SymKind.field ||
SK == SymKind.param)
....
}
....
}
One of the comparisons SK == SymKind.field has a mistake in it. It should contain a different value of the SymKind enumeration. Maybe the developer who wrote this code fragment could explain what's going on.
V3004 [CWE-691] The 'then' statement is equivalent to the 'else' statement. SymbolTable.cs 870
private Scope FindClassScope(Scope scope)
{
while (scope != null && !(scope is ClassScope))
if(scope is ClassMethodScope)
scope = scope.TopScope;
else
scope = scope.TopScope;
return scope;
}
Different error pattern, same copy-paste: both code blocks of the if operator are identical. Here we also need the developer to inspect and fix this error.
V3005 The 'e' variable is assigned to itself. generics.cs 430
public static type_node determine_type(....)
{
....
try
{
return ....;
}
catch(Exception e)
{
e = e;
}
....
}
A bit weird code. It could be a copy-paste error, as well as an attempt to suppress a warning about an unused variable. Or it could be the consequence of refactoring. Perhaps earlier there was some external e variable relative to the catch block, and it was then deleted. Anyway, the code looks sloppy.
Besides copy-paste errors, I found other problems in the PascalABC.NET code.
V3022 [CWE-570] Expression 't != null' is always false. Visitor.cs 598
public void prepare_collection(....)
{
myTreeNode t;
....
if (t == null)
{
....
if (t != null)
t.Nodes.Add(tn);
else
nodes.Add(tn);
....
}
....
}
Did this happen after refactoring? Was the developer overly cautious or simply inattentive? As a result, the then branch of t.Nodes.Add(tn) in the if block is never executed. The code needs to be fixed.
V3027 [CWE-476] The variable 'fn.return_value_type' was utilized in the logical expression before it was verified against null in the same logical expression. NetHelper.cs 1109
private static function_node get_conversion(....)
{
....
function_node fn = si.sym_info as function_node;
if (.... || fn.return_value_type.original_generic == to || ....
&& fn.return_value_type != null && ....)
{
return fn;
}
....
}
The fn.return_value_type variable is dereferenced without null check. The author supposed that the variable could be null because it's checked explicitly.
V3032 [CWE-835] Waiting on this expression is unreliable, as compiler may optimize some of the variables. Use volatile variable(s) or synchronization primitives to avoid this. RemoteCompiler.cs 407
CompilerState compilerState = CompilerState.Reloading;
....
public string Compile()
{
....
compilerState = CompilerState.CompilationStarting;
....
while (compilerState != CompilerState.Ready)
Thread.Sleep(5);
....
}
An interesting error related to the compiler features. The problem may show itself in the release version: due to optimizations the while loop will be infinite. The peculiarities of this error and the fixing options are described in the V3032 documentation.
V3043 [CWE-483] The code's operational logic does not correspond with its formatting. The statement is indented to the right, but it is always executed. It is possible that curly brackets are missing. Compiler.cs 2196
public string Compile()
{
....
int n = 1;
try
{
n = 2;
....
if (File.Exists(pdb_file_name))
File.Delete(pdb_file_name);
n = 5;
....
}
....
}
It may seem that expression n = 5 relates to the if block, but it's not. The code was badly formatted. This warning is just an example. A rare mistake that doesn't lead to error in this case. But this isn't always like that. There is a section on our website with a list of errors found in projects. This list has errors found with V3043 among many others. One of the V3043 errors listed there is from the PascalABC.NET project. I described it when I first checked the project in 2017. This error is similar to other errors, but it's more dangerous. You can click on the link and look at this error. Just scroll down a little to get to PascalABC.NET.
Before proceeding to the next error, I suggest you look at the code fragment and find an error yourself:
public static typed_expression
GetTempFunctionNodeForTypeInference(....)
{
....
for (int i = 0; i < def.formal_parameters.params_list.Count; i++)
{
....
for (int j = 0;
j < def.formal_parameters.params_list[i].idents.idents.Count;
j++)
{
var new_param = new common_parameter(....,
visitor.get_location(
def.formal_parameters.params_list[i].idents.idents[0]));
....
}
}
....
}
Have you found it? To be honest, even with the analyzer warning I didn't immediately understand the problem. And yes, the code was formatted for readability. The original version was less readable. Here's the analyzer's warning: V3102 Suspicious access to element of 'def.formal_parameters.params_list[i].idents.idents' object by a constant index inside a loop. LambdaHelper.cs 402
Look closely at the calculation of the new_param variable's value. All iterations of the nested loop use access to the zero element of list def.formal_parameters.params_list[i].idents.idents[0]. Everything points out that the j index should have been used instead of 0.
Below is the last error I wanted to show you.
V3146 [CWE-476] Possible null dereference. The 'symbolInfo.FirstOrDefault()' can return default null value. SystemLibInitializer.cs 112
public class SymbolInfo
{
....
}
....
List<TreeConverter.SymbolInfo> symbolInfo = null;
....
public List<TreeConverter.SymbolInfo> SymbolInfo
{
get
{
if (symbolInfo != null && ....)
{
if (symbolInfo.FirstOrDefault().sym_info is common_type_node)
....
}
}
}
Look at the condition of the second if block. The symbolInfo reference was checked for null earlier, no questions here. However, the developers forgot that the FirstOrDefault() method may return the default value (null) for the SymbolInfo type if the symbolInfo list does not contain any element. This will cause problems when we access the sym_info property by a null reference.
This is a small article. But this doesn't mean that PascalABC.NET has few errors. I described the most of those errors in 2017, but the developers never fixed them. After the last check the analyzer issued 400 warnings on the High level. On the Medium level — 1364 warnings. There are many same-type errors among them, so I see no point in describing them. The readers can see it themselves if they decide to check the PascalABC.NET project with PVS-Studio and search for errors I described in this and previous articles.
In fact, late bug fixes in open-source code are a common problem. My teammate Andrey Karpov even wrote an article about that: "1000 eyes that don't want to check open-source code".
I must also note that during the analysis I realized that the use of the analyzer can be inefficient and inconvenient from time to time. Indeed, it's really difficult to search for real errors among thousands of warnings. Besides, old errors aren't fixed, and the analyzer warnings are not suppressed. I don't think developers would want to do such hard work. I understand them.
In our opinion, the point of the static analyzer is in regular checks. The code must be checked right after it was written. If the analyzer finds errors in the code, they must be immediately fixed.
Let me remind you that modern static analyzers, including PVS-Studio, have a lot of opportunities for convenient work with large projects. Especially at the implementation stage with a large codebase. In this case, we recommend using suppression of all old warnings and working only with those issued for the new code (incremental analysis). Old errors can be corrected little by little, and they won't be displayed in the analyzer report. You can read about these features in articles "Baselining analysis results (suppressing warnings for existing code)" and "Incremental analysis mode in PVS-Studio".
Now I finish this article and wish you all clean code. Good luck.