We have successfully created and continue developing PVS-Studio analyzer for C/C++ languages. Over the time, it became clear that many of the diagnostics that we have implemented are not related to a specific programming language, so we decided to apply our experience to another programming language, namely C#. In this article, we are talking about the analysis of Code Contracts project by Microsoft done by our C# analyzer.
Code Contracts provide a language-agnostic way to express coding assumptions in .NET programs. The contracts take the form of preconditions, postconditions, and object invariants. Contracts act as checked documentation of your external and internal APIs. The contracts are used to improve testing via runtime checking, enable static contract verification, and documentation generation.
It is a medium-sized project (~ 4000 source files), which is actively developing: it contains quite a number of code fragments that are not finished and sometimes incorrectly written. This stage is perfect for implementing a static code analyzer.
We are planning to release the first version of PVS-Studio with C# support on the 22.12.2015. The release number is going to change to 6.0.
The pricing policy will remain unchanged. Previously, PVS-Studio allowed to run the analysis of programs written in C, C++, C++/CLI, C++/CX. Now we have added C# to this list.
Preparing an article on an open source project check, we report only about a certain number of all of the warnings issued by the analyzer, therefore we recommend the authors of the project to run the analyzer on their code themselves and study the complete analysis results.
V3025 Incorrect format. A different number of actual arguments is expected while calling 'Format' function. Expected: 3. Present: 2. VSServiceProvider.cs 515
void AskToReportError(Exception exn) {
....
var emailBody = new StringBuilder();
emailBody.AppendLine("Hi Code Contracts user,");
emailBody.AppendLine();
....
emailBody.AppendLine(
String.Format(".... {0} {1} Visual Studio {2} Bug Report",
typeof(VSServiceProvider).Assembly.GetName().Version,
#if DEBUG
"Debug"
#else
"Release"
#endif
));
....
}
String.Format() function expects 3 arguments, but only 2 arguments were passed. In this case we have FormatException.
V3014 It is likely that a wrong variable is being incremented inside the 'for' operator. Consider reviewing 'i'. SparseArray.cs 1956
override public string ToString()
{
StringBuilder str = new StringBuilder();
for (int i = 0; i < data.Length; i++)
{
if (data[i] != null)
{
for (int j = 0; j < lastElement[i]; i++) // <=
{
str.AppendFormat("({0},{1})", data[i][j].Index,
data[i][j].Value);
}
}
}
return str.ToString();
}
In a nested loop the counter variable 'j' is not changed, because we have modification of outer loop counter 'i++' instead of 'j++'
Couple more similar fragments:
V3003 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 203, 207. WeakestPreconditionProver.csToSMT2.cs 203
private BoxedExpression DeclareVariable(....)
{
var tmp = original.ToString().Replace(' ', '_');
this.Info.AddDeclaration(string.Format("....", tmp, type));
this.ResultValue = tmp;
if (type == FLOAT32) // <=
{
types[original] = FloatType.Float32;
}
else if (type == FLOAT64) // <=
{
types[original] = FloatType.Float64;
}
return original;
}
The analyzer detected two similar conditional expressions, because of which the operators in the second condition will never get control. Although, at first glance, it is not so, we'll move on to the definition of constants FLOAT32 and FLOAT64, and see the following code:
private const string FLOAT32 = "(_ FP 11 53)"; // To change!!!
private const string FLOAT64 = "(_ FP 11 53)";
The constants really are equal! Although we have a commentary here that the FLOAT32 constant value should be replaced, this spot is easy to skip in the future. In developing projects, it is important to tag places as TODO and to regularly review the results of static code analysis.
V3003 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 1200, 1210. OutputPrettyCS.cs 1200
public enum TypeConstraint
{
NONE,
CLASS, // <=
STRUCT, // <=
BASECLASS,
}
public void Output(OutputHelper oh)
{
Contract.Requires(oh != null);
oh.Output("where ", false);
mParent.OutputName(oh);
oh.Output(" : ", false);
//** base class
bool comma = false;
if (mTypeConstraint == TypeConstraint.CLASS) // <=???
{
oh.Output("class", false);
comma = true;
}
else if (mTypeConstraint == TypeConstraint.STRUCT)
{
oh.Output("struct", false);
comma = true;
}
else if (mTypeConstraint == TypeConstraint.CLASS) // <=???
{
oh.Output(mClassConstraint, false);
comma = true;
}
}
In this code fragment the same conditions are more obvious. Most likely in one of the conditions the programmer wanted to compare the 'mTypeConstraint' variable with a constant TypeConstraint.BASECLASS instead of TypeConstraint.CLASS.
V3022 Expression 'c > '\xFFFF'' is always false. Output.cs 685
private static string Encode(string s)
{
....
foreach( char c in s ) {
if (c == splitC || c == '\n' || c == '\\') {
specialCount++;
}
else if (c > '\x7F') {
if (c > '\xFFFF') specialCount += 9;
else specialCount += 5;
}
}
....
}
The expression "c > '\xFFFF'" will never be true and the "specialCount += 9" operator will never be executed. The 'c' variable has a Char type, the maximum value of which is "\xFFFF '. It's not really clear how this code should work and how it should be fixed. Perhaps we have a typo here or it is a code fragment, taken from an application written in a different language. For example, in C/C++ sometimes developers use 32-bit symbols. And "play around" with the 0xFFFF value. Example of such code:
/* putUTF8 -- write a character to stdout in UTF8 encoding */
static void putUTF8(long c)
{
if (c <= 0x7F) { /* Leave ASCII encoded */
printf("%ld;", c);
} else if (c <= 0x07FF) { /* 110xxxxx 10xxxxxx */
putchar(0xC0 | (c >> 6));
putchar(0x80 | (c & 0x3F));
} else if (c <= 0xFFFF) { /* 1110xxxx + 2 */
putchar(0xE0 | (c >> 12));
putchar(0x80 | ((c >> 6) & 0x3F));
putchar(0x80 | (c & 0x3F));
} else if (c <= 0x1FFFFF) { /* 11110xxx + 3 */
putchar(0xF0 | (c >> 18));
putchar(0x80 | ((c >> 12) & 0x3F));
putchar(0x80 | ((c >> 6) & 0x3F));
putchar(0x80 | (c & 0x3F));
} else if (c <= 0x3FFFFFF) { /* 111110xx + 4 */
putchar(0xF8 | (c >> 24));
putchar(0x80 | ((c >> 18) & 0x3F));
putchar(0x80 | ((c >> 12) & 0x3F));
putchar(0x80 | ((c >> 6) & 0x3F));
putchar(0x80 | (c & 0x3F));
} else if (c <= 0x7FFFFFFF) { /* 1111110x + 5 */
putchar(0xFC | (c >> 30));
putchar(0x80 | ((c >> 24) & 0x3F));
putchar(0x80 | ((c >> 18) & 0x3F));
putchar(0x80 | ((c >> 12) & 0x3F));
putchar(0x80 | ((c >> 6) & 0x3F));
putchar(0x80 | (c & 0x3F));
} else { /* Not a valid character... */
printf("%ld;", c);
}
}
V3008 The 'this.InsideMonitor' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 751, 749. AssertionCrawlerAnalysis.cs 751
private Data(Data state, Variable v)
{
this.IsReached = state.IsReached;
this.InsideMonitor = state.InsideMonitor; // <=
this.symbols = new List<Variable>(state.symbols) { v };
this.InsideMonitor = false; // <=???
}
It's very suspicious that some function changes the state of an object using the values passed through the function parameters and at the last moment replaces "this.InsideMonitor" field value with a 'false' constant. Previously the assignment "this.InsideMonitor = state.InsideMonitor" has already been executed.
V3009 It's odd that this method always returns one and the same value of 'true'. LinearEqualities.cs 5262
public bool TryGetFirstAvailableDimension(out int dim)
{
for (var i = 0; i < map.Length; i++)
{
if (!map[i])
{
dim = i;
return true;
}
}
map.Length++;
dim = map.Length;
return true;
}
The analyzer detected a function that always returns the same "true" value. We can assume that when the "!map[i]" condition is executed the function should return a certain kind of value, but if this condition has never been true, then it should return a different value. Perhaps, there is an error.
V3025 Incorrect format. A different number of actual arguments is expected while calling 'Format' function. Expected: 1. Present: 2. Output.cs 68
public override void WriteLine(string value)
{
output.WriteLine(string.Format("{1}", DateTime.Now,
value.Replace("{", "{{").Replace("}","}}")));
//output.WriteLine(string.Format("[{0}] {1}",
//DateTime.Now., value));
}
Earlier the String.Format() function received and printed 2 values: date and some value. Then this code was commented out and another variant was written, where the argument with the 0 index is not used, so the date is not printed.
Other examples of formatting functions calls with unused arguments:
V3004 The 'then' statement is equivalent to the 'else' statement. Metadata.cs 2602
private void SerializeFieldRvaTable(....)
{
....
switch (row.TargetSection){
case PESection.SData:
case PESection.TLS:
Fixup fixup = new Fixup();
fixup.fixupLocation = writer.BaseStream.Position;
fixup.addressOfNextInstruction = row.RVA;
if (row.TargetSection == PESection.SData){
sdataFixup.nextFixUp = fixup; // <=
sdataFixup = fixup; // <=
}else{
sdataFixup.nextFixUp = fixup; // <=
sdataFixup = fixup; // <=
}
writer.Write((int)0);
break;
....
}
The analyzer detected identical blocks of code in a conditional operator. This may be an unnecessary code fragment or code block was not changed after copying. Copy-Paste does not have mercy on C# programmers.
A full list of similar fragments:
V3001 There are identical sub-expressions 'semanticType.Name == null' to the left and to the right of the '||' operator. ContractsProvider.cs 694
public bool TryGetTypeReference(....)
{
....
if (semanticType.Name == null || semanticType.Name == null)
goto ReturnFalse;
cciType = new Microsoft.Cci.MutableCodeModel.NamespaceTypeR....
{
ContainingUnitNamespace = cciNamespace,
GenericParameterCount = (ushort) (....),
InternFactory = Host.InternFactory,
IsValueType = semanticType.IsValueType,
IsEnum = semanticType.TypeKind == TypeKind.Enum,
Name = Host.NameTable.GetNameFor(semanticType.Name),
TypeCode=CSharpToCCIHelper.GetPrimitiveTypeCode(semanticType)
};
goto ReturnTrue;'
....
}
The condition "semanticType.Name == null" is checked 2 times. There two options here - this check is either redundant and can be simplified; or another object field was not checked.
Another warning on this kind:
V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'other', 'right'. CallerInvariant.cs 189
public override Predicate JoinWith(Predicate other)
{
var right = other as PredicateNullness;
if (other != null)
{
if (this.value == right.value)
{
return this;
}
}
return PredicateTop.Value;
}
The analyzer detected a potential error, which can lead to access by null reference. It is necessary to compare the result of the 'as' operator execution with 'null'
If you encounter a situation, when the 'other' object is not null, but it's impossible to cast it to the 'PredicateNullness' type, then we have an access by null reference when getting the "right.value".
There is quite a number of such comparisons in the project. Here is the full list:
V3030 Recurring check. The 'this.lineOffsets == null' condition was already verified in line 612. Nodes.cs 613
public virtual void InsertOrDeleteLines(....)
{
....
if (this.lineOffsets == null)
if (this.lineOffsets == null) this.ComputeLineOffsets();
if (lineCount < 0)
this.DeleteLines(offset, -lineCount);
else
this.InsertLines(offset, lineCount);
....
}
Two identical "this.lineOffsets == null" checks, written one after another. This code has no sense. Probably the programmer intended to check something else.
V3002 The switch statement does not cover all values of the 'UnaryOperator' enum: Conv_dec. WeakestPreconditionProver.csToSMT2.cs 453
private string Combine(UnaryOperator unaryOperator, string arg)
{
Contract.Requires(arg != null);
var format = "({0} {1})";
string op = null;
switch (unaryOperator)
{
case UnaryOperator.Neg:
case UnaryOperator.Not:
case UnaryOperator.Not:
{
op = "not";
}
break;
case UnaryOperator.WritableBytes:
case UnaryOperator.Conv_i:
case UnaryOperator.Conv_i1:
case UnaryOperator.Conv_i2:
case UnaryOperator.Conv_i4:
case UnaryOperator.Conv_i8:
case UnaryOperator.Conv_r_un:
case UnaryOperator.Conv_r4:
case UnaryOperator.Conv_r8:
case UnaryOperator.Conv_u:
case UnaryOperator.Conv_u1:
case UnaryOperator.Conv_u2:
case UnaryOperator.Conv_u4:
case UnaryOperator.Conv_u8:
{
return null;
}
}
return string.Format(format, op, arg);
}
The analyzer detected a 'switch' operator, where the choice of the variant is made via the enum type variable. At the same time one element "UnaryOperator Conv_dec" was omitted in the 'switch' operator. It is very suspicious.
Below is the definition of "UnaryOperator" enumeration:
public enum UnaryOperator
{
....
Conv_u8,
Conv_r_un,
Neg,
Not,
WritableBytes,
Conv_dec, // <=
}
A possible error is that this function is implemented in such a way that it returns a formatted string for the "UnaryOperator.Not" value and in all other cases it returns 'null' value. But as the "UnaryOperator. Conv_dec" is missing, then the 'op' variable value is 'null' and it will get to the formatted string that the function will return.
We hope you enjoyed this article. In the future there will be more articles about the checks of the projects.
As it was mentioned before, PVS-Studio C# release is on the 22.12.2015. Usually at the end of the year, people make decisions about the future purchases. That's why to all who are interested we offer to contact us without hesitations about the purchase of PVS-Studio. We'll be very glad to see you among our customers.
Thank you for your attention. We wish you bugless code!