Recently, we started a new hobby that is also a way to spread the word about our static code analyzer PVS-Studio. We check open-source projects and release patches with fixes. Today I would like to speak about one interesting bug that I found in Entity Framework project.
I have already sent a patch to fix this error. But enough talking. The analyzer issued 2 warnings for one string:
This is actually not a rare case when the analyzer issues 2 or even 3 warnings for one line. The thing is that the incorrect code can be anomalous from several points of view at the same time.
Let us consider the code:
var memberInitExpression = (MemberInitExpression)obj;
....
for (var i = 0; i < memberInitExpression.Bindings.Count; i++)
{
var memberBinding = memberInitExpression.Bindings[i];
....
switch (memberBinding.BindingType)
{
case ....
case MemberBindingType.ListBinding:
var memberListBinding = (MemberListBinding)memberBinding;
for(var j=0; i < memberListBinding.Initializers.Count; i++)
{
hashCode += (hashCode * 397) ^
GetHashCode(memberListBinding.Initializers[j].Arguments);
}
break;
....
}
}
What's going on here? As we can see, we have 2 loops. In the first we see a counter i to iterate the list memberInitExpression.Bindings, in the second - a counter j to iterate the list memberListBinding.Initializers. But for some reason the second loop uses the counter from the first loop. It seemed very suspicious to me, so I decided to write a small unit test to check if it really is an error or just a tricky algorithm of the program.
The code of the unit test:
[ConditionalFact]
public void Compare_member_init_expressions_by_hash_code()
{
MethodInfo addMethod = typeof(List<string>).GetMethod("Add");
MemberListBinding bindingMessages = Expression.ListBind(
typeof(Node).GetProperty("Messages"),
Expression.ElementInit(addMethod, Expression.Constant(
"Greeting from PVS-Studio developers!"))
);
MemberListBinding bindingDescriptions = Expression.ListBind(
typeof(Node).GetProperty("Descriptions"),
Expression.ElementInit(addMethod, Expression.Constant(
"PVS-Studio is a static code analyzer for C, C++ and C#."))
);
Expression query1 = Expression.MemberInit(
Expression.New(typeof(Node)),
new List<MemberBinding>() {
bindingMessages // One member
}
);
Expression query2 = Expression.MemberInit(
Expression.New(typeof(Node)),
new List<MemberBinding>() {
bindingMessages, // Two members
bindingDescriptions
}
);
var comparer = new ExpressionEqualityComparer();
var key1Hash = comparer.GetHashCode(query1);
var key2Hash = comparer.GetHashCode(query2);
// The hash codes for both expressions
// were the same before my edit
Assert.NotEqual(key1Hash, key2Hash); // <=
}
My expectations were confirmed. It is a real error. The thing is that when comparing 2 expressions, only 2 first elements of the collections have always been compared, which led to incorrect results for different expressions with identical first elements. Taking into account that Entity Framework works with expressions very closely, and its main aim is transforming lambdas and Linq requests to the SQL requests, I think it should not be hard to guess what results could have such a serious bug.
According to Common Weakness Enumeration, the bug found can be classified as CWE-670 (Always-Incorrect Control Flow Implementation). It is not clear if this code weakness can be exploited as a vulnerability, but the bug is quite serious. This is a good demonstration that the PVS-Studio analyzer can be used to search for potential vulnerabilities. In fact, it has always been able to do this, we just haven't focused on this aspect of our analyzer. More details on this topic can be found in the article "PVS-Studio: searching security defects".