Webinar: Evaluation - 05.12
Entity Framework Core is developed by professionals who avoid mistakes thanks to their experience, code reviews, and a powerful testing system. However, even such a project is not without its gags. You are about to read an article on the bizarre snippets of code that top developers have overlooked.
Usually we write articles in which we analyze different bugs that we have found in open-source projects with the help of PVS-Studio. However, this is not such a case. The code in the project is really well written :). So, when I said about developer's professionalism, it wasn't sarcasm.
This is a case where I've actually checked the project, but I can't write a classic article about analyzing bugs. There are some curious anomalies that I'll tell you about. Overall, however, the code is of high quality and looks very clean. The analyzer issues almost no warnings that indicate obvious issues.
I used the PVS-Studio 7.26 analyzer to search for oddities and potential errors. The version of Entity Framework Core is 7.0.10. The top is entirely subjective – if you disagree, feel free to comment, and we'll discuss.
protected virtual ModelBuilder VisitDatabaseModel(....)
{
....
if (!string.IsNullOrEmpty(databaseModel.DatabaseName)) // <=
{
modelBuilder.Model.SetDatabaseName(
!_options.UseDatabaseNames
&& !string.IsNullOrEmpty(databaseModel.DatabaseName) // <=
? _candidateNamingService.GenerateCandidateIdentifier(....)
: databaseModel.DatabaseName
);
}
....
}
The first issue from the top immediately reflects the idea behind most of the weirdness I found in Entity Framework Core: there is a surprising amount of inconsistency in the code, especially between fragments that are literally right next to each other.
First, the code is checked to see whether databaseModel.databaseName is an empty string or null. If there is a perfectly normal string value, the code decides to check the property value again. Why? What was the developer trying to do?
The strangest thing is that these checks are only 3 lines apart! How's that?
In the example I gave, the gap is 4 lines (oh yes, that makes a big difference :D). However, this is only because the code had to be slightly formatted for the article. Originally, the !_options.UseDatabaseNames calls and the extra DatabaseName check are written in the same line.
I found this fragment with the help of the following analyzer warning:
V3063 A part of conditional expression is always true if it is evaluated: !string.IsNullOrEmpty(databaseModel.DatabaseName).
protected virtual IEnumerable<IReadOnlyModificationCommand>
GenerateModificationCommands(InsertDataOperation operation,
IModel? model)
{
....
if ( operation.ColumnTypes != null
&& operation.Columns.Length != operation.ColumnTypes.Length)
{
throw new InvalidOperationException(
RelationalStrings.InsertDataOperationTypesCountMismatch(
....,
FormatTable(operation.Table,
operation.Schema ?? model?.GetDefaultSchema())));
}
if (operation.ColumnTypes == null && model == null)
{
throw new InvalidOperationException(
RelationalStrings.InsertDataOperationNoModel(
FormatTable(operation.Table,
operation.Schema ?? model?.GetDefaultSchema())));
}
....
}
Another great example of inconsistency and a lack of thought when writing the code. Indeed, who cares what the exception is even about when you can just copy and paste :). The check makes sure that model is null, then we throw an exception that there is no model. Why not refer to the model when making the exception? :)
It's not that referring to the model will cause an exception to be thrown here. The point is that the programmer (I guess so) wrote this code without thinking about its meaning. Considering that this is one of the most well-known and widely-used .NET libraries, such an approach to development is troubling.
Another theory is that there may be an error in the check and '||' should have been used instead of '&&'. Although, that's just speculation.
I found the oddity described here thanks to the V3022 diagnostic rule:
Expression 'model' is always null. The operator '?.' is meaningless.
public virtual SqlExpression? Translate(....)
{
if (SupportedMethods.TryGetValue(method, out var sqlFunctionName))
{
RelationalTypeMapping? typeMapping;
List<SqlExpression>? newArguments = null;
if (sqlFunctionName == "max" || sqlFunctionName == "max") // <=
{
typeMapping = ExpressionExtensions.InferTypeMapping(arguments[0],
arguments[1]);
newArguments = new List<SqlExpression>
{
....
};
}
else
{
typeMapping = arguments[0].TypeMapping;
}
var finalArguments = newArguments ?? arguments;
return _sqlExpressionFactory.Function(....);
}
return null;
}
This is where things can get dangerous. It looks like some special handling is implemented for the max function. Well, the developers apparently wanted to do something similar for another function. I think it might be for the min function, but it's hard to say for sure.
Unfortunately, the code looks like it's asking again: "Maybe the function name is max after all?" (if the first comparison returns false). Well, what if something has really changed? :)
PVS-Studio, which was used to search for this oddity, gives the following comment:
V3001 There are identical sub-expressions 'sqlFunctionName == "max"' to the left and to the right of the '||' operator.
public override Expression Visit(Expression expression)
{
....
if ( TryGetPartitionKeyProperty(entityType,
out var partitionKeyProperty)
&& entityTypePrimaryKeyProperties.SequenceEqual(queryProperties)
&& (partitionKeyProperty == null || ....)
&& ....)
{
var propertyParameterList = queryProperties.Zip(....);
var readItemExpression = new ReadItemExpression(entityType,
propertyParameterList);
return ....;
}
....
}
Take a closer look at the code above. Can you see anything weird there?
Well, I can't see it either :). It may not really be an error at all, but it's a curious case nonetheless.
The call to the TryGetPartitionKeyProperty function is strange here. Theoretically, it works in the usual Try-methods pattern: the function returns true if the operation was successful, otherwise it returns false.
If you thought the same thing I did, congratulations — the code fooled you. The TryGetPartitionKeyProperty method looks as follows:
static bool TryGetPartitionKeyProperty(IEntityType entityType,
out IProperty partitionKeyProperty)
{
var partitionKeyPropertyName = entityType.GetPartitionKeyPropertyName();
if (partitionKeyPropertyName is null)
{
partitionKeyProperty = null;
return true;
}
partitionKeyProperty = entityType.FindProperty(partitionKeyPropertyName);
return true;
}
Well yes, functions written by successful developers always return success :).
The strange thing here is the following: if the value of partitionKeyPropertyName is not received, null is explicitly written to the out-parameter. The TryGetPartitionKeyProperty method is still considered successful (which sounds strange when you read it). There are two ways of looking at this point.
On the one hand, there is nothing to worry about. The code that calls the method is ready for null to be written to the out parameter:
if ( TryGetPartitionKeyProperty(entityType,
out var partitionKeyProperty)
&& entityTypePrimaryKeyProperties.SequenceEqual(queryProperties)
&& (partitionKeyProperty == null || ....)
&& ....)
The developers could rewrite the method, so that it would only return IProperty (without any Try, out-parameters, and also without returning a boolean value). Although it doesn't seem like a big deal.
On the other hand, it's not entirely clear what exactly this null check is doing here. For example, it could be added in case null returns a call to entityType.FindProperty(partitionKeyPropertyName).
I have a feeling that something is wrong here. It's very strange that the TryGetPartitionKeyProperty method returns true (as if saying: "Yes, we successfully got everything!") but at the same time sets the corresponding out-parameter to null.
I found this strange code snippet thanks to the PVS-Studio warning:
V3063 A part of conditional expression is always true if it is evaluated: TryGetPartitionKeyProperty(entityType, out var partitionKeyProperty).
protected override int Execute(string[] _)
{
....
var targetPlatformIdentifier = startupProject.TargetPlatformIdentifier!;
if ( targetPlatformIdentifier.Length != 0
&& !string.Equals(targetPlatformIdentifier, "Windows", ....))
{
executable = Path.Combine(
toolsPath,
"net461",
startupProject.PlatformTarget switch
{
"x86" => "win-x86",
"ARM64" => "win-arm64",
_ => "any"
},
"ef.exe");
}
executable = "dotnet";
args.Add("exec");
args.Add("--depsfile");
args.Add(depsFile);
....
return Exe.Run(executable, args, startupProject.ProjectDir);
}
I'd say this code fragment amused me more than the others. First, there is some serious value check, followed by the equally serious path building functionality that uses a complex Path.Combine call. And then all the results are simply wiped away :). All that's left is 'dotnet'.
I don't mind, but it looks weird. Even if there is no error here, we still have some big inconsistencies of logic. Looks like the developers inserted random lines into the code without looking at what is happening nearby. Perhaps this is the result of an unsuccessful branch merging.
I found this snippet thanks to a message from the V3008 diagnostic rule:
The 'executable' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 139, 127.
Despite all I've said, it's worth acknowledging that Entity Framework is indeed written by real professionals. Almost none of the presented warnings look like real bugs. And believe me, I tried very hard to find the best examples.
Of course, you could say that the analyzer just didn't do a good job here. However, it does find quite a few obvious issues in other projects. By the way, if you're interested in giving this tool a try, click the link to get it for free.
Well, that's it for me. I hope you've enjoyed reading this :).
0