Some believe that experienced developers do not make silly errors. Comparison errors? Dereferencing null references? Bet you think: "No, it's definitely not about me..." ;) By the way, what about errors with sorting? As the title suggests, there are some nuances.
Let me give you an example to describe the problem. Let's say we have some type (Wrapper) with two integer properties (Primary and Secondary). There's an array of instances of this type. We need to sort it in ascending order. First — by the primary key, then — by the secondary key.
Here's the code:
class Wrapper
{
public int Primary { get; init; }
public int Secondary { get; init; }
}
var arr = new Wrapper[]
{
new() { Primary = 1, Secondary = 2 },
new() { Primary = 0, Secondary = 1 },
new() { Primary = 2, Secondary = 1 },
new() { Primary = 2, Secondary = 0 },
new() { Primary = 0, Secondary = 2 },
new() { Primary = 0, Secondary = 3 },
};
var sorted = arr.OrderBy(p => p.Primary)
.OrderBy(p => p.Secondary);
foreach (var wrapper in sorted)
{
Console.WriteLine($"Primary: {wrapper.Primary}
Secondary: {wrapper.Secondary}");
}
Unfortunately, the result of this code will be incorrect:
Primary: 2 Secondary: 0
Primary: 0 Secondary: 1
Primary: 2 Secondary: 1
Primary: 0 Secondary: 2
Primary: 1 Secondary: 2
Primary: 0 Secondary: 3
The sequence turned out to be sorted by the secondary key. But the sorting by primary key was not saved. If you've ever used multilevel sorting in C#, you can guess what the catch is.
The second OrderBy method call introduces a new primary ordering. This means that all the sequence will be sorted again.
But we need to fix the result of primary sorting. The secondary sorting should not reset it.
In this case the correct sequence of calls is OrderBy(...).ThenBy(...):
var sorted = arr.OrderBy(p => p.Primary)
.ThenBy(p => p.Secondary);
Then the code produces the expected result:
Primary: 0 Secondary: 1
Primary: 0 Secondary: 2
Primary: 0 Secondary: 3
Primary: 1 Secondary: 2
Primary: 2 Secondary: 0
Primary: 2 Secondary: 1
Microsoft has the documentation for the ThenBy method. There's a note about this: Because IOrderedEnumerable<TElement> inherits from IEnumerable<T>, you can call OrderBy or OrderByDescending on the results of a call to OrderBy, OrderByDescending, ThenBy or ThenByDescending. Doing this introduces a new primary ordering that ignores the previously established ordering.
Recently, I looked through C# projects on GitHub and chose some to check with PVS-Studio. The analyzer has the V3078 diagnostic concerning the possible misuse of OrderBy.
Want to know what I found? ;)
In Unity, the analyzer found 2 similar code fragments.
The first fragment
private List<T> GetChildrenRecursively(bool sorted = false,
List<T> result = null)
{
if (result == null)
result = new List<T>();
if (m_Children.Any())
{
var children
= sorted ?
(IEnumerable<MenuItemsTree<T>>)m_Children.OrderBy(c => c.key)
.OrderBy(c => c.m_Priority)
: m_Children;
....
}
....
}
Perhaps, the developers wanted to sort the m_Children collection first by key (c.key), then by priority (c.priority). But sorting by priority will be performed on the entire collection. Sorting by key will not be fixed. Is this an error? Here we need to ask the developers.
The second fragment
static class SelectorManager
{
public static List<SearchSelector> selectors { get; private set; }
....
internal static void RefreshSelectors()
{
....
selectors
= ReflectionUtils.LoadAllMethodsWithAttribute(
generator,
supportedSignatures,
ReflectionUtils.AttributeLoaderBehavior.DoNotThrowOnValidation)
.Where(s => s.valid)
.OrderBy(s => s.priority)
.OrderBy(s => string.IsNullOrEmpty(s.provider))
.ToList();
}
}
The sorting results in the following order:
Perhaps, there is no error here. However, agree that the sequence of the OrderBy().ThenBy() calls is easier to read.
.OrderBy(s => string.IsNullOrEmpty(s.provider))
.ThenBy(s => s.priority)
I reported both issues via Unity Bug Reporter. After this, Unity QA Team opened 2 issues.
Issues don't contain any comments yet. So, we are still waiting for any updates.
PVS-Studio found 3 places in ASP.NET Core with duplicated OrderBy calls. All were detected in the KnownHeaders.cs file.
The first issue
RequestHeaders = commonHeaders.Concat(new[]
{
HeaderNames.Authority,
HeaderNames.Method,
....
}
.Concat(corsRequestHeaders)
.OrderBy(header => header)
.OrderBy(header => !requestPrimaryHeaders.Contains(header))
....
The second issue
ResponseHeaders = commonHeaders.Concat(new[]
{
HeaderNames.AcceptRanges,
HeaderNames.Age,
....
})
.Concat(corsResponseHeaders)
.OrderBy(header => header)
.OrderBy(header => !responsePrimaryHeaders.Contains(header))
....
The third issue
ResponseTrailers = new[]
{
HeaderNames.ETag,
HeaderNames.GrpcMessage,
HeaderNames.GrpcStatus
}
.OrderBy(header => header)
.OrderBy(header => !responsePrimaryHeaders.Contains(header))
....
The error pattern is the same, only the used variables are different. To report these issues, I created a new issue on the project page.
Developers answered that duplicated OrderBy calls aren't bugs. Nevertheless, they've fixed the code. You can find a commit here.
In any case, I think that you should not write code in such a way. Duplicated OrderBy calls look very suspicious.
private Dictionary<MethodBase, int?> mBootEntries;
private void LoadBootEntries()
{
....
mBootEntries = mBootEntries.OrderBy(e => e.Value)
.OrderByDescending(e => e.Value.HasValue)
.ToDictionary(e => e.Key, e => e.Value);
....
}
Here we're dealing with a strange sorting by the fields of the int? type. I also created an issue for this. In this case, the secondary sorting turned out to be redundant. That's why the developers deleted the OrderByDescending call. You can find the commit here.
public IEnumerable<IMigration> GetCurrentMigrations()
{
var currentDbVersion = new DbVersion(int.Parse(GrandVersion.MajorVersion),
int.Parse(GrandVersion.MinorVersion));
return GetAllMigrations()
.Where(x => currentDbVersion.CompareTo(x.Version) >= 0)
.OrderBy(mg => mg.Version.ToString())
.OrderBy(mg => mg.Priority)
.ToList();
}
Perhaps, the developers wanted to perform sorting first by version, then — by priority.
As with the previous issues, I informed the developers. They fixed this by replacing the second OrderBy call with ThenBy:
.OrderBy(mg => mg.Version.ToString())
.ThenBy(mg => mg.Priority)
You can find the fix here.
The sequence of OrderBy().OrderBy() calls may not be an error. But such code provokes questions. Is it correct? What if OrderBy().ThenBy() should be used here?
How can developers make such errors?
Perhaps, it is a human reliability. We know that developers tend to make errors in comparison functions. Also, there's the last line effect. Moreover, copy-paste often provokes errors. Perhaps the multiple OrderBy call is another manifestation of human reliability.
Anyway, be careful with this. :)
Following a good tradition, I invite you to follow me on Twitter so as not to miss interesting publications.
Finally, please tell me: have you encountered a similar pattern?