
Does C# always have boxing with string concatenation and interpolation?

Aug 03 2023

C# developers are familiar with the term "boxing". It can be obvious, or it can go unnoticed. For example, adding a value type to a string can cause boxing. Or not. Call it "Schrödinger's boxing". In this article, we try to resolve that uncertainty.


How we faced it

This topic did not appear out of thin air. I am a C# developer working on the PVS-Studio static code analyzer. In 2023, we chose diagnostic rules for Unity Engine projects as one of our development directions. In particular, we decided to implement diagnostics that point out optimization opportunities.

We started with the diagnostic rule V4001. This rule finds code that is executed frequently and points to boxing inside it. Boxing is computationally expensive compared to passing by reference or by value. That's why we decided to search for the cases where it occurs.

One of the cases we considered was boxing when a string is concatenated with a value of a value type:

string Foo(int a)
{
  return "The value is " + a;
}

It seems that boxing always occurs here. But as we dug deeper, we realized that it was not that obvious.

Where does boxing with concatenation even come from?

Boxing is performed when a value of a value type is converted to the Object type or to an interface type implemented by this value type. This conversion can be explicit or implicit. An explicit conversion is a direct type cast:

var boxedInt = (object)1;

An implicit conversion occurs when a variable of a value type is used where a reference is expected (either a reference of the Object type or a reference to an interface implemented by this value type):

bool Foo(object obj, int number)
{
  return obj.Equals(number);
}

The Equals method expects an argument of the Object type, so the value of number is boxed when it is passed.
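To make the two kinds of conversion more tangible, here is a minimal sketch. The class and method names (BoxingDemo, BoxExplicitly, BoxViaInterface) are made up for illustration. It shows that a box holds its own copy of the value and that each boxing operation produces a separate object.

```csharp
using System;

public static class BoxingDemo
{
    // Explicit boxing: the cast copies the int into a new heap object.
    public static object BoxExplicitly(int value) => (object)value;

    // Implicit boxing: IComparable is an interface implemented by int,
    // so returning the value as IComparable boxes it as well.
    public static IComparable BoxViaInterface(int value) => value;

    public static void Main()
    {
        int number = 42;
        object boxed = BoxExplicitly(number);

        number = 100; // changes the local only; the box keeps its own copy
        Console.WriteLine((int)boxed); // prints 42

        // Each boxing operation creates a separate object.
        object first = BoxExplicitly(7);
        object second = BoxExplicitly(7);
        Console.WriteLine(object.ReferenceEquals(first, second)); // prints False
        Console.WriteLine(first.Equals(second));                  // prints True

        IComparable comparable = BoxViaInterface(7);
        Console.WriteLine(comparable.CompareTo(7)); // prints 0
    }
}
```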

What happens with the concatenation? Visual Studio can answer this question:

[Image: Visual Studio tooltip showing the signature of the + operator]

The operator receives Object as the right operand. This means that the value of a will be boxed. At least, it seems that way.

The truth is in IL code

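Of course, we can't take the IDE's hints at face value. Let's take a look at the IL code generated for this kind of concatenation (here, for a method that adds an int parameter a to a string parameter str):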

.method private hidebysig static void  Foo(string str,
                                           int32 a) cil managed
{
  ....
  IL_0001:  ldarg.0
  IL_0002:  ldarg.1
  IL_0003:  box      [mscorlib]System.Int32
  IL_0008:  call     string [mscorlib]System.String::Concat(object,
                                                            object)
  IL_000d:  stloc.0
  IL_000e:  ret
}

To make the generated IL code easier to review, I have slightly shortened it. The main thing to see is the box instruction. It indicates the boxing of the a variable. You may also notice that the called String.Concat takes 2 references of the Object type (not String and Object as you might think). Anyway, the fact that boxing occurs is undeniable.

All of the above seems logical. Despite this, boxing is not always performed for such concatenation.

But how could that be? We saw the box instruction in the IL code! Isn't that boxing? Let's take another look at the result of the compilation.

.method private hidebysig static void  Foo(string str,
                                           int32 a) cil managed
{
  ....
  IL_0001:  ldarg.0
  IL_0002:  ldarga.s   a
  IL_0004:  call       instance string [mscorlib]System.Int32::ToString()
  IL_0009:  call       string [mscorlib]System.String::Concat(string,
                                                              string)
  IL_000e:  stloc.0
  IL_000f:  ret
}

As I mentioned earlier, there is no boxing here :).

Okay, okay, attentive (and not so attentive) readers have probably noticed that the IL code differs significantly between these cases. Indeed, in the previous example, there was boxing and a call to String.Concat(object, object). In this case, the ToString method is called on the numeric variable. After that, it's quite logical that String.Concat(string, string) is used to concatenate the two strings.

However, it's important to mention that the source code for both examples is the same.
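Written out by hand, the two IL listings correspond roughly to the two C# methods below. This is only a sketch: the names ConcatDemo, ConcatWithBoxing and ConcatWithoutBoxing are hypothetical, and the explicit casts to object are there to force the overload that older compilers pick on their own.

```csharp
using System;

public static class ConcatDemo
{
    // Roughly what the first listing does: the int is boxed and
    // String.Concat(object, object) is called.
    public static string ConcatWithBoxing(string str, int a)
        => string.Concat((object)str, (object)a);

    // Roughly what the second listing does: ToString is called on the int
    // and String.Concat(string, string) joins the two strings.
    public static string ConcatWithoutBoxing(string str, int a)
        => string.Concat(str, a.ToString());

    public static void Main()
    {
        Console.WriteLine(ConcatWithBoxing("The value is ", 5));    // prints The value is 5
        Console.WriteLine(ConcatWithoutBoxing("The value is ", 5)); // prints The value is 5
    }
}
```

Both methods return the same string. The only difference is whether a box is allocated along the way.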

What's the difference?

It's easy to guess that the difference is in how the code was built. Starting from some version, the C# compiler began to optimize such concatenation automatically. I noticed that if the code is compiled in Visual Studio 2019 or a newer version, boxing does not happen. I then took a brief look at the situation on different platforms.

The situation with .NET Framework projects is simple. If we use MSBuild from Visual Studio 2017 or an earlier version to build a project, the boxing in concatenation is not optimized away. The version of the target platform does not matter (at least, choosing the latest version available at the time did not bring any optimizations).

.NET Core 3.1 also provides this optimization. Again, please note that it doesn't matter which TargetFramework version is set for a project. Everything depends on the SDK version.

I think the presence of the optimization for .NET 5 (and later) is not a surprise.

Runtime optimization

Some inquiring minds might suspect that the just-in-time (JIT) compiler could eliminate the boxing in concatenation. And indeed, such an optimization seems possible.

I tested this on a project for .NET Framework. Sad but true, I didn't see any optimizations. If there was boxing in the resulting IL code, it was indeed performed at runtime (the difference is very noticeable in the number of allocations).
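One possible way to observe the difference in allocations is sketched below. This is not the measurement used for the article; it assumes a runtime that provides GC.GetAllocatedBytesForCurrentThread (for example, .NET Framework 4.8 or a recent .NET version), and the names AllocationDemo and MeasureAllocatedBytes are made up for illustration.

```csharp
using System;

public static class AllocationDemo
{
    // Runs the action the given number of times and returns the number of
    // bytes allocated on the current thread while doing so.
    public static long MeasureAllocatedBytes(Action action, int iterations)
    {
        action(); // warm-up call so that one-time costs are not counted

        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static void Main()
    {
        const int iterations = 100000;
        string str = "The value is ";
        int a = 12345;

        // The explicit casts force String.Concat(object, object), i.e. boxing.
        long withBoxing = MeasureAllocatedBytes(
            () => string.Concat((object)str, (object)a), iterations);

        // The explicit ToString call leads to String.Concat(string, string).
        long withoutBoxing = MeasureAllocatedBytes(
            () => string.Concat(str, a.ToString()), iterations);

        Console.WriteLine("With boxing:    " + withBoxing + " bytes");
        Console.WriteLine("Without boxing: " + withoutBoxing + " bytes");
    }
}
```

If the runtime does not optimize the box away, the first number should be larger by roughly the size of one boxed int per iteration. The exact figures depend on the runtime and the platform.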

If you are interested in this topic and decide to dig deeper, please share your findings in the comments :). For now, I suggest we consider another interesting, related question.

Interpolation

So, we figured out how boxing works with concatenation. How do things stand with a similar operation, interpolation? After all, it seems to be almost the same thing: combining different elements into one string. Actually, it's not like that at all. First, it's worth noting that the behavior differs depending on the chosen target platform.

.NET Framework

Let's take a look at another example:

void Foo(string str, int num)
{
  _ = $"{str} {num}";
}

To be clear, I compile this code in Visual Studio 2022 without any tricks :). Let's check the result:

.method private hidebysig instance void  Foo(string str,
                                             int32 num) cil managed
{
  ....
  IL_0001:  ldstr      "{0} {1}"
  IL_0006:  ldarg.1
  IL_0007:  ldarg.2
  IL_0008:  box        [mscorlib]System.Int32
  IL_000d:  call       string [mscorlib]System.String::Format(string,
                                                              object,
                                                              object)
  IL_0012:  pop
  IL_0013:  ret
}

Well, the result is disappointing. We see that in the case of interpolation, the boxing does not go away even with a new compiler version.

Let's try calling ToString explicitly:

[Image: the interpolated string with an explicit ToString call, flagged by Visual Studio]

The IDE0071 rule built into Visual Studio suggests removing the "useless" ToString call. However, the benefit of such a call is obvious:

.method private hidebysig instance void  Foo(string str,
                                             int32 num) cil managed
{
  ....
  IL_0001:  ldarg.1
  IL_0002:  ldstr      " "
  IL_0007:  ldarga.s   num
  IL_0009:  call       instance string [mscorlib]System.Int32::ToString()
  IL_000e:  call       string [mscorlib]System.String::Concat(string,
                                                              string,
                                                              string)
  IL_0013:  pop
  IL_0014:  ret
}

There is no boxing. Moreover, there isn't even a String.Format call. The code became a concatenation of 3 strings.
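For reference, the two source variants discussed in this section can be written as follows. This is a sketch with hypothetical names (InterpolationDemo, Plain, WithToString); unlike the original Foo, the methods return the string so that the results can be compared.

```csharp
using System;

public static class InterpolationDemo
{
    // The original form. When targeting .NET Framework, it compiles to
    // String.Format with boxing of num, as in the first listing.
    public static string Plain(string str, int num) => $"{str} {num}";

    // The form with an explicit ToString call. Every interpolated element
    // is already a string, so the compiler can emit String.Concat.
    public static string WithToString(string str, int num) => $"{str} {num.ToString()}";

    public static void Main()
    {
        Console.WriteLine(Plain("value", 42));        // prints value 42
        Console.WriteLine(WithToString("value", 42)); // prints value 42
    }
}
```

The two methods produce the same text. The difference is only visible in the generated IL.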

.NET Core and .NET

Let's consider the behavior on these platforms using the same example:

void Foo(string str, int num)
{
  _ = $"{str} {num}";
}

Here, the experiments show that the optimization depends only on the target platform of the project. If the project targets .NET Core or .NET 5, the IL code is generated in the same way as for .NET Framework. In other words, there are no optimizations: the boxing is performed and then String.Format is called.

If the project targets .NET 6 or a newer version, the compilation result differs significantly:

.method private hidebysig instance void  Foo(string str,
                                             int32 num) cil managed
{
  ....
  .locals init (valuetype DefaultInterpolatedStringHandler V_0)
  IL_0000:  nop
  IL_0001:  ldloca.s V_0
  IL_0003:  ldc.i4.1
  IL_0004:  ldc.i4.2
  IL_0005:  .... DefaultInterpolatedStringHandler::.ctor(int32, int32)
  IL_000a:  ldloca.s   V_0
  IL_000c:  ldarg.1
  IL_000d:  .... DefaultInterpolatedStringHandler::AppendFormatted(string)
  IL_0012:  nop
  IL_0013:  ldloca.s V_0
  IL_0015:  ldstr " "
  IL_001a:  .... DefaultInterpolatedStringHandler::AppendLiteral(string)
  IL_001f:  nop
  IL_0020:  ldloca.s   V_0
  IL_0022:  ldarg.2
  IL_0023:  .... DefaultInterpolatedStringHandler::AppendFormatted<int32>(!!0)
  IL_0028:  nop
  IL_0029:  ldloca.s   V_0
  IL_002b:  .... DefaultInterpolatedStringHandler::ToStringAndClear()
  IL_0030:  pop
  IL_0031:  ret
}

To make this easier to read, I significantly shortened the code. To put it mildly, everything has become a little more complicated than a simple String.Format call :). Instead, the DefaultInterpolatedStringHandler structure is used to build the string. Studying the effectiveness of this approach is beyond the scope of this article. However, something clearly catches the eye (unless your eyes are tired from the amount of IL code, of course).

Pay attention to this call: DefaultInterpolatedStringHandler::AppendFormatted<int32>(!!0). In IL notation, "!!0" refers to the first generic parameter of the method, which is int32 here. Since the method is generic, the value is passed as an int32 and no boxing of the value type takes place.
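The IL above corresponds roughly to the hand-written C# below. It is a sketch that assumes a project targeting .NET 6 or later; the HandlerDemo class and the Build method are hypothetical names. In normal code, the compiler generates these calls for you.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class HandlerDemo
{
    public static string Build(string str, int num)
    {
        // 1 literal character (" ") and 2 interpolated elements,
        // matching the constructor arguments in the IL listing.
        var handler = new DefaultInterpolatedStringHandler(1, 2);

        handler.AppendFormatted(str); // non-generic AppendFormatted(string)
        handler.AppendLiteral(" ");
        handler.AppendFormatted(num); // generic AppendFormatted<int>(int): no conversion to object

        return handler.ToStringAndClear();
    }

    public static void Main()
    {
        Console.WriteLine(Build("value", 42)); // prints value 42
    }
}
```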

Well, .NET 6 rocks! :)

Conclusion

In general, if we use old versions of the compiler, the boxing in concatenation is really there. So, it is a good idea to call ToString explicitly. New versions won't perform any boxing anyway (I hope no one is going to torture candidates with such questions during job interviews).

Interpolation is protected from boxing only if the project targets .NET 6 or a later version. In other cases, calling ToString on the interpolated elements can be quite useful.

Thank you for your attention! Just a friendly reminder — I am a developer of the PVS-Studio analyzer. This tool helps find different errors in the code. If you would like to try it, you can get a free trial here. Good luck!


