Lesson 24. Phantom errors
We have finished studying the patterns of 64-bit errors, and the last thing to discuss is the ways in which these errors can reveal themselves in programs.
The trouble is that it is not as easy as it might seem to show, with a code sample like the following one, that 64-bit code fails when "N" takes large values:
size_t N = ...
for (int i = 0; i != N; ++i)
{
...
}
You may try such a simple sample and see that it works. What really matters is how the optimizing compiler builds the code: whether it works or not depends on the size of the loop body. In examples the body is always small, so 64-bit registers can be used for the counters. In real programs with large loop bodies an error easily occurs as soon as the compiler has to save the value of the "i" variable in memory. Now let us see what this somewhat cryptic explanation means in practice.
When describing the errors, we often used the term "potential error" or the phrase "an error may occur". In general, this is because one and the same code may be considered both correct and incorrect depending on its purpose. A simple example is using a variable of the "int" type to index array items. If we address an array of graphics windows with this variable, everything is fine: we never need to, and in fact simply cannot, work with billions of windows. But when a variable of the "int" type indexes array items in a 64-bit mathematical program or database, we run into trouble as soon as the number of items exceeds the range 0..INT_MAX.
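To make the difference concrete, here is a minimal sketch that is not taken from the lesson's code; it assumes a 64-bit build and a machine with enough memory for the second array:

#include <climits>
#include <cstddef>
#include <vector>

int main()
{
  // An array of GUI windows: an "int" index is perfectly adequate,
  // because the number of windows never comes close to INT_MAX.
  std::vector<int> windows(100);
  for (int i = 0; i < static_cast<int>(windows.size()); ++i)
    windows[i] = i;

  // A huge data array with more than INT_MAX items: only a memsize
  // type such as size_t can index every item correctly.
  const size_t hugeSize = static_cast<size_t>(INT_MAX) + 16;
  std::vector<char> hugeArray(hugeSize);
  for (size_t i = 0; i != hugeSize; ++i)
    hugeArray[i] = 1;
  return 0;
}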
But there is one more, subtler, reason for calling the errors "potential": whether an error reveals itself depends not only on the input data but also on the mood of the compiler's optimizer. Most of the errors we have considered in these lessons easily reveal themselves in debug versions and remain "potential" in release versions. But not every program built in the debug mode can be debugged on large amounts of data. It may well happen that the debug version is tested only on small data sets, while exhaustive testing and final user testing on real data are performed on the release version, where the errors stay hidden.
We first encountered the peculiarities of the optimizing Visual C++ 2005 compiler when preparing the OmniSample program. This project, included in the PVS-Studio distribution kit, is intended to demonstrate all the errors diagnosed by the Viva64 analyzer. The samples in this project must work correctly in the 32-bit mode and fail in the 64-bit mode. Everything was fine in the debug version, but the release version caused trouble: code that should have hung or crashed in the 64-bit mode worked! The reason lay in optimization. The way out was to complicate the samples' code with additional constructs and to add the "volatile" keyword, which you can see in the code of the OmniSample project.
The same applies to Visual C++ 2008/2010. The generated code will be slightly different, but everything written here applies both to Visual C++ 2005 and to Visual C++ 2008/2010.
If you are tempted to think it is good that some errors do not reveal themselves, put this idea out of your head. Code with such errors becomes very unstable: any subtle change, even one not related to the error directly, may alter the program's behavior. Let me stress, just in case, that it is not the compiler's fault; the reason lies in the hidden code defects. Below we will show some samples of phantom errors that disappear and reappear after subtle code changes in release versions, and hunting for them can be long and tiresome.
Consider the first code sample, which works in the release version although it should not:
int index = 0;
size_t arraySize = ...;
for (size_t i = 0; i != arraySize; i++)
array[index++] = BYTE(i);
This code correctly fills the whole array with values even if the array's size is much larger than INT_MAX. In theory this is impossible because the index variable has the "int" type: sooner or later an overflow should make it access the items at a negative index. But optimization gives us the following code:
0000000140001040 mov byte ptr [rcx+rax],cl
0000000140001043 add rcx,1
0000000140001047 cmp rcx,rbx
000000014000104A jne wmain+40h (140001040h)
As you can see, 64-bit registers are used and there is no overflow. But let us make the slightest alteration to the code:
int index = 0;
size_t arraySize = ...;
for (size_t i = 0; i != arraySize; i++)
{
array[index] = BYTE(index);
++index;
}
Suppose the code just looks nicer this way; I think you will agree that its functionality has not changed. Yet the result is quite different: a program crash. Consider the code generated by the compiler:
0000000140001040 movsxd rcx,r8d
0000000140001043 mov byte ptr [rcx+rbx],r8b
0000000140001047 add r8d,1
000000014000104B sub rax,1
000000014000104F jne wmain+40h (140001040h)
Here we get exactly the overflow that should have occurred in the previous example as well. The value of the register r8d = 0x80000000 is sign-extended into rcx as 0xffffffff80000000, and the result is a write outside the array.
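The same sign extension is easy to reproduce directly in C++; the following minimal sketch is ours, not part of the lesson's code, and the value is taken from the disassembly above:

#include <climits>
#include <cstddef>
#include <cstdio>

int main()
{
  int index = INT_MIN;      // the same bit pattern as r8d = 0x80000000
  ptrdiff_t offset = index; // sign extension, as in "movsxd rcx, r8d"
  // Prints ffffffff80000000: an offset far before the array's start.
  printf("%llx\n", static_cast<unsigned long long>(offset));
  return 0;
}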
Here is another example of optimization and how easy it is to spoil everything:
unsigned index = 0;
for (size_t i = 0; i != arraySize; ++i) {
array[index++] = 1;
if (array[i] != 1) {
printf("Error\n");
break;
}
}
This is the assembler code:
0000000140001040 mov byte ptr [rdx],1
0000000140001043 add rdx,1
0000000140001047 cmp byte ptr [rcx+rax],1
000000014000104B jne wmain+58h (140001058h)
000000014000104D add rcx,1
0000000140001051 cmp rcx,rdi
0000000140001054 jne wmain+40h (140001040h)
The compiler has decided to use the 64-bit register rdx to store the variable index. As a result, the code can correctly process an array whose size exceeds UINT_MAX items.
But the peace is fragile. Just make the code a bit more complex and it will become incorrect:
volatile unsigned volatileVar = 1;
...
unsigned index = 0;
for (size_t i = 0; i != arraySize; ++i) {
array[index] = 1;
index += volatileVar;
if (array[i] != 1) {
printf("Error\n");
break;
}
}
Because the expression "index += volatileVar;" is used instead of "index++", 32-bit registers take part in the code and overflows become possible:
0000000140001040 mov ecx,r8d
0000000140001043 add r8d,dword ptr [volatileVar (140003020h)]
000000014000104A mov byte ptr [rcx+rax],1
000000014000104E cmp byte ptr [rdx+rax],1
0000000140001052 jne wmain+5Fh (14000105Fh)
0000000140001054 add rdx,1
0000000140001058 cmp rdx,rdi
000000014000105B jne wmain+40h (140001040h)
Finally, let us consider an interesting, though rather large, example; unfortunately, we cannot make it shorter without losing the behavior we need to show you. What makes these errors especially dangerous is the impossibility of predicting what a slight change in the code might lead to.
ptrdiff_t UnsafeCalcIndex(int x, int y, int width) {
int result = x + y * width;
return result;
}
...
int domainWidth = 50000;
int domainHeght = 50000;
for (int x = 0; x != domainWidth; ++x)
for (int y = 0; y != domainHeght; ++y)
array[UnsafeCalcIndex(x, y, domainWidth)] = 1;
This code cannot fill an array of 50000*50000 items correctly, because an overflow must occur when calculating the expression "int result = x + y * width;": the product y * width reaches almost 2,500,000,000, which exceeds INT_MAX.
By a miracle, however, the array is filled correctly in the release version: the function UnsafeCalcIndex is inlined into the loop, where 64-bit registers are used:
0000000140001052 test rsi,rsi
0000000140001055 je wmain+6Ch (14000106Ch)
0000000140001057 lea rcx,[r9+rax]
000000014000105B mov rdx,rsi
000000014000105E xchg ax,ax
0000000140001060 mov byte ptr [rcx],1
0000000140001063 add rcx,rbx
0000000140001066 sub rdx,1
000000014000106A jne wmain+60h (140001060h)
000000014000106C add r9,1
0000000140001070 cmp r9,rbx
0000000140001073 jne wmain+52h (140001052h)
All this happened because the function UnsafeCalcIndex is simple and is easily inlined. But as soon as you make it a bit more complex, or the compiler decides not to inline it, an error will occur and reveal itself on large amounts of data.
Let us modify (complicate) the function UnsafeCalcIndex a bit. Note that the function's logic has not been changed in the least:
ptrdiff_t UnsafeCalcIndex(int x, int y, int width) {
int result = 0;
if (width != 0)
result = y * width;
return result + x;
}
The result is a crash caused by an access outside the array:
0000000140001050 test esi,esi
0000000140001052 je wmain+7Ah (14000107Ah)
0000000140001054 mov r8d,ecx
0000000140001057 mov r9d,esi
000000014000105A xchg ax,ax
000000014000105D xchg ax,ax
0000000140001060 mov eax,ecx
0000000140001062 test ebx,ebx
0000000140001064 cmovne eax,r8d
0000000140001068 add r8d,ebx
000000014000106B cdqe
000000014000106D add rax,rdx
0000000140001070 sub r9,1
0000000140001074 mov byte ptr [rax+rdi],1
0000000140001078 jne wmain+60h (140001060h)
000000014000107A add rdx,1
000000014000107E cmp rdx,r12
0000000140001081 jne wmain+50h (140001050h)
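The root cause in every sample above is the same: the index arithmetic is performed in a 32-bit type. A minimal sketch of one possible fix is to carry out the calculation in a memsize type such as ptrdiff_t; the name SafeCalcIndex is ours, not taken from the lesson's code:

#include <cstddef>

// Hypothetical corrected version: the multiplication and addition are
// carried out in ptrdiff_t, so no 32-bit intermediate result can overflow.
ptrdiff_t SafeCalcIndex(int x, int y, int width) {
  return static_cast<ptrdiff_t>(x) +
         static_cast<ptrdiff_t>(y) * width;
}

With such a function, the correctness of the code no longer depends on whether the optimizer keeps the index in a 64-bit register.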
I hope we have managed to show you how easily a working 64-bit program can stop working after seemingly harmless corrections are added to it or it is built with a different version of the compiler.
You should also now understand the strange constructs and peculiarities of the code in the OmniSample project, which were introduced on purpose so that the errors are demonstrated even in simple examples built with optimization enabled.
The course authors: Andrey Karpov (karpov@viva64.com), Evgeniy Ryzhkov (evg@viva64.com).
The copyright holder of the course "Lessons on development of 64-bit C/C++ applications" is OOO "Program Verification Systems". The company develops software in the field of source code analysis. The company's site: http://www.viva64.com.