The appearance of 64-bit processors on the PC market made developers face the task of porting old 32-bit applications to the new platforms. After the migration of the application code, it is highly probable that the code will work incorrectly. This article reviews questions related to software verification and testing. It also describes the difficulties a developer of 64-bit Windows applications may face and the ways of solving them.
The appearance of 64-bit processors is the next step in the evolution of computer technologies. However, one can get all the advantages of the new 64-bit hardware only by using the new instruction set and registers. For programs written in C/C++ this means the necessity of recompilation. During this operation the sizes of data types change, and that causes unexpected errors when these applications work on 64-bit systems [1].
Problems occurring during code conversion are typical mainly of applications written in low-level programming languages such as C and C++. In languages with a precisely structured type system (for example, .NET Framework languages) these problems, as a rule, do not occur.
Let's pose a task. It is necessary to make sure that after recompilation a 64-bit application behaves the same way as the 32-bit one (except for the obvious architectural changes). We will call the process of testing the workability of the 64-bit version of a program 'verification'.
In the next part of the article we describe the main methods of testing and verification of applications. Those who are familiar with these methods may skip the next section and go to the third part, which deals with the peculiarities of using these methods for 64-bit systems.
There are various approaches to ensuring the correctness of application code; some of them can be automated and others cannot. Those which cannot be automated are manual code review, white-box testing, manual testing, etc. Static code analysis and black-box testing are methods which can be automated. Let's examine these methods in detail.
The oldest, most approved and reliable approach to error search is code review. This method is based on joint reading of the code while observing some rules and recommendations [2]. Unfortunately, this practice cannot be used for wide testing of modern program systems because of their large size. Although this method provides the best results, it is not always used in modern software development life cycles, where the time of development and product release is a very important factor. That's why code review usually takes the form of rare meetings whose aim is to teach new and less experienced employees to write quality code rather than to test the workability of particular modules. This is a very good way of raising a programmer's skill level, but it cannot be treated as a complete means of quality assurance.
Static code analyzers help developers who realize the necessity of regular code review but don't have enough time for it [3]. Their main purpose is to reduce the amount of code which must be examined by a programmer and thus to reduce review time. Static code analyzers are a large class of programs implemented for different programming languages and offering a varied set of functions, from the simplest code alignment checks to complex analysis of potentially dangerous places. The systematic use of static analyzers allows great improvement of code quality and helps find many errors. The static analysis approach has many supporters, and many interesting articles are devoted to it. The advantage of this approach is that it can be used regardless of the complexity and size of the program solution being developed.
Dynamic code analysis is software analysis performed while executing programs on a real or virtual processor. Dynamic analysis is often understood as the examination of program code aiming at its optimization, but we will treat dynamic analysis as a method of program testing.
Dynamic analysis doesn't allow finding many errors, because it is often impossible to execute the whole program code, or because the sequence of its execution differs greatly from the real system. Besides, dynamic analysis imposes a computational burden during execution. That's why the thorough (i.e. computationally complex) gathering of profiling information is usually postponed until the end of the profiled program's execution. All this does not make the method attractive, especially when you need to test an application with large data sizes, which is where 64-bit systems are most often used.
White-box testing is the execution of the maximum number of accessible code branches with the help of a debugger or other means. The more code coverage achieved, the fuller the testing provided. White-box testing is also sometimes understood as simple debugging in order to find a certain bug. Full testing of the whole program code by the white-box method became impossible long ago due to the enormous size of modern programs. Nowadays the white-box method is convenient at the step when an error has been found and you need to find out the reason which caused it. The white-box method has opponents who deny the efficiency of real-time program debugging. Their main argument is that the possibility of watching the program work and simultaneously making changes in it encourages an unacceptable 'cut and try' style of programming based on a great number of code corrections. We won't touch upon these disputes, but will note that white-box testing is in any case a very expensive way to improve the quality of large and complex program systems.
The black-box method has a better reputation. Unit testing may also be treated as black-box testing. The main idea of the method consists in writing a set of tests for separate modules and functions which check all the main modes of their work. Some sources classify unit testing as a white-box method because it is based on familiarity with the program structure. But functions and modules should be treated as black boxes, because unit tests should not take into account the inner organization of a function. This is supported by the development methodology in which tests are written before the functions themselves, which improves the control of their functionality from the specification viewpoint.
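To illustrate the idea, here is a minimal sketch of such a test. The Find function and the plain assert-based checks are assumptions made only for this example; any unit testing framework would serve the same purpose.

#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical function under test: returns the index of Value
// in Data, or -1 if Value is absent.
ptrdiff_t Find(const std::vector<int> &Data, int Value)
{
  for (size_t i = 0; i != Data.size(); ++i)
    if (Data[i] == Value)
      return (ptrdiff_t)i;
  return -1;
}

// The unit test checks the main modes of work ("found" and
// "not found") without looking inside the function.
void TestFind()
{
  std::vector<int> data;
  data.push_back(1);
  data.push_back(2);
  data.push_back(3);
  assert(Find(data, 2) == 1);
  assert(Find(data, 7) == -1);
}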
Unit testing has earned a good reputation in the development of simple projects as well as complex ones. One of the advantages of unit testing is that it makes it possible to check the correctness of changes made in the program immediately during development. Programmers try to arrange things so that all the tests run within a few minutes, so that a developer who has made corrections in the code can notice an error immediately and fix it. If running all the tests that quickly is impossible, the long tests are usually launched separately, for example, at night. This also contributes to quick error detection, by the next morning at least.
Manual testing is probably the final step of any development, but it shouldn't be treated as a good and reliable method. Manual testing must exist because it is impossible to detect all errors in automatic mode or through code review. But if a program is of low quality and has a lot of inner defects, its testing and correction may drag on for too long, and still it will be impossible to provide the appropriate quality. The only way to get a quality program is quality code. That's why we won't consider manual testing a complete method for the development of large projects.
So, what deserves the greatest attention during the development of large program systems? These are static analysis and unit tests. They can improve the quality and reliability of program code and deserve the greatest attention, although one shouldn't forget about the other methods, of course.
Let's go on to the problem of testing 64-bit programs, because the use of the methods we have chosen confronts us with some unpleasant difficulties.
Strange as it may seem, static analyzers appeared to be badly prepared to detect errors in 64-bit programs, despite all their great possibilities and their long history of development and use. Let's examine the situation using the example of C++ code analysis, the sphere where static analyzers are used most. Many static analyzers support a set of rules related to detecting code which behaves incorrectly during migration to 64-bit systems, but they do it rather uncoordinatedly and incompletely. This became especially evident when the wide development of applications for the 64-bit version of the Windows operating system in the Microsoft Visual C++ 2005 environment began.
This may be explained by the fact that most checks are based on rather old research materials on the problems of porting programs to 64-bit systems from the viewpoint of the C language. As a result, some constructions which appeared in the C++ language were not taken into account from the portability control point of view and were not implemented in the analyzers [4]. Besides, some other changes were not taken into account either, for example the greatly increased RAM size and the use of different data models by different compilers. A data model is a correlation of the sizes of basic types in a programming language (see Table 1). 64-bit Unix systems use the LP64 or ILP64 data models, while 64-bit Windows uses the LLP64 model. You can learn about data models in detail in [5].
| Type | ILP32 | LP64 | LLP64 | ILP64 |
|---|---|---|---|---|
| char | 8 | 8 | 8 | 8 |
| short | 16 | 16 | 16 | 16 |
| int | 32 | 32 | 32 | 64 |
| long | 32 | 64 | 32 | 64 |
| long long | 64 | 64 | 64 | 64 |
| size_t, ptrdiff_t | 32 | 64 | 64 | 64 |
| pointers | 32 | 64 | 64 | 64 |

Table 1. Sizes of data types (in bits) in different data models.
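If you are not sure which data model your compiler uses, a short check like the following minimal sketch prints the sizes of the key types; it assumes nothing beyond standard C++.

#include <iostream>

int main()
{
  // LLP64 (Win64): long is 4 bytes, size_t and pointers are 8 bytes.
  // LP64 (64-bit Unix): long, size_t and pointers are all 8 bytes.
  std::cout << "long    : " << sizeof(long)   << " bytes\n";
  std::cout << "size_t  : " << sizeof(size_t) << " bytes\n";
  std::cout << "pointer : " << sizeof(void *) << " bytes\n";
  return 0;
}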
To see this clearly, let's examine several examples.
double *BigArray;
int Index = 0;
while (...)
  BigArray[Index++] = 3.14;
It is hard to get a diagnostic warning on such code by means of static analysis, and that is no wonder. The given code doesn't make an ordinary developer suspect anything, as he is accustomed to using variables of int and unsigned types as array indexes. Unfortunately, this code won't work on a 64-bit system if the size of the BigArray array exceeds four billion items. In that case an overflow of the Index variable will occur and the result of the program execution will be incorrect. The correct variant is to use the size_t type when programming for Windows x64 (LLP64 data model) or the size_t/unsigned long types when programming for Linux (LP64 data model).
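A corrected variant of the loop might look like the following minimal sketch (the FillArray wrapper is added only to make the fragment self-contained); the essential change is that the index is declared as size_t, whose width matches the pointer width in both the LP64 and LLP64 data models.

#include <cstddef>

void FillArray(double *BigArray, size_t Count)
{
  // size_t is 64 bits wide on 64-bit systems, so the index
  // cannot overflow before reaching Count.
  for (size_t Index = 0; Index != Count; ++Index)
    BigArray[Index] = 3.14;
}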
The reason why static analyzers cannot diagnose such code is probably that, at the time when questions of migration to 64-bit systems were under research, hardly anyone could imagine arrays of more than 4 billion items. And 4 billion items of double type is 4 * 8 = 32 GB of memory for one array. That is an enormous size, especially if we take into account the time: 1993-1995. It was in that period that most of the issues and discussions devoted to the use of 64-bit systems took place.
As a result, nobody paid attention to possible incorrect indexing with the int type, and later the migration problems were rather rarely researched.
Let's examine another example.
char *pointer;
long g = (long)(pointer);
With the help of this simple example you can check which data models the static analyzer you use can understand. The problem is that most of them are meant for the LP64 data model only. Again, this is due to the history of 64-bit systems development. It is the LP64 data model that gained the highest popularity at the first stages of 64-bit systems development and is now widely used in the Unix world. The long type in this data model has a size of 8 bytes, which means that this code is absolutely correct there. However, 64-bit Windows systems use the LLP64 data model, in which the long type remains 4 bytes, so the given code is incorrect. In such cases the LONG_PTR or ptrdiff_t types should be used on Windows.
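A portable variant of the code above might look like this minimal sketch; it stores the pointer in ptrdiff_t, whose width matches the pointer width in both data models (LONG_PTR or intptr_t would serve equally well on Windows).

#include <cstddef>

char *pointer = NULL;
// ptrdiff_t is 8 bytes on both LP64 and LLP64 systems,
// so the pointer value is not truncated.
ptrdiff_t g = (ptrdiff_t)(pointer);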
Fortunately, the given code will be detected as dangerous even by the Microsoft Visual C++ 2005 compiler itself. But you should always keep such traps in mind while using static analyzers.
We now have an interesting situation. The question of porting programs to 64-bit systems was discussed in detail, different methods and rules of checking by static analyzers were applied, and after that interest in the theme was lost. Many years have passed and many things have changed, but the rules according to which the analysis is performed remain unchanged and unmodified. It is difficult to say why. Perhaps developers simply don't notice the changes, supposing that the question of testing 64-bit applications was solved long ago. But what was relevant 10 years ago may not be so now, and many new things have appeared. If you use a static analyzer, make sure that it is compatible with the 64-bit data model that you use. If the analyzer doesn't meet the necessary demands, don't be lazy to search for another one or to fill the gap with a highly specialized analyzer. The effort spent on this will be compensated by increased program reliability and reduced time of debugging and testing.
For Unix systems with the LP64 data model such an analyzer may be one of the well-known tools Gimpel Software PC-Lint or Parasoft C++test, and for Windows with the LLP64 model the specialized analyzer Viva64 [6].
Now let's speak about unit tests. Developers who use them on 64-bit systems will also face some unpleasant moments. Aiming to reduce test execution time, one tries to use a small amount of computation and data while developing the tests. For example, when a test for an array item searching function is developed, it doesn't matter whether it processes 100 or 10,000,000 items: a hundred items will be enough, and compared with processing 10,000,000 items the test will run much faster. But if you want full tests to check this function on a 64-bit system, you will need to process more than 4 billion items! Does it seem to you that if the function works with 100 items it will work with billions too? No. Here is a sample code, which you can try on a 64-bit system.
#include <iostream>
#include <cstdlib>
#include <tchar.h>

// Searches for Value among every fifth item of Array.
// Note: the 'unsigned' loop counter is kept deliberately;
// it is the source of the error discussed below.
bool FooFind(char *Array, char Value,
  size_t Size)
{
  for (unsigned i = 0; i != Size; ++i)
    if (i % 5 == 0 && Array[i] == Value)
      return true;
  return false;
}

#ifdef _WIN64
  const size_t BufSize = 5368709120ui64; // 5 GB, possible only on Win64
#else
  const size_t BufSize = 5242880;        // 5 MB
#endif

int _tmain(int, _TCHAR *[]) {
  char *Array =
    (char *)calloc(BufSize, sizeof(char));
  if (Array == NULL) {
    std::cout << "Error allocating memory";
    return 1;
  }
  if (FooFind(Array, 33, BufSize))
    std::cout << "Find";
  free(Array);
  return 0;
}
The incorrectness of the code lies in the occurrence of an infinite loop: the counter variable 'i' never exceeds UINT_MAX, so when Size is greater than UINT_MAX the condition 'i != Size' is never fulfilled.
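A corrected variant of the function, shown below as a minimal sketch, declares the counter with the same type as Size, so the loop can terminate for any valid Size value.

#include <cstddef>

bool FooFind(char *Array, char Value,
  size_t Size)
{
  // size_t matches the width of Size on both 32-bit and 64-bit
  // systems, so the counter can reach any value of Size.
  for (size_t i = 0; i != Size; ++i)
    if (i % 5 == 0 && Array[i] == Value)
      return true;
  return false;
}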
As the example shows, you shouldn't rely on old sets of unit tests if your program begins to process large amounts of data on a 64-bit system. You should expand the tests to take into account the processing of large data amounts.
Unfortunately, it is not enough to create new tests. Here we face the problem of the execution speed of the modified test set covering the processing of large amounts of data. The first consequence is that you won't be able to add such tests into the set of tests launched by a programmer during development. Adding them to the nightly tests may also cause difficulties: the total execution time of all the tests may increase by one or two orders of magnitude, or even more. As a result a test run may last more than 24 hours. You should keep this in mind and treat the rework of tests for the 64-bit version of a program very seriously.
The way out is to divide all the tests into several groups which are launched simultaneously on several computers. You can also use multiprocessor systems. Of course, this complicates the testing system a bit and requires additional hardware resources, but it is the most correct and, in the end, the simplest way to solve the task of building a unit testing system.
Surely, you will need to use an automated testing system which allows you to launch the tests on several computers. An example is the AutomatedQA TestComplete automated testing system for Windows applications. With its help you can perform distributed testing of applications on several workstations, with synchronization and gathering of the results.
In the end, we would like to return to the white-box testing method, which we considered unacceptable for large systems. We should add that this method becomes even less acceptable for debugging 64-bit applications which process large arrays. Debugging of such applications may take much more time or be difficult on developers' computers. That's why you should consider using logging systems for debugging applications, as well as other methods such as remote debugging when several computers are used for debugging.
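Even a very simple logging helper often localizes a defect in such applications faster than stepping through billions of iterations in a debugger. The sketch below is only an illustration of the idea; the LogWrite function, the LOG macro and the log file name are assumptions made for this example.

#include <cstdio>

// Appends a message together with its source location to a log file.
void LogWrite(const char *file, int line, const char *msg)
{
  FILE *f = fopen("app_trace.log", "a");
  if (f != NULL)
  {
    fprintf(f, "%s(%d): %s\n", file, line, msg);
    fclose(f);
  }
}

#define LOG(msg) LogWrite(__FILE__, __LINE__, (msg))

// Usage example: LOG("FooFind: entering the search loop");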
To sum it up, we would like to say that you shouldn't rely on only one method. A quality application may be developed only when several of the discussed approaches to testing and verification are used. What is more, you should think about these methods before you start to port the code to the new architecture, so that you can control the application quality from the very beginning.
Summarizing the problems of developing and testing 64-bit systems, we would like to remind you of some key points:

- static analysis and unit tests deserve the greatest attention when developing large program systems, though no single method should be relied on alone;
- make sure the static analyzer you use supports the data model of your target platform (LLP64 for 64-bit Windows, LP64 for most 64-bit Unix systems);
- extend unit tests so that they cover the processing of large amounts of data, and plan for the greatly increased execution time, for example by distributing the tests across several computers;
- step-by-step debugging of applications processing large arrays becomes impractical, so prefer logging and remote debugging.