Testing parallel software is more complicated than testing an ordinary sequential program. The programmer must be aware both of the traps awaiting him in parallel code and of the existing methodologies and tools.
In software development we single out the step of testing and debugging an application. The notions of testing and debugging are often treated as nearly the same thing: when an error is found during testing, correcting it usually turns into debugging, and debugging, in its turn, can be regarded as "white-box" testing. Sometimes the debugging step is even understood as the simultaneous search for and correction of errors.
Still, we'll distinguish between these two notions and concentrate on testing parallel software in this article.
Testing is the process of detecting errors in software with the help of various tools and through analysis of the program's behavior by end users [1].
Debugging is the process by which the programmer locates, in the code, the errors detected during testing in order to correct them. Debugging usually involves specialized tools for tracking the program's state during execution.
Debugging parallel programs is a thankless task that demands precision and specialized tools. Many articles are devoted to debugging parallel programs, and even more should be, since the topic grows ever more pressing with the rapid development of multicore systems and of new technologies for creating parallel programs. The available toolkits also have their problems. But before you can debug an error, you must first find it, and some methods not only find errors but also locate them. That is why we will consider testing first.
Testing applications with parallel processing is not a simple task. Parallelization errors are difficult to detect because of the non-deterministic behavior of parallel applications. Even when an error is detected, it is often hard to reproduce. And after the code is modified, it is not easy to make sure the error has really been corrected rather than merely concealed. In other words, errors in a parallel program are typical "heisenbugs".
A heisenbug is a programming term for an error that disappears or changes its properties when you try to detect it [2]. The name is a pun on the Heisenberg uncertainty principle from quantum mechanics, popularly understood as the act of observing an object changing the object itself. Examples of such errors are bugs that show up in the release build of a program but cannot be reproduced in debug mode, or synchronization errors in a multithreaded application.
Thus, the task of testing parallel software largely comes down to creating diagnostic tools that influence the program's behavior as little as possible, or to deliberately creating the conditions under which an error will manifest itself. With this in mind, let us look at the classic methodologies from a new viewpoint.
Methods of searching for errors in parallel applications, as in sequential ones, can be divided into dynamic analysis, static analysis, model-based testing, and formal proof of program correctness.
Formal proof of a program's correctness is a very complicated procedure, especially in parallel programming, and it is rarely used in industrial software development.
Dynamic analysis requires launching the application and executing various sequences of operations aimed at revealing incorrect program behavior. A sequence of operations may be defined either by a person during manual testing or by tools that implement load testing or, for example, data-integrity checks. For parallel programs the most interesting tools are ones like Intel Thread Checker, which instrument the application's code with monitoring and logging facilities that allow you to detect deadlocks (both actual and potential), hangs, races, and so on [3 [RU]].
Static analysis handles only the application's source code, without launching it. Its advantage is detailed and complete coverage of the analyzed code. We should note, though, that in parallel programming static analysis is used rather rarely and is represented by a handful of tools that are research prototypes rather than commercial products.
Model-based testing is the automatic generation of tests according to defined rules. It allows you to formally demonstrate the absence of defects in the code fragment under test on the basis of data-conversion rules defined by the developer. One example is the KISS tool developed at Microsoft Research for parallel C programs.
All the methods mentioned have disadvantages, so none of them alone can be relied upon when developing parallel programs.
Dynamic analysis requires running the program, is sensitive to the execution environment, and significantly slows the application down. Covering all the parallel code with tests is difficult, and a race condition can often be detected only if it actually occurred in the given run. So even if a dynamic analysis tool reports no errors, you still cannot be sure there are none.
Static analysis of parallel applications is very complicated: it is often impossible to predict the program's behavior because the admissible set of input values for different functions, and the ways they may be called, are unknown. These values can be inferred from the rest of the code, but only within limits, since the space of possible states grows enormous and the amount of information to examine becomes unacceptably large. Besides, static analysis tools often produce many false positives and demand great effort to reduce their number.
In practice, model-based testing works only for small basic blocks of an application. In most cases it is very difficult to build a model automatically from the code, while creating models by hand is a laborious and resource-intensive operation: in effect, you have to write the same data-conversion algorithms over again in a different representation. As with static analysis, the state space quickly explodes. Its growth can be partially controlled by reduction methods with complex heuristic logic, but then some defects will be missed. An example of a model-based tool for testing parallel applications is Zing.
As we can see, no single approach to testing parallel applications is sufficient on its own, so a good solution is to apply several methods to one and the same software product.
Since the choice for increasing performance fell on multicore processors, development toolkits entered a new round of advancement. For multicore systems with shared memory it is more convenient to use programming technologies such as OpenMP than the more familiar MPI or the standard parallelization facilities provided by operating systems (fork, beginthread).
New technologies need new supporting tools. You should now watch closely the new parallel-programming support systems appearing on the software market: they can significantly simplify your work and help you adapt your applications more quickly for efficient use of a parallel environment.
As was said before, static analysis of parallel programs is complicated and inefficient, since too much information about possible program states has to be stored. This is absolutely true when such technologies as MPI, or parallelization by operating-system means, are used.
The situation is better with OpenMP, where efficient static analysis with good characteristics can often be implemented. The reason is that OpenMP is designed for parallelizing isolated code sections. OpenMP, so to say, lets you make a program parallel "in small parts" by placing special directives in the most performance-critical sections. As a result, the parallel code is grouped together and does not depend on other parts of the application, which makes high-quality analysis possible.
Until recently, static analysis of OpenMP programs was practically undeveloped; we can cite only one example, the Sun Studio compiler, which performs rather good diagnostics. The VivaMP static analyzer has now occupied this niche. It is a specialized tool for finding errors in parallel programs written in C and C++ with OpenMP [5].
The analyzer both detects explicit errors and warns about potentially unsafe code. To show how errors are diagnosed, consider an example of one variable being written to from parallel threads without the necessary synchronization:
int sum1 = 0;
int sum2 = 0;
#pragma omp parallel for
for (size_t i = 0; i < n; ++i)
{
  sum1 += array[i]; // V1205: unsynchronized access from parallel threads
  #pragma omp atomic
  sum2 += array[i]; // Fine: the update is performed atomically
}
And to show how potentially unsafe code is diagnosed, consider an example of applying flush to a pointer. It is easy to make a mistake and forget that the flush operation will be applied to the pointer itself, not to the data it refers to. As a result, the code below may be either correct or incorrect:
int *ptr;
...
#pragma omp flush(ptr) // V1202: only the pointer itself is flushed
int value = *ptr;
The next version of the VivaMP analyzer will add checks that detect inefficient parallel code, for example, critical sections where the quicker atomic directive would suffice:
#pragma omp critical
{
  a++; // Slow: a full critical section for a single increment
}

#pragma omp atomic
a++; // Good: an atomic update is much cheaper
Different forms of parallelism have existed in the software world for a long time. But to create mass commercial applications that use the capabilities of multicore processors to the full extent, we need development means different from those used for sequential applications. We hope that this article throws light on the difficulties of developing parallel applications, and that the programmer will be most careful when choosing the means of developing and testing such programs.