TDD is one of the most popular software development techniques. I like this technique in general, and we employ it to some extent. The main thing is not to go to extremes when using it. You shouldn't rely on it alone and forget other methods of improving software quality. In this article, I will show how programmers who practice TDD can use static code analysis to further protect themselves against errors.
Test-driven development (TDD) is a software development technique based on repeating very short development cycles. First you write a test that covers the change you want to introduce, then you write the code that passes the test, and finally you refactor the new code to meet the relevant standards. I won't dwell on what TDD is: there are many articles on the subject, and they are easy to find on the Internet.
I think it's especially important not to get carried away with writing numerous tests when using TDD. Tests let you create an illusion of intense activity: you write a huge number of code lines per day, yet the product's functionality grows very slowly. You may spend almost all your time and effort on test code. Moreover, tests are sometimes labor-intensive to maintain when the functionality changes.
That's why we don't use TDD in its pure form when developing PVS-Studio. If we wrote tests for individual functions, development time would grow dozens of times over. The reason is this: to call a function that expands a type in a typedef or performs some code analysis, we have to prepare quite a lot of input data. We also need to build a correct fragment of the parse tree in memory and fill in a lot of structures. All this takes too much time.
We use another technique. Our TDD tests are small C/C++ code fragments marked in a special way. First we write down various situations in which certain warnings must be generated. Then we start implementing the code that detects them. In rough outline, these tests look something like this:
int A() {
  int x;
  return x; //Err
}
This test checks that the program generates a warning about the use of an uninitialized variable. The warning isn't generated at first, of course. We implement the diagnostic and then add new tests for special cases.
int B() {
  static int x;
  return x; //Ok
}
Everything is fine here: the variable is static and therefore zero-initialized, so no warning is expected.
This is, of course, not a canonical way of using TDD. But it's the result that matters, not the form, isn't it? The idea is the same: we start with a set of tests that fail, then implement the diagnostic, write new tests, refactor, and so on.
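To make the idea of marked tests more concrete, here is a minimal sketch of how such a check could be automated. It is not our actual test system: the function names and the idea of passing the reported line numbers as a set are assumptions made for this illustration only.

```cpp
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: collects the 1-based numbers of the lines
// that carry the "//Err" marker.
std::set<int> ExpectedWarningLines(const std::string& source) {
    std::set<int> lines;
    std::istringstream in(source);
    std::string line;
    int number = 0;
    while (std::getline(in, line)) {
        ++number;
        if (line.find("//Err") != std::string::npos) {
            lines.insert(number);
        }
    }
    return lines;
}

// Hypothetical check: the test passes only if warnings were reported
// on exactly the marked lines. A missed "//Err" line or a warning on
// an "//Ok" line makes the test fail.
bool MarkedTestPasses(const std::string& source,
                      const std::set<int>& reported) {
    return ExpectedWarningLines(source) == reported;
}
```

With the two fragments above as input, such a check would fail while the analyzer reports nothing and pass once it reports the line marked with //Err and nothing else.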
TDD in its pure form cannot be used everywhere, and ours is such a case. If you want to use this methodology but find it inconvenient, try looking at it from a higher level of abstraction. We think we've managed that.
A huge number of tests may give you a false sense of safety, which leads programmers to relax their control over code quality. TDD lets you detect many defects at the development stage, but never all of them. Don't forget the other testing methodologies.
When studying the source code of many open-source applications, I constantly notice the same two drawbacks in the way unit tests are used. TDD has other drawbacks too, but I won't discuss them now; at least, they don't attract my attention as much.
So, these are the two typical problems with tests:
1) Tests themselves are not tested.
2) Tests don't check rare critical cases.
Writing tests for tests really is too much. But we should keep in mind that a test is program code too, and errors may occur there as well. Quite often tests only pretend to check something.
What can you do? At the very least, use additional tools for code quality control. These may be dynamic or static code analyzers. They don't guarantee detection of all the errors in tests, of course, but using several tools in combination produces very good results.
For example, I often come across errors in test code when running PVS-Studio on a new project. Here is an example taken from the Chromium project.
TEST(SharedMemoryTest, MultipleThreads) {
  ....
  int threadcounts[] = { 1, kNumThreads };
  for (size_t i = 0;
       i < sizeof(threadcounts) / sizeof(threadcounts); i++) {
  ....
}
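The test body must be run first in one thread and then in several threads. Because of a misprint, the parallel operation of the algorithm is never tested. The error is here: sizeof(threadcounts) / sizeof(threadcounts). The array size is divided by itself, so the expression always equals 1 and the loop runs only for the first element, that is, for a single thread.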
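The intended expression was most likely sizeof(threadcounts) / sizeof(threadcounts[0]). A way to rule out this kind of misprint altogether is to let the compiler deduce the number of elements. The sketch below shows the idea; ArraySize, CountConfigurations, and the value of kNumThreads are placeholders invented for this example and are not taken from Chromium. Since C++17 the standard library offers std::size for the same purpose.

```cpp
#include <cstddef>

// Deduces the element count from the array type at compile time.
// Passing a pointer instead of an array does not compile.
template <typename T, std::size_t N>
constexpr std::size_t ArraySize(T (&)[N]) {
    return N;
}

// Counts how many thread configurations the loop visits.
// kNumThreads is a placeholder value chosen for this sketch.
std::size_t CountConfigurations() {
    const int kNumThreads = 5;
    int threadcounts[] = { 1, kNumThreads };
    std::size_t visited = 0;
    for (std::size_t i = 0; i < ArraySize(threadcounts); i++) {
        ++visited;
    }
    return visited;
}
```

Here the loop visits both configurations, the single-threaded one and the multi-threaded one.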
The following principle will protect you against mistakes in tests to a large extent. A freshly written test must fail: this shows that the test really checks something. Only after that may you start implementing the new functionality.
However, this doesn't always prevent errors in tests. The code shown above will also fail at first, since the error affects only the number of parallel threads to be launched.
We have some more examples. A typical mistake when comparing buffers is confusing the pointer size with the buffer size: quite often the size of the pointer is computed instead of the size of the buffer. Such errors may look something like this:
bool Test()
{
  char *buf = new char[10];
  FooFoo(buf);
  bool ok = memcmp(buf, "1234567890", sizeof(buf)) == 0;
  delete [] buf;
  return ok;
}
This test works only halfway: it compares just the first 4 or 8 bytes, depending on the pointer size. The test may look fine and correct, but don't trust it.
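A possible repair is to derive the number of bytes from the expected data rather than from the pointer. In the sketch below, FooFoo is a hypothetical stand-in that deliberately gets the last two characters wrong, so the difference is visible: the original check would accept this output on both 32-bit and 64-bit platforms, while the corrected one rejects it.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the function under test: it gets the first
// eight characters right and the last two wrong.
void FooFoo(char* buf) {
    std::memcpy(buf, "12345678XX", 10);
}

bool Test() {
    const char expected[] = "1234567890";
    // Length of the string without the terminating null character.
    const std::size_t size = sizeof(expected) - 1;
    char* buf = new char[size];
    FooFoo(buf);
    // Compare the whole buffer, not sizeof(char*) bytes.
    bool ok = std::memcmp(buf, expected, size) == 0;
    delete [] buf;
    return ok;
}
```

Note that sizeof is applied here to an array, not to a pointer, so it yields the size of the data.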
Another weak point of TDD is the absence of tests for critical situations. You can create such tests, of course, but doing so is unreasonably labor-intensive. For instance, it takes a lot of effort to make malloc() return NULL exactly when needed, while the benefit is small. The probability of this situation may be lower than 0.0001%. So you have to find a compromise between the completeness of the tests and the effort needed to implement them.
Let's play with numbers a bit. Assume malloc() is called in 1000 places in the code, and let the probability of a memory shortage at each call be 0.0001%. Let's calculate the probability of a memory allocation error during program execution:
(1 - 0.999999^1000) * 100% = 0.09995%
The probability of a memory shortage is approximately 0.1%. It's wasteful to write 1000 tests for these cases. On the other hand, 0.1% is not that small: some users will definitely run into it. How can you make sure these situations are handled correctly?
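If you want to try other numbers, the same estimate can be written as a small function. This is only a sketch of the formula above; it assumes that allocation failures are independent of each other, and the function name is made up for the example.

```cpp
#include <cmath>

// Probability (in percent) that at least one of `calls` independent
// allocations fails, given the per-call failure probability in percent.
double FailurePercent(double perCallPercent, int calls) {
    const double p = perCallPercent / 100.0;
    return (1.0 - std::pow(1.0 - p, calls)) * 100.0;
}
```

Calling FailurePercent(0.0001, 1000) reproduces the value of about 0.1% obtained above.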
That's a difficult question. Writing unit tests is too expensive. Dynamic analyzers are unsuitable for the same reason: they require you to create a situation where the program runs out of memory at particular moments. Manual testing is out of the question.
There are two ways. You may use special tools that make certain system functions return an error code when called. I have never dealt with such systems myself, so I can't say how simple, efficient, and safe they are.
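Just to illustrate the general idea of such fault injection, and not any particular tool, here is a hand-made sketch. It assumes the code under test allocates memory through a wrapper, which is itself a design decision not every project can afford. TestableMalloc, g_allocationsBeforeFailure, and DuplicateString are names invented for this example.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Number of successful allocations left before the wrapper starts
// failing; a negative value means "never fail".
static long g_allocationsBeforeFailure = -1;

// Hypothetical wrapper the code under test calls instead of malloc().
void* TestableMalloc(std::size_t size) {
    if (g_allocationsBeforeFailure == 0) {
        return nullptr;  // simulated memory shortage
    }
    if (g_allocationsBeforeFailure > 0) {
        --g_allocationsBeforeFailure;
    }
    return std::malloc(size);
}

// Example code under test: returns a heap copy of `text`,
// or nullptr if memory could not be allocated.
char* DuplicateString(const char* text) {
    const std::size_t size = std::strlen(text) + 1;
    char* copy = static_cast<char*>(TestableMalloc(size));
    if (copy == nullptr) {
        return nullptr;
    }
    std::memcpy(copy, text, size);
    return copy;
}
```

A test can set g_allocationsBeforeFailure to 0 and check that DuplicateString returns a null pointer instead of crashing. Doing this for every allocation site is exactly the labor discussed above.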
Another way is to use a static code analyzer. This tool doesn't care how often a particular program branch is executed: it checks almost all of the code. The word "almost" is there because C/C++ programs may contain "#ifdef" blocks and explicitly disabled branches (through "if(0)"), whose contents are better left unmentioned.
Here is an example of a bug in an error handler detected through static analysis:
VTK_THREAD_RETURN_TYPE vtkTestCondVarThread( void* arg )
{
  ....
  if ( td )                  // <=
  {
    ....
  }
  else
  {
    cout << "No thread data!\n";
    cout << " Thread " << ( threadId + 1 )
         << " of " << threadCount << " exiting.\n";
    -- td->NumberOfWorkers;  // <=
    cout.flush();
  }
  ...
}
If the error occurs, a message is printed and the variable "td->NumberOfWorkers" is modified. This must not be done, because the 'td' pointer is null in this branch.
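I don't know how the authors would prefer to fix this code, so the following is only a sketch of one possible shape for such a handler: the error branch reports the problem and never touches the pointer. ThreadData and FinishWorker are reduced, hypothetical stand-ins, and decrementing the counter in the valid branch is an assumption that takes the place of the elided work.

```cpp
#include <iostream>

// Hypothetical, reduced stand-in for the thread data used in the test.
struct ThreadData {
    int NumberOfWorkers;
};

// Returns true if the thread had data to work with.
bool FinishWorker(ThreadData* td, int threadId, int threadCount) {
    if (td) {
        // Assumption: the counter is updated only when td is valid.
        --td->NumberOfWorkers;
        return true;
    }
    // Error branch: report and leave, td is never dereferenced here.
    std::cout << "No thread data!\n";
    std::cout << "  Thread " << (threadId + 1)
              << " of " << threadCount << " exiting.\n";
    std::cout.flush();
    return false;
}
```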
This is my summary of the article:
1. TDD is a wonderful technique. It's worth spending some time studying it and starting to use it in your work. If classic TDD doesn't suit you, don't abandon the methodology right away. Perhaps you will be able to use it if you apply it a bit differently or at a higher level of abstraction.
2. Don't get obsessed with it. Ideal methodologies don't exist. In practice, tests cover far from all of the code, and tests themselves are error-prone too. Use other testing methods as well: load testing, static code analysis, and dynamic code analysis.