>
>
On one of the code quality metrics

Evgenii Ryzhkov
Articles: 125

On one of the code quality metrics

There are different metrics used in programming, including metrics for code equality estimation. One of these is the error density metric. One might think it enables one to find out exactly whether or not a certain code is quality. Is it so?

The density of errors in code is quite simple to calculate. You need to take the number of errors and divide it by the number of code lines. For example, if the code contains 6 errors per 100 lines, the error density is 6/100=0.06. This is surely a rather poor code - at the level of a lab work by a novice student.

When using static code analysis one feels the urge to use this metric regularly. Assume, for instance, a static code analyzer generates 10 messages per 1000 code lines. Does it tell you anything about the quality of the code being checked? Unfortunately, it tells you not that much as you might think. Let's see why.

Consider a simple example containing two potential buffer overflows:

int _tmain(int argc, _TCHAR* argv[]) {
  char buf[4];

  printf("Enter your name: ");
  scanf("%s", buf);
  printf("Name: %s\n", buf);

  printf("Enter your surname: ");
  scanf("%s", buf);
  printf("Surname: %s\n", buf);

  return 0;
}

Buffer overflows are possible (and will most likely occur) in the lines with scanf. The error density is: 2/10=0.2.

But if we take out the reading operation as a single function, the "errors" will become fewer, i.e. we'll have only one:

void my_scanf(char buf[]) {
  scanf("%s", buf);
}

int _tmain(int argc, _TCHAR* argv[]) {
  char buf[4];

  printf("Enter your name: ");
  my_scanf(buf);
  printf("Name: %s\n", buf);

  printf("Enter your surname: ");
  my_scanf(buf);
  printf("Surname: %s\n", buf);

  return 0;
}

Although the program has become no safer, the error density is lower almost twice: 1/13= 0.08! At the same time, those two potential buffer overflows are still there in the code.

Of course, it doesn't mean that the error density metric is useless. The mentioned effect will be compensated for in large code amounts. But remember to be careful with this metric.