What comments hide

20 Sep 2012

Much has been said about the benefits and harm of comments in program code, and no consensus has been reached yet. We decided to look at comments from a different angle: can they serve as an indication of hidden errors for a programmer studying the code?

While investigating errors in different projects, we noticed that programmers sometimes see a defect but cannot track down its cause. Suspicion then falls on the compiler: my colleague recently discussed this effect in the article "The compiler is to blame for everything". As a result, programmers add crutches to the code and leave comments next to them. These comments are often obscene.

We decided this was an interesting subject to investigate. Reviewing files manually or with an ordinary word-by-word search is long and tiresome, so we wrote a utility that searches ".c" and ".cpp" files for suspicious comments using a dictionary of "suspicious words". The dictionary includes, for example, such words as fuck, bug, stupid, compiler.
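The utility itself is not shown in this article, so the following is only a minimal sketch of the idea, not our actual implementation. The function names `lineComment` and `isSuspicious` are hypothetical, the dictionary is assumed to hold lowercase words, and the comment extraction is deliberately naive: it only looks for "//" and knows nothing about string literals or /* ... */ comments.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the text after the first "//" on the line,
// or an empty string if the line has no such comment.
std::string lineComment(const std::string& line) {
  const std::size_t pos = line.find("//");
  return pos == std::string::npos ? std::string() : line.substr(pos + 2);
}

// Returns true if the comment on the line contains any dictionary word.
// The match is a case-insensitive substring search; dictionary words are
// assumed to be non-empty and already lowercase.
bool isSuspicious(const std::string& line,
                  const std::vector<std::string>& dictionary) {
  std::string comment = lineComment(line);
  std::transform(comment.begin(), comment.end(), comment.begin(),
                 [](unsigned char c) {
                   return static_cast<char>(std::tolower(c));
                 });
  for (const std::string& word : dictionary) {
    assert(!word.empty());  // an empty word would match every line
    if (comment.find(word) != std::string::npos) return true;
  }
  return false;
}
```

A substring match like this is exactly why such a search produces so many "uninteresting" lines: any comment that mentions a bug tracker or a compiler option is reported too.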

The utility produced a lot of lines with comments of that kind. Picking out the fragments really worth considering was a hard and tiresome task, and we found little of interest, much less than we had expected.

The goal of the search was to find new patterns of mistakes programmers make. Unfortunately, all the defects we found either cannot be diagnosed by static code analysis at all or are already detected by PVS-Studio.

But a negative result is a result too. Most likely we will conclude that searching for strange comments is a dead end: it is too labor-intensive and catches too few bugs.

But since the investigation has been carried out, we've decided to show you a couple of examples.

For example, consider this code:

// Search for EOH (CRLFCRLF)
const char* pc = m_pStrBuffer;
int iMaxOff = m_iStrBuffSize - sizeof(DWORD);
for (int i = 0; i <= iMaxOff; i++) {
  if (*(DWORD*)(pc++) == 0x0A0D0A0D) {
    // VC-BUG?: '\r\n\r\n' results in 0x0A0D0A0D too,
    //although it should not!
    bFoundEOH = true;
    break;
  }
}

As the comment "// Search for EOH (CRLFCRLF)" shows, the programmer wanted to find the byte sequence 0D,0A,0D,0A (CR == 0x0D, LF == 0x0A). The four bytes are read as a single DWORD, and on a little-endian machine the first byte in memory becomes the least significant one, so the search constant is 0x0A0D0A0D.

The program does not seem to cope well with a different order of carriage return and line feed characters. This is what confused the author, as the comment shows: "// VC-BUG?: '\r\n\r\n' results in 0x0A0D0A0D too, although it should not!". So why does the algorithm find not only the {0D,0A,0D,0A} sequence, but the {0A,0D,0A,0D} sequence too?

It's simple. The search algorithm moves through the array byte by byte. So when it comes across a long sequence like {0A,0D,0A,0D,0A,0D,0A,...}, it steps over the first 0A byte, and the four bytes starting at the next position are {0D,0A,0D,0A}: a match, though not the one the programmer wanted.
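The effect is easy to reproduce. The sketch below is an illustration under our own assumptions, not the code of the project in question: the hypothetical function `findEoh` slides a four-byte window over the buffer one byte at a time, as the original loop does, but compares bytes with `memcmp` instead of casting to DWORD so that the result does not depend on byte order or alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns the offset of the first {0D,0A,0D,0A} window found while sliding
// over the buffer one byte at a time, or -1 if there is no such window.
long findEoh(const unsigned char* buf, std::size_t size) {
  static const unsigned char kEoh[4] = {0x0D, 0x0A, 0x0D, 0x0A};
  assert(buf != nullptr || size == 0);
  for (std::size_t i = 0; i + sizeof(kEoh) <= size; ++i) {
    if (std::memcmp(buf + i, kEoh, sizeof(kEoh)) == 0) {
      return static_cast<long>(i);
    }
  }
  return -1;
}
```

For the buffer {0A,0D,0A,0D,0A} the function reports a match at offset 1, while for the four bytes {0A,0D,0A,0D} alone it reports none. The "wrong" sequence is found only because a longer run of alternating bytes contains the right one.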

Unfortunately, such defects are impossible to catch by static analysis.

Here is one more example of strange code:

TCHAR szCommand[_MAX_PATH * 2];
LPCTSTR lpsz = (LPCTSTR)GlobalLock(hData);
int commandLength = lstrlen(lpsz);
if (commandLength >= _countof(szCommand))
{
  // The command would be truncated.
  //This could be a security problem
  TRACE(_T("Warning: ........\n"));
  return 0;
}
// !!! MFC Bug Fix
_tcsncpy(szCommand, lpsz, _countof(szCommand) - 1);
szCommand[_countof(szCommand) - 1] = '\0';
// !!!

In this case "MFC Bug Fix" is simply untrue, because there is no error in MFC here. Written in this form, the code cannot cause errors, but perhaps an earlier version contained only this line: '_tcsncpy(szCommand, lpsz, _countof(szCommand) - 1);'. In that case the error did exist: a string of exactly _countof(szCommand) - 1 characters would be left without a terminating null. However, correct string copying can be implemented in a shorter way:

_tcsncpy(szCommand, lpsz, _countof(szCommand));

Functions like 'strncpy' pad the destination with terminating nulls automatically if the source string is shorter than the value specified in the counter. This is exactly our case, since the check above guarantees it. Cases of incorrect string copying are well detected by PVS-Studio, so we haven't learned anything new.
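This behavior can be checked with a short sketch. It uses plain `char` and `std::strncpy` as a stand-in for `_tcsncpy`, and the helper name `terminatedAfterStrncpy` and the 16-byte buffer are our assumptions for the illustration. The buffer is filled with non-zero bytes first, so any null found afterwards was written by `strncpy` itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies src into the first `count` chars of a local buffer with strncpy
// and reports whether those `count` chars contain a terminating null.
bool terminatedAfterStrncpy(const char* src, std::size_t count) {
  char buf[16];
  assert(count <= sizeof(buf));
  std::memset(buf, 'x', sizeof(buf));  // make sure no stray nulls exist
  std::strncpy(buf, src, count);
  return std::memchr(buf, '\0', count) != nullptr;
}
```

With a counter of 8, the strings "abc" and "abcdefg" end up null-terminated, while "abcdefgh" (exactly 8 characters) does not. This is why the length check before the copy must use '>=' and not '>'.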

Conclusion

We haven't managed to find any new error patterns to add to the database of errors detected by our static analyzer. Still, this was a good experience in investigating alternative methods of software defect detection. For some time we will continue studying comments in the new projects we get for analysis. We also plan to make some improvements to the search utility:

  • implement a simple syntactic analysis to decrease detections of "uninteresting" lines;
  • extend the dictionary with new expressions.

Perhaps this program can be useful when you "inherit" a large project with a long code history and would like to see what your predecessors didn't like there.
