Let the world tremble! We've released PVS-Studio 4.00 with a general-purpose analyzer!

Programmers, meet a new tool to search for errors in source code of software written in C/C++. Within the scope of the PVS-Studio analyzer, we implemented a new set of general-purpose rules.

You may download PVS-Studio here http://www.viva64.com/en/pvs-studio/download/.

The article briefly describes new features of PVS-Studio and demonstrates the usage of new diagnostic capabilities with the example of static analysis of the TortoiseSVN project's source code.

PVS-Studio is a state-of-the-art source code analyzer integrated into Microsoft Visual Studio 2005/2008/2010 environment. The analyzer lets you conveniently handle the warning list and enables multi-core processing for analysis. PVS-Studio is meant for developers of modern Windows-software in C/C++/C++0x.

Until now, PVS-Studio included two rule sets. The first was intended for detecting issues in 64-bit software while the second was intended for detecting issues in parallel software based on the OpenMP technology.

Now our analyzer has a third universal set of rules that allows detecting various errors in code. We are only starting our way in the sphere of general-purpose static analysis.

Currently you may study the new set of rules by downloading PVS-Studio 4.00 BETA. We will be glad to receive feedback from you concerning bugs and your wishes of how we could improve the tool. I would like to note it right away that we implemented only 50 general-purpose rules for a start. It is not too much, so if you fail to find any interesting issues in your code after you have downloaded and tried PVS-Studio, please do not jump to conclusions. We suggest that you try PVS-Studio in future when the set of diagnoses is greatly extended. We intend to significantly enlarge the base of diagnostic rules soon (if we have enough health and luck).

Let us demonstrate the use of PVS-Studio's new rule set by the example of TortoiseSVN. TortoiseSVN is a client for the Subversion version control system implemented as a Windows plugin. TortoiseSVN is well known by many developers, so I think there is no need to describe this application in detail. I only want to note that TortoiseSVN was acknowledged as the best project in the category "tools and utilities for developers" in year 2007 on SourceForge.net.

Step 1

Download PVS-Studio from OOO "Program Verification Systems" company's site (that's us). I hope you will appreciate that you do not have to fill in any forms or solve captchas. Simply download the tool.

Step 2

Install PVS-Studio. Do not hesitate to click the "Next" button because you do not have to set anything. The PVS-Studio package is signed with a digital signature. However, some antiviruses might get alerted about integration of PVS-Studio into Visual Studio. So you should allow any activity if asked by your antivirus.

Step 3

Download the package of source code of the TortoiseSVN project. We employed version 1.6.11 of source code.

Step 4

Open the TortoiseSVN project and launch the analysis by choosing the Check Solution command in the PVS-Studio menu.

Step 5

The analyzer will think a bit (the TortoiseSVN project is rather complex and includes a lot of files), so do not perform any actions and just wait for a while. The progress dialogue will appear soon and the analysis will start. The analysis' speed depends upon the number of processor cores in your computer. If PVS-Studio consumes too many resources, you may restrain its appetite in the settings.

The analyzer generates messages in its own window that has controls for enabling/disabling different types of messages. We will use these capabilities because we are not interested in a large number of errors related to 64 bits now.

In the window, you may see a group of three buttons responsible for displaying messages of the three rule sets.

1) 64 is responsible for displaying diagnostic messages about 64-bit issues (Viva64);

2) MP shows diagnostic messages about parallel defects (VivaMP);

3) GA shows the General Analysis diagnostic messages.

Now we are interested only in the general analysis rule set. Uncheck the other buttons and the unnecessary messages will be also hidden in the list.

Wait for the analysis to finish.

Step 6

Analysis is over and we may see the list of all the fragments in the program where code review is necessary.

All the warnings are arranged according to 3 priority levels (this is a new feature in PVS-Studio 4.00). Usually you have to review only messages of the 1-st and 2-nd levels. PVS-Studio 4.00 BETA generated 33 warnings of the 1-st level, 14 warnings of the 2-nd level and 8 warnings of the 3-rd level.

You'd better start examining the messages with the first level warnings. So you may uncheck the button responsible for displaying messages of the second level. The third level is disabled by default.

Step 7

Let's examine interesting code fragments the analyzer has found.

Case 1

In the beginning, there are two messages at once that refer to the same function. I hope that this function is not used too often in the code.

V530 The return value of function 'empty' is required to be utilized. contextmenu.cpp 434

V530 The return value of function 'remove' is required to be utilized. contextmenu.cpp 442

STDMETHODIMP CShellExt::Initialize(....)
{
  ...
  ignoredprops.empty();
  for (int p=0; p<props.GetCount(); ++p)
  {
    if (props.GetItemName(p).
          compare(SVN_PROP_IGNORE)==0)
    {
      std::string st = props.GetItemValue(p);
      ignoredprops = UTF8ToWide(st.c_str());
      // remove all escape chars ('\\')
      std::remove(ignoredprops.begin(),
                  ignoredprops.end(), '\\');
      break;
    }
  }
  ...
}

Here and further we will give only brief comments on code fragments. To learn more why these code fragments are considered unsafe, refer to PVS-Studio's online-documentation. The PVS-Studio distribution package also includes documentation in the pdf format (it is absolutely the same as the online-documentation). Further we will give links to the corresponding descriptions of the diagnostic messages.

The V530 message warns us that "ignoredprops.empty()" does not clear the string at all while "std::remove()" will never remove the characters.

Case 2

Here it is checked whether a variable of the 'char' type is above or equal to value 0x80.

V547 Expression 'c >= 0x80' is always false. The value range of signed char type: [-128, 127]. pathutils.cpp 559

CString CPathUtils::PathUnescape (const char* path)
{
  // try quick path
  size_t i = 0;
  for (; char c = path[i]; ++i)
    if ((c >= 0x80) || (c == '%'))
    {
      // quick path does not work for
      // non-latin or escaped chars
      std::string utf8Path (path);
      CPathUtils::Unescape (&utf8Path[0]);
      return CUnicodeUtils::UTF8ToUTF16 (utf8Path);
    }
  ...
}

The V547 message tells you that such a check is meaningless. A 'char'-value is always below 0x80, so the condition above is always false. Perhaps it is because of this error that developers left the comment "quick path does not work for non-latin or escaped chars". Surely it does not, but not because of the code failing to convert the string: when it encounters a non-latin character, we simply cannot get inside the 'if' operator's body.

Case 3

Many threads are spawned and terminated by functions CreateThread/ExitThread. Therefore we risk quickly overflowing a thread's stack or failing to release some resources when a thread is terminated. For more information, refer to the V513 warning's description. It is much safer to use functions _beginthreadex() and _endthreadex() for these purposes.

There is no need to give the code sample here, and the text of all the messages is the same:

V513 Use _beginthreadex/_endthreadex functions instead of CreateThread/ExitThread functions. crashhandler.cpp 379

Case 4

It refers to TortoiseSVN companion utilities. It is highly probable that when you handle CrashLog, you will encounter another Crash.

V510 The 'printf_s' function is not expected to receive class-type variable as fourth actual argument. excprpt.cpp 199

string CExceptionReport::getCrashLog()
{
  ...
  _tprintf_s(buf, _T("%s\\%s.xml"),
  getenv("TEMP"), CUtility::getAppName());
  ...
}

The V510 message warns you that it is a bad thing to pass an argument of the std::string type into the printf_s function. But it is std::string that the CUtility::getAppName() function returns. The error here is that the programmer forgot to write ".c_str()". This may result either in an incorrect data output or program crash.

Case 5

The developers intended to clear an array here but failed.

V530 The return value of function 'empty' is required to be utilized. mailmsg.cpp 40

CMailMsg& CMailMsg::SetFrom(string sAddress,
                            string sName)
{
   if (initIfNeeded())
   {
      // only one sender allowed
      if (m_from.size())
         m_from.empty();
      m_from.push_back(TStrStrPair(sAddress,sName));
   }
   return *this;
}

Again, the V530 message tells us that it is "empty()" which is accidentally written instead of "clear()".

Case 6

In the SetTo() function, the developers also failed to clear an array.

V530 The return value of function 'empty' is required to be utilized. mailmsg.cpp 54

CMailMsg& CMailMsg::SetTo(string sAddress,
                          string sName)
{
   if (initIfNeeded())
   {
      // only one recipient allowed
      if (m_to.size())
         m_to.empty();

      m_to.push_back(TStrStrPair(sAddress,sName));
   }
   return *this;
}

Case 7

Of course the analyzer generates false alarms as well. For instance, this code fragment from the zlib library is included into the TortoiseSVN project. There is no error here but it is rather helpful to mark such a fragment with the V501 warning.

V501 There are identical sub-expressions to the left and to the right of the '-' operator: size - size zutil.c 213

voidpf zcalloc (opaque, items, size)
    voidpf opaque;
    unsigned items;
    unsigned size;
{
  /* make compiler happy */
  if (opaque) items += size - size;
  return (voidpf)calloc(items, size);
}

Of course the compiler feels happy here, but the subtraction operation looks suspicious.

Case 8

There are other grey areas concerning encodings. Here is one more condition which is always false.

V547 Expression '* utf8CheckBuf >= 0xF5' is always false. The value range of signed char type: [-128, 127]. tortoiseblame.cpp 312

BOOL TortoiseBlame::OpenFile(const TCHAR *fileName)
{
  ...
  // check each line for illegal utf8 sequences.
  // If one is found, we treat
  // the file as ASCII, otherwise we assume
  // an UTF8 file.
  char * utf8CheckBuf = lineptr;
  while ((bUTF8)&&(*utf8CheckBuf))
  {
    if ((*utf8CheckBuf == 0xC0)||
        (*utf8CheckBuf == 0xC1)||
        (*utf8CheckBuf >= 0xF5))
    {
      bUTF8 = false;
      break;
    }
   ...
  }

By the way, conditions "*utf8CheckBuf == 0xC0" and "*utf8CheckBuf == 0xC1" are always false too. That is why the code "bUTF8 = false;" will never get control. The fact that the PVS-Studio analyzer kept silent about the "*utf8CheckBuf == 0xC0" expression is its drawback. We have noted it and will make the analyzer to scold this issue in the next version.

Case 9

The next message is not so simple: in theory we have an error but in practice everything works well.

V507 Pointer to local array 'stringbuf' is stored outside the scope of this array. Such a pointer will become invalid. mainwindow.cpp 277

LRESULT CALLBACK CMainWindow::WinMsgHandler(....)
{
  ...
  if (pNMHDR->code == TTN_GETDISPINFO)
  {
    LPTOOLTIPTEXT lpttt;

    lpttt = (LPTOOLTIPTEXT) lParam;
    lpttt->hinst = hResource;

    // Specify the resource identifier of the
    // descriptive text for the given button.
    TCHAR stringbuf[MAX_PATH] = {0};
    ...
    lpttt->lpszText = stringbuf;
  }
  ...
}

The V507 message warns you that the object is being used after it has been destroyed. The 'stringbuf' buffer will be used after exiting the 'if' operator's body.

If 'stringbuf' was a class object (for instance, of std::string class), the code would behave incorrectly. We would use an already destroyed object in that case. But here 'stringbuf' is an array created in the stack. The Visual C++ compiler does not use this stack fragment once again, so the buffer will continue to exist until the 'CMainWindow::WinMsgHandler' function terminates. Thus, there is no error yet this code is potentially dangerous.

Case 10

Here is one more fragment like the previous one. Again we have code that works but it is rather fragile.

V507 Pointer to local array 'stringbuf' is stored outside the scope of this array. Such a pointer will become invalid. picwindow.cpp 443

if ((HWND)wParam == m_AlphaSlider.GetWindow())
{
  LPTOOLTIPTEXT lpttt;

  lpttt = (LPTOOLTIPTEXT) lParam;
  lpttt->hinst = hResource;
  TCHAR stringbuf[MAX_PATH] = {0};
  _stprintf_s(stringbuf, .....);
  lpttt->lpszText = stringbuf;
}

Case 11

It is a bad idea to throw exceptions and not to handle them in the destructor.

V509 The 'throw' operator inside the destructor should be placed within the try..catch block. Raising exception inside the destructor is illegal. cachefileoutbuffer.cpp 52

CCacheFileOutBuffer::~CCacheFileOutBuffer()
{
  if (IsOpen())
  {
    streamOffsets.push_back (GetFileSize());
    size_t lastOffset = streamOffsets[0];
    for (size_t i = 1,
         count = streamOffsets.size();
         i < count; ++i)
    {
      size_t offset = streamOffsets[i];
      size_t size = offset - lastOffset;

      if (size >= (DWORD)(-1))
        throw CStreamException("stream too large");

      Add ((DWORD)size);
      lastOffset = offset;
    }

    Add ((DWORD)(streamOffsets.size()-1));
  }
}

The V509 message warns you that if the CcacheFileOutBuffer object is destroyed while an exception is being handled, a new exception will cause a program crash.

Case 12

One more meaningless comparison.

V547 Expression 'endRevision < 0' is always false. Unsigned type value is never < 0. cachelogquery.cpp 999

typedef index_t revision_t;
...
void CCacheLogQuery::InternalLog (
   revision_t startRevision
 , revision_t endRevision
 , const CDictionaryBasedTempPath& startPath
 , int limit
 , const CLogOptions& options)
{
  ...
  // we cannot receive logs for rev < 0
  if (endRevision < 0)
    endRevision = 0;
  ...
}

There simply cannot be any negative values here. The endRevision variable has the unsigned type and therefore endRevision is always above or equal to 0. The potential issue here is that if a negative value has been cast to unsigned type somewhere earlier, we will start handling a very large number. This cannot be checked.

Case 13

There are no more useful messages of the first level. Well, for now. It is only the first step in trying PVS-Studio's capabilities. We intend to develop not fewer than 150 issues we must teach our tool to detect. I would like to thank once again those readers who responded to our previous articles and sent us samples where one can detect errors with the help of static analysis at the stage of writing the code.

Let's look at the second level. We found one interesting error related to using the Copy-Paste method. By the way, there was nothing interesting at the third level, so it is not for nothing that we disable it by default.

V524 It is odd that the 'GetDbgHelpVersion' function is fully equivalent to the 'GetImageHlpVersion' function (SymbolEngine.h, line 98). symbolengine.h 105

BOOL GetImageHlpVersion(DWORD &dwMS, DWORD &dwLS)
{
  return(GetInMemoryFileVersion(("DBGHELP.DLL"),
                                dwMS,
                                dwLS)) ;
}

BOOL GetDbgHelpVersion(DWORD &dwMS, DWORD &dwLS)
{
  return(GetInMemoryFileVersion(("DBGHELP.DLL"),
                                dwMS,
                                dwLS)) ;
}

The V524 message is generated if the analyzer finds two suspiciously similar functions. It is most likely that the first function must get the "imagehlp.dll" version of the file instead of "dbghelp.dll ".

Step 8

Now we must fix the errors we have found. This step is clear and we will skip it.

Concerning the errors found, we will report them to TortoiseSVN's developers.

Step 9

Now let's speak a bit about false alarms. Let me give you some examples to explain what false alarms are and how we can deal with them.

Here is the first false alarm: PVS-Studio did not understand the playing with the operation of memory copying.

V512 A call of the 'memcpy' function will lead to a buffer overflow or underflow. resmodule.cpp 838

const WORD*
CResModule::CountMemReplaceMenuExResource(....)
{
  ...
  if (newMenu != NULL) {
    CopyMemory(&newMenu[*wordcount], p0, 7 * sizeof(WORD));
  }
   ...
}

The V512 warning informs you that we have a buffer underflow or, vice versa, a buffer overflow. The analyzer made a mistake this time having suggested that we want to copy 7 objects but intend to handle only one object of the WORD type.

Here is the second false alarm. The analyzer suggests that we processed only a part of the array.

V512 A call of the 'memcmp' function will lead to a buffer overflow or underflow. sshsha.c 317

static int sha1_96_verify(....)
{
  unsigned char correct[20];
  sha1_do_hmac(handle, blk, len, seq, correct);
  return !memcmp(correct, blk + len, 12);
}

Right, it is only a part of the 'correct' array participating in the comparison operation but we made it intentionally.

The third example of a false alarm.

V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. tree234.c 195

static void *add234_internal(....)
{
  ...
  if ((c = t->cmp(e, n->elems[0])) < 0)
    childnum = 0;
  else if (c == 0)
    return n->elems[0]; /* already exists */
  else if (n->elems[1] == NULL
       || (c = t->cmp(e, n->elems[1])) < 0)
         childnum = 1;
  else if (c == 0)
    return n->elems[1]; /* already exists */
  else if (n->elems[2] == NULL
       || (c = t->cmp(e, n->elems[2])) < 0)
         childnum = 2;
  else if (c == 0)
    return n->elems[2]; /* already exists */
  else
    childnum = 3;
  ...
}

The analyzer does not like that the check 'c == 0' is present in code several times. The code is correct since the 'c' variable is changed inside the other conditions "c = t->cmp(e, n->elems[2])". But this situation is rather rare. Usually the V517 message points to real defects in code.

We will not consider all the rest of false alarms because there is nothing interesting about them. A programmer can easily understand that they are false alarms and he does not have to examine them too closely.

You may handle false alarms in several ways:

1) You may rewrite the code. Sometimes it is rather reasonable. Refactoring would be very helpful for the last sample with a false alarm (I mean the add234_internal function and warning V517).

2) You may disable some diagnoses in the settings which always produce false alarms in your projects. After you disable them, all the corresponding messages will disappear from the warning list. For more details, see "Settings: Detectable Errors".

3) If false alarms refer to code that does not need to be checked, you may exclude separate files or folders from analysis. You may also use masks. For details, see "Settings: Don't Check Files". This method is convenient for excluding third-party libraries from analysis.

4) You may use the mechanism of suppressing messages containing particular text. For details, see "Settings: Message Suppression".

5) There are cases when you should suppress a particular false alarm. Then you may use the "Mark as False Alarm" option. If you use it, the analyzer adds a small comment of the "//-Verror_code" kind into the code. You may hide code fragments marked with this comment by the FA button that enables or disables displaying the marked messages. For details, see: "False alarm suppression".

Thank you for your attention. Try PVS-Studio. Send us your feedback. Ask us questions. Give us your interesting samples. Offer new diagnostic rules we could implement.

Sincerely yours, Andrey Karpov, one of the PVS-Studio's developers.

You may contact us on page "Feedback".

Or by e-mail: support[@]viva64.com , karpov[@]viva64.com.

Write to us, we have a wonderful support service. It is me, the author of this article, who will personally participate in communication unlike most companies where you deal with some abstract human-robot.