To get a trial key
fill out the form below
Team License (standard version)
Enterprise License (extended version)
* By clicking this button you agree to our Privacy Policy statement

Request our prices
New License
License Renewal
--Select currency--
USD
EUR
GBP
RUB
* By clicking this button you agree to our Privacy Policy statement

To get the licence for your open-source project, please fill out this form
* By clicking this button you agree to our Privacy Policy statement

To get the licence for your open-source project, please fill out this form
* By clicking this button you agree to our Privacy Policy statement

I am interested to try it on the platforms:
* By clicking this button you agree to our Privacy Policy statement

Message submitted.

Your message has been sent. We will email you at


If you haven't received our response, please do the following:
check your Spam/Junk folder and click the "Not Spam" button for our message.
This way, you won't miss messages from our team in the future.

>
>
>
Static analysis of source code by the e…

Static analysis of source code by the example of WinMerge

Oct 30 2010
Author:

The today's post is devoted to the question why tools of static source code analysis are helpful regardless of programmer's knowledge and skill. I will demonstrate the benefit of static analysis by the example of the tool known to every programmer - WinMerge.

0082_WinMerge/image1.png

The earlier the developer finds an error in application code, the cheaper it is to fix it. From this we conclude that it's cheapest and easiest to eliminate an error while writing the code. The best way is certainly just to write without errors at all: imagine that you are only going to make an error but you slap your hand with the other and go on writing correct code. Still we don't manage to do that, do we? So, the approach "you should write without errors" doesn't work anyway.

Even a highly skilled programmer who takes his time makes errors from common misprints to errors in algorithms. It is the law of large numbers that works in this case. Does it seem to you that one can't make a mistake in every particular "if" operator? But I carried out such an experiment and wrote 200 comparisons - I did make an error once. Andrey Urazov discussed this thing in his interesting lecture "Quality-oriented programming" at the CodeFest 2010 conference. I would like to cite his thought that however skilled developers are, errors will appear in code all the same. You just can't stop making them. But you may successfully fight many of them at much earlier stages of development process than usually.

Usually the first level of error defense is creating unit-tests for the newly written code. Sometimes tests are written earlier than the code they are intended to check. However, unit-tests have some disadvantages that I'm not going to discuss in detail here because all the programmers are aware of them. It is not always easy to create a unit-test for a function that requires a complicated procedure of preliminarily preparing the data. Unit-tests become a burden if the project requirements change rapidly; tests consume a lot of time to write and support; it is not always easy to cover all the program branches with tests, etc. Moreover, you may get a solid project "as a present" that merely has no unit-tests and they were not intended at all. Without denying the great benefit of unit-tests, I still think that although it is a good defense level, we can and must improve it greatly.

Programmers usually neglect an even earlier defense level - static code analysis. Many developers utilize capabilities of static code analysis without leaving the scope of diagnostic warnings generated by compilers. However, there is a wide range of tools that allow you to detect a significant part of logical errors and common misprints at the coding stage already. These tools perform a higher-level code check relying on knowledge of some coding patterns, use heuristic algorithms and provide for a flexible settings system.

Of course, static analysis has its own disadvantages: it just cannot detect many types of errors; analyzers produce false alarms and make you modify code so that they like it and consider safe.

But there are huge advantages as well. Static analysis covers all the program branches regardless of how often they are used. It doesn't depend on execution stages. You may check even incomplete code or you may check a large amount of code you inherited from some developer. Static analysis is quick and well scalable unlike dynamic analysis tools.

So you have read a lot of words about static analysis of source code. Now it's time for practice. I want to take one application in C++ and try to find errors in it.

I wanted to choose something small and widely known. Since I don't use too many tools, I just looked through the "Programs" list in the "Start" menu and decided to take WinMerge. The WinMerge application is open-source and it is small (about 186000 lines). Its quality is rather high. I'm saying this relying on my experience - I have no complaints about it and I like that comments occupy 25% of its source code (it is a good sign). So, it is a good choice.

I downloaded the latest available version 2.13.20 (from 20.10.2010). I used the prototype of a general-purpose analyzer we are developing now. Let me tell you a bit more about it.

Currently, the PVS-Studio static analyzer includes two rule sets. One of them is intended to detect 64-bit defects and the other is intended to check OpenMP programs. Now we are developing a general-purpose set of rules. We haven't got even a beta-version yet but some code already works and I'm very eager to have a real war against errors. We intend to make the new rule set free, so please don't write that we are indulging in self-advertisement. The new tool will be presented to the community in 1-2 months as a part of PVS-Studio 4.00.

So, here are some interesting issues I detected in WinMerge-2.13.20's code during a half an hour (15 minutes for analysis, 15 minutes to review the results). There are also some other suspicious fragments but it demands some efforts to make it out if they are really errors or not. My current task is not to find as many defects in one project as possible; I just want to make a nice demonstration of benefits static analysis provides and show how to quickly detect some errors through even superficial examination.

The first sample. The analyzer pointed to several errors "V530 - The return value of function 'Foo' is required to be utilized". These warnings are usually generated for inappropriately used functions. Study this code fragment:

/**
* @brief Get the file names on both sides for specified item.
* @note Return empty strings if item is special item.
*/
void CDirView::GetItemFileNames(int sel,
  String& strLeft, String& strRight) const
{
  UINT_PTR diffpos = GetItemKey(sel);
  if (diffpos == (UINT_PTR)SPECIAL_ITEM_POS)
  {
    strLeft.empty();
    strRight.empty();
  }
  else
  {
     ...
  }
}

The function must return two empty strings in a particular case. But due to programmer's inattention, it it's the std::string::empty() functions which are called instead of std::string::clear(). By the way, this error is not so rare as it may seem - I encountered it in many other projects. This error is also present in another WinMerge's function:

/**
* @brief Clear variant's value (reset to defaults).
*/
void VariantValue::Clear()
{
  m_vtype = VT_NULL;
  m_bvalue = false;
  m_ivalue = 0;
  m_fvalue = 0;
  m_svalue.empty();
  m_tvalue = 0;
}

Again we don't get the expected clearing of the string.

And here we have the warning "V501 - There are identical sub-expressions to the left and to the right of the '||' operator":

BUFFERTYPE m_nBufferType[2];
...
// Handle unnamed buffers
if ((m_nBufferType[nBuffer] == BUFFER_UNNAMED) ||
    (m_nBufferType[nBuffer] == BUFFER_UNNAMED))
  nSaveErrorCode = SAVE_NO_FILENAME;

If we review the code nearby, we conclude by analogy that we must have the following lines in our fragment:

(m_nBufferType[0] == BUFFER_UNNAMED) ||
(m_nBufferType[1] == BUFFER_UNNAMED)

If it is not so, still there is some error here.

When various crashes occur, WinMerge tries to report about errors but fails in most cases. By the way, it is a good example of how a code analyzer can detect errors in rarely used program fragments. There are several errors in the code PVS-Studio reports about with the following warning: "V510 - The 'Format' function is not expected to receive class-type variable as 'N' actual argument". Study this code sample:

String GetSysError(int nerr);
...
CString msg;
msg.Format(
_T("Failed to open registry key HKCU/%s:\n\t%d : %s"),
f_RegDir, retVal, GetSysError(retVal));

Everything seems good at first. But the "String" type is actually "std::wstring" and therefore we will have some rubbish printed at best, or an access violation error at worst. It is an object of the "std::wstring" type which is put into the stack instead of a string-pointer. Read the post "Big Brother helps you" where I described this error in detail. The correct code must have a call with c_str():

msg.Format(
_T("Failed to open registry key HKCU/%s:\n\t%d : %s"),
f_RegDir, retVal, GetSysError(retVal).c_str());

Let's go further. Here we have a suspicious code fragment. I don't know if there is really an error, but it is strange that two branches of the "if" operator contain absolutely the same code. The analyzer warns about it with the diagnostic message "V532 - The 'then' statement is equivalent to the 'else' statement". Here is this suspicious code:

if (max < INT_MAX)
{
  for (i = min; i < max; i++)
  {
    if (eptr >= md->end_subject ||
        IS_NEWLINE(eptr))
      break;
    eptr++;
    while (eptr < md->end_subject &&
           (*eptr & 0xc0) == 0x80)
      eptr++;
    }
  }
else
{
  for (i = min; i < max; i++)
  {
    if (eptr >= md->end_subject ||
        IS_NEWLINE(eptr))
      break;
    eptr++;
    while (eptr < md->end_subject &&
           (*eptr & 0xc0) == 0x80)
      eptr++;
    }
  }
}

I feel that "this humming is no accident".

OK, let's study one more sample and get finished with the post. The analyzer found a suspicious loop: "V534 - It is likely that a wrong variable is being compared inside the 'for' operator. Consider reviewing 'i'." This is the source code:

// Get length of translated array of bytes from text.
int Text2BinTranslator::iLengthOfTransToBin(
  char* src, int srclen )
{
  ...
    for (k=i; i<srclen; k++)
    {
      if (src[k]=='>')
        break;
    }
  ...
}

This code is inclined to Access Violation. The loop must continue until the '>' character is found or the string with the length of 'srclen' characters ends. But the programmer used by accident the 'i' variable instead of 'k' for comparison. If the '>' character is not found, the consequences are likely to be bad.

Summary

Don't forget about static analysis. It may often help you find some peculiar issues even in good code. I also invite you to visit our site some time later to try our free general-purpose analyzer when it is ready.

Popular related articles
Appreciate Static Code Analysis!

Date: Oct 16 2017

Author: Andrey Karpov

I am really astonished by the capabilities of static code analysis even though I am one of the developers of PVS-Studio analyzer myself. The tool surprised me the other day as it turned out to be sma…
Characteristics of PVS-Studio Analyzer by the Example of EFL Core Libraries, 10-15% of False Positives

Date: Jul 31 2017

Author: Andrey Karpov

After I wrote quite a big article about the analysis of the Tizen OS code, I received a large number of questions concerning the percentage of false positives and the density of errors (how many erro…
The Last Line Effect

Date: May 31 2014

Author: Andrey Karpov

I have studied many errors caused by the use of the Copy-Paste method, and can assure you that programmers most often tend to make mistakes in the last fragment of a homogeneous code block. I have ne…
The Ultimate Question of Programming, Refactoring, and Everything

Date: Apr 14 2016

Author: Andrey Karpov

Yes, you've guessed correctly - the answer is "42". In this article you will find 42 recommendations about coding in C++ that can help a programmer avoid a lot of errors, save time and effort. The au…
The way static analyzers fight against false positives, and why they do it

Date: Mar 20 2017

Author: Andrey Karpov

In my previous article I wrote that I don't like the approach of evaluating the efficiency of static analyzers with the help of synthetic tests. In that article, I give the example of a code fragment…
PVS-Studio ROI

Date: Jan 30 2019

Author: Andrey Karpov

Occasionally, we're asked a question, what monetary value the company will receive from using PVS-Studio. We decided to draw up a response in the form of an article and provide tables, which will sho…
How PVS-Studio Proved to Be More Attentive Than Three and a Half Programmers

Date: Oct 22 2018

Author: Andrey Karpov

Just like other static analyzers, PVS-Studio often produces false positives. What you are about to read is a short story where I'll tell you how PVS-Studio proved, just one more time, to be more atte…
Free PVS-Studio for those who develops open source projects

Date: Dec 22 2018

Author: Andrey Karpov

On the New 2019 year's eve, a PVS-Studio team decided to make a nice gift for all contributors of open-source projects hosted on GitHub, GitLab or Bitbucket. They are given free usage of PVS-Studio s…
Technologies used in the PVS-Studio code analyzer for finding bugs and potential vulnerabilities

Date: Nov 21 2018

Author: Andrey Karpov

A brief description of technologies used in the PVS-Studio tool, which let us effectively detect a large number of error patterns and potential vulnerabilities. The article describes the implementati…
The Evil within the Comparison Functions

Date: May 19 2017

Author: Andrey Karpov

Perhaps, readers remember my article titled "Last line effect". It describes a pattern I've once noticed: in most cases programmers make an error in the last line of similar text blocks. Now I want t…

Comments (0)

Next comments
This website uses cookies and other technology to provide you a more personalized experience. By continuing the view of our web-pages you accept the terms of using these files. If you don't want your personal data to be processed, please, leave this site.
Learn More →
Accept