To get a trial key
fill out the form below
Team License (a basic version)
Enterprise License (an extended version)
* By clicking this button you agree to our Privacy Policy statement

Request our prices
New License
License Renewal
--Select currency--
USD
EUR
GBP
RUB
* By clicking this button you agree to our Privacy Policy statement

Free PVS-Studio license for Microsoft MVP specialists
* By clicking this button you agree to our Privacy Policy statement

To get the licence for your open-source project, please fill out this form
* By clicking this button you agree to our Privacy Policy statement

I am interested to try it on the platforms:
* By clicking this button you agree to our Privacy Policy statement

Message submitted.

Your message has been sent. We will email you at


If you haven't received our response, please do the following:
check your Spam/Junk folder and click the "Not Spam" button for our message.
This way, you won't miss messages from our team in the future.

>
>
>
Documenting Bugs in Doxygen

Documenting Bugs in Doxygen

Aug 18 2015

In this article, we will speak about the static analysis of the doxygen documentation generator tool. This popular and widely used project, which, as its authors claim, not without reason, has become "the de facto standard tool for generating documentation from annotated C++ sources", has never been scanned by PVS-Studio before. Doxygen scans the program source code and generates the documentation relying on it. Now it's time for us to peep into its source files and see if PVS-Studio can find any interesting bugs there.

0345_Doxygen/image1.png

Introduction

Doxygen is a crossplatform documentation generator tool for writing software reference documentation, supporting multiple programming languages: C++, C, Objective-C, Python, Java, C#, PHP, IDL, Fortran, VHDL, and to some extent D. Doxygen extracts documentation directly from annotated sources and can be also configured to extract the code structure from undocumented source files. The tool supports the HTML, LATEX, man, rtf, and xml formats as its output. Doxygen is used in the projects KDE, Mozilla, Drupal, Pidgin, AbiWorld, FOX toolkit, Torque Game Engine, and Crystal Space.

Preparing for and running the analysis

The latest doxygen source files can be downloaded from github.com/doxygen/doxygen. The repository doesn't originally contain the Visual Studio project files, but since the developers use cmake, you can easily generate them by yourself. I used the program's console version and the "cmake -G "Visual Studio 12"" command to generate a VS 2013 project file. To start the analysis, you just need to click on the Check Solution button in the PVS-Studio tab in Visual Studio.

Discussing diagnostic messages

Before we start talking about the diagnostic messages (warnings) themselves, I'd like to draw your attention to doxygen's coding style. For some reason, the programmer would very often try to fit the code in one line, neglecting spaces between variables and operators, which made the code much less comprehensible. Some fragments had really strange formatting. And sometimes I even came across things like this. I had to format some of the code samples to fit them in the article. That's been said, let's go on to see what interesting bugs PVS-Studio has managed to find in doxygen.

PVS-Studio's diagnostic message: V519 The '* outListType1' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 8326, 8327. util.cpp 8327

void convertProtectionLevel(MemberListType inListType,
                            int *outListType1,
                            int *outListType2)
{
  static bool extractPrivate;
  ....
  switch (inListType)
  {
  ....
  case MemberListType_priSlots:
    if (extractPrivate)
    {
      *outListType1=MemberListType_pubSlots;
      *outListType1=MemberListType_proSlots;      <<<<====
    }
    else
    {
      *outListType1=-1;
      *outListType2=-1;
    }
    break;
  ....
  }
}

In the if statement body, one and the same variable is assigned two values on end. This is surely either a typo or an unfixed copy-pasted line. The else block suggests that the "MemberListType_proSlots" value must be written into "*outListType2". Another error of this kind can be found here: doxygen.cpp 5742 (see the variable 'da->type').

The next warning: V519 The 'pageTitle' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 970, 971. vhdldocgen.cpp 971

QCString VhdlDocGen::getClassTitle(const ClassDef *cd)
{
  QCString pageTitle;
  if (cd == 0) 
    return "";
  pageTitle += cd->displayName();
  pageTitle = VhdlDocGen::getClassName(cd);
  ....
}

Note the assignment operation. This is most likely a typo, and "+=" should be used instead of "=". Speaking about the coding style, there were no spaces between the operators and values in the source code, which made it much harder to read. And that, in its turn, left much more chances for an error to appear as you can't easily spot a missing "+" in an uninterrupted stream of characters. Adding the spaces makes the bug more visible. Another similar error is hidden in the following line:

V519 The 'nn' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 2166, 2167. vhdldocgen.cpp 2167

Passing on to the next message.

PVS-Studio's diagnostic message: V523 The 'then' statement is equivalent to the 'else' statement. docparser.cpp 521

static void checkUndocumentedParams()
{
  ....
  if (g_memberDef->inheritsDocsFrom())
  {
    warn_doc_error(g_memberDef->getDefFileName(),
                   g_memberDef->getDefLine(),
                   substitute(errMsg,"%","%%"));
   }
  else
  {
    warn_doc_error(g_memberDef->getDefFileName(),
                   g_memberDef->getDefLine(),
                   substitute(errMsg,"%","%%"));
  }
  ....
}

The copy-paste programming technique can not only help you save time on writing the code but bring some bugs into it as well. In the sample above, a code line was copied from the if block into the else block but wasn't fixed after the insertion. Every time you use copy-paste, please remember to stick to the rule "Copy once, check thrice".

PVS-Studio's diagnostic message: V523 The 'then' statement is equivalent to the 'else' statement. translator_tw.h 769

class TranslatorChinesetraditional : public Translator
{
public:
  ....
  virtual QCString trGeneratedFromFiles(bool single, ....)
  { 
  ....
  QCString result=(QCString)"?";
  ....
  if (single) result+=":"; else result+=":";
  ....
  }
....
}

Here's another issue similar to the previous one. In the if block, regardless of the condition, one and the same character is added to the result string. I strongly doubt that's what the programmer really intended because the condition itself would otherwise have been meaningless. Again, had this block been split into 4 lines, following the common style, it would not only have looked much neater but made the typo more prominent, too. Interestingly, this construct was copied two times more for further use in functions, the programmer never noticing the bug. So, we've got two more warnings of this kind:

  • V523 The 'then' statement is equivalent to the 'else' statement. translator_tw.h 1956
  • V523 The 'then' statement is equivalent to the 'else' statement. translator_tw.h 1965

PVS-Studio's diagnostic message: V530 The return value of function 'toupper' is required to be utilized. classdef.cpp 1963

void ClassDef::writeDocumentationContents(....)
{
  QCString pageType = " ";
  pageType += compoundTypeString();
  toupper(pageType.at(1));
  ....
}

In this sample, the programmer misunderstood the principle of the toupper function. Perhaps he or she was expecting the function to change the character passed into it to a capital letter. But the function doesn't actually change the character argument, it only returns its capital version. This is how the toupper function is declared in the "ctype.h" header:

int toupper (int __c);

As you can see from the declaration, the argument is received by value, therefore the character passed into the function can't be changed. To avoid errors like this, carefully read the description of the functions you use if you are not much sure about their behavior.

PVS-Studio's diagnostic message: V560 A part of conditional expression is always false: (flags() &!0x0008). qfile_win32.cpp 267

#define IO_Truncate    0x0008  

bool QFile::open(....)
{
  ....
  int length = INT_MAX;
  if ((flags() & !IO_Truncate) && length == 0 && isReadable())
  ....
}

This condition will always be false because inversion of a non-zero value always results in zero. The logical "AND" used after that makes no sense when one of its arguments is zero. As a result, the condition doesn't depend on other parameters. It would be more logical to use the bitwise inversion operator '~' here.

PVS-Studio's diagnostic message: V560 A part of conditional expression is always true: !found. util.cpp 4264

bool getDefs(....)
{
  ....
  bool found=FALSE;
  MemberListIterator mmli(*mn);
  MemberDef *mmd;
  for (mmli.toFirst();((mmd=mmli.current()) && !found);++mmli)
  {
    ....
  }
  ....
}

I'll tell you right off that the found variable doesn't change in the body of the for loop. Because of that, the loop termination condition depends solely on the mmli.current method's result. What's dangerous about this error is that the loop will run from beginning to end all the time regardless of whether or not the required value has been found.

PVS-Studio's diagnostic message: V595 The 'bfd' pointer was utilized before it was verified against nullptr. Check lines: 3371, 3384. dot.cpp 3371

void DotInclDepGraph::buildGraph(....)
{
  ....
  FileDef *bfd = ii->fileDef;
  QCString url="";
  ....
  url=bfd->getSourceFileBase();
  ....
  if (bfd)
  ....    
}

V595 is probably the most frequent warning among all the projects we check. It's just that you don't always think before using a pointer if it can be null, and only remember to make a check after using it a couple of times. But there may be a large bulk of code between the check and the first time the pointer is dereferenced, which makes the error pretty hard to detect. Other warnings of this kind:

  • V595 The 'cd' pointer was utilized before it was verified against nullptr. Check lines: 6123, 6131. doxygen.cpp 6123
  • V595 The 'p' pointer was utilized before it was verified against nullptr. Check lines: 1069, 1070. htmldocvisitor.cpp 1069
  • V595 The 'Doxygen::mainPage' pointer was utilized before it was verified against nullptr. Check lines: 3792, 3798. index.cpp 3792
  • V595 The 'firstMd' pointer was utilized before it was verified against nullptr. Check lines: 80, 93. membergroup.cpp 80
  • V595 The 'lastCompound' pointer was utilized before it was verified against nullptr. Check lines: 410, 420. vhdljjparser.cpp 410
  • V595 The 'len' pointer was utilized before it was verified against nullptr. Check lines: 11960, 11969. qstring.cpp 11960
  • V595 The 'len' pointer was utilized before it was verified against nullptr. Check lines: 11979, 11988. qstring.cpp 11979
  • V595 The 'fd' pointer was utilized before it was verified against nullptr. Check lines: 2077, 2085. doxygen.cpp 2077

PVS-Studio's diagnostic message: V595 The 'lne' pointer was utilized before it was verified against nullptr. Check lines: 4078, 4089. index.cpp 4078

static void writeIndexHierarchyEntries(OutputList &ol, ....)
{
  QListIterator<LayoutNavEntry> li(entries);
  LayoutNavEntry *lne;
  for (li.toFirst();(lne=li.current());++li)
  {
    LayoutNavEntry::Kind kind = lne->kind();
    ....
    bool addToIndex=lne==0 || lne->visible();
    ....
  }
}

I don't usually describe similar warnings because it feels boring. But today I want to discuss one more instance of the V595 message. This time, the loop is only entered if the returned value li.current() (assigned to the Ine pointer) is not equal to NULL. It means that the pointer is guaranteed to be non-null when used inside the loop, which makes the check just not necessary. I felt I should mention this example because the V595 warning generally deals with potential null pointer dereferencing operations, while in this particular case, it revealed an excessive check.

PVS-Studio's diagnostic message: V601 The bool type is implicitly cast to the class type. docsets.cpp 473

struct IncludeInfo
{
  ....
  bool local;
};

void DocSets::addIndexItem(Definition *context,MemberDef *md,
                           const char *,const char *)
{
  QCString decl;
  ....
  IncludeInfo *ii = cd->includeInfo();
  ....
  decl=ii->local;
  ....
}

The analyzer has noticed a strange conversion of bool to the class type. The QCString class lacks an overloaded assignment operator for a bool argument but it does have a constructor with the input parameter of the int type denoting the string length. It is this constructor which is called to create a temporary object when executing this assignment. The compiler will find the constructor with the int-argument and call it, the bool type cast to int in advance. The local variable can only have 2 values: true or false, which corresponds to 1 and 0. The constructor will create a one-character string in the first case and an empty string in the second. In the end, the assignment operator with the argument of the CQString type will be called. A similar yet less evident conversion takes place in the following fragments:

  • V601 The bool type is implicitly cast to the class type. Inspect the fifth argument. context.cpp 2315
  • V601 The bool type is implicitly cast to the class type. Inspect the fifth argument. context.cpp 2675
  • V601 The bool type is implicitly cast to the class type. Inspect the fifth argument. context.cpp 4456

PVS-Studio's diagnostic message: V614 Potentially uninitialized pointer 't' used. vhdlparser.cc 4127

QCString VhdlParser::extended_identifier()
{
  Token *t;
  if (!hasError)
    t = jj_consume_token(EXTENDED_CHARACTER);
  return t->image.c_str();
  assert(false);
}

In this code fragment, an uninitialized pointer may be dereferenced. The original code is poorly formatted, which only makes this bug less visible. I have formatted this code for the article, and it has become much more prominent. Two more bugs of this kind can be found in the following lines:

  • V614 Potentially uninitialized pointer 'tmpEntry' used. vhdlparser.cc 4451
  • V614 Potentially uninitialized pointer 't' used. vhdlparser.cc 5304

PVS-Studio's diagnostic message: V668 There is no sense in testing the 'file' pointer against null, as the memory was allocated using the 'new' operator. The exception will be generated in the case of memory allocation error. outputgen.cpp 47

void OutputGenerator::startPlainFile(const char *name)
{
  ....
  file = new QFile(fileName);
  if (!file)
  ....
}

It's no secret for anyone nowadays that the new operator throws an exception instead of returning nullptr when it fails to allocate memory. The code sample above is kind of a relic from the programming past. Checks like those don't make any sense for modern compilers anymore and can be removed. 3 more checks of this kind:

  • V668 There is no sense in testing the 'expr' pointer against null, as the memory was allocated using the 'new' operator. The exception will be generated in the case of memory allocation error. template.cpp 1981
  • V668 There is no sense in testing the 'n' pointer against null, as the memory was allocated using the 'new' operator. The exception will be generated in the case of memory allocation error. qglist.cpp 1005
  • V668 There is no sense in testing the 'nd' pointer against null, as the memory was allocated using the 'new' operator. The exception will be generated in the case of memory allocation error. qstring.cpp 12099

PVS-Studio's diagnostic message: V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'd' is lost. Consider assigning realloc() to a temporary pointer. qcstring.h 396

class BufStr 
{
public:
  ....
  void resize(uint newlen)
  {
    ....
    m_buf = (char *)realloc(m_buf,m_size);
    ....
  }
private:
  uint m_size;
  char *m_buf;
  ....
}

The analyzer has detected an incorrect use of the "realloc". When failed to allocate memory, "realloc" will return nullptr, rewriting the previous pointer value. To avoid this, we recommend storing the pointer value in a temporary variable before using "realloc". In addition to this one, the analyzer detected a total of 8 similar potential memory leaks:

  • V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'd' is lost. Consider assigning realloc() to a temporary pointer. qcstring.h 396
  • V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'str' is lost. Consider assigning realloc() to a temporary pointer. growbuf.h 16
  • V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'str' is lost. Consider assigning realloc() to a temporary pointer. growbuf.h 23
  • V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'str' is lost. Consider assigning realloc() to a temporary pointer. growbuf.h 33
  • V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'm_str' is lost. Consider assigning realloc() to a temporary pointer. vhdlstring.h 61
  • V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'shd->data' is lost. Consider assigning realloc() to a temporary pointer. qgarray.cpp 224
  • V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'm_data' is lost. Consider assigning realloc() to a temporary pointer. qgstring.cpp 114
  • V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'm_data' is lost. Consider assigning realloc() to a temporary pointer. qgstring.cpp 145

Conclusion

To sum it up, I'd say the analyzer has done very well. Despite doxygen being a popular and widely used (by both small and large companies) tool, PVS-Studio still has managed to find lots of suspicious fragments in it. I have only discussed the most basic warnings and skipped such dull defects as excessive checks, unused variables, and the like. As I already said in the beginning, I was surprised by the, as I believe, quite careless code formatting in certain fragments.

I wish you neat, clear code and as few bugs as possible. While the former depends solely on the programmer, the analyzer will help you with the latter. You can download and try PVS-Studio from here: /en/pvs-studio/download/

Popular related articles
The way static analyzers fight against false positives, and why they do it

Date: Mar 20 2017

Author: Andrey Karpov

In my previous article I wrote that I don't like the approach of evaluating the efficiency of static analyzers with the help of synthetic tests. In that article, I give the example of a code fragment…
PVS-Studio for Java

Date: Jan 17 2019

Author: Andrey Karpov

In the seventh version of the PVS-Studio static analyzer, we added support of the Java language. It's time for a brief story of how we've started making support of the Java language, how far we've co…
Free PVS-Studio for those who develops open source projects

Date: Dec 22 2018

Author: Andrey Karpov

On the New 2019 year's eve, a PVS-Studio team decided to make a nice gift for all contributors of open-source projects hosted on GitHub, GitLab or Bitbucket. They are given free usage of PVS-Studio s…
PVS-Studio ROI

Date: Jan 30 2019

Author: Andrey Karpov

Occasionally, we're asked a question, what monetary value the company will receive from using PVS-Studio. We decided to draw up a response in the form of an article and provide tables, which will sho…
Technologies used in the PVS-Studio code analyzer for finding bugs and potential vulnerabilities

Date: Nov 21 2018

Author: Andrey Karpov

A brief description of technologies used in the PVS-Studio tool, which let us effectively detect a large number of error patterns and potential vulnerabilities. The article describes the implementati…
The Last Line Effect

Date: May 31 2014

Author: Andrey Karpov

I have studied many errors caused by the use of the Copy-Paste method, and can assure you that programmers most often tend to make mistakes in the last fragment of a homogeneous code block. I have ne…
Characteristics of PVS-Studio Analyzer by the Example of EFL Core Libraries, 10-15% of False Positives

Date: Jul 31 2017

Author: Andrey Karpov

After I wrote quite a big article about the analysis of the Tizen OS code, I received a large number of questions concerning the percentage of false positives and the density of errors (how many erro…
The Evil within the Comparison Functions

Date: May 19 2017

Author: Andrey Karpov

Perhaps, readers remember my article titled "Last line effect". It describes a pattern I've once noticed: in most cases programmers make an error in the last line of similar text blocks. Now I want t…
Static analysis as part of the development process in Unreal Engine

Date: Jun 27 2017

Author: Andrey Karpov

Unreal Engine continues to develop as new code is added and previously written code is changed. What is the inevitable consequence of ongoing development in a project? The emergence of new bugs in th…
Appreciate Static Code Analysis!

Date: Oct 16 2017

Author: Andrey Karpov

I am really astonished by the capabilities of static code analysis even though I am one of the developers of PVS-Studio analyzer myself. The tool surprised me the other day as it turned out to be sma…

Comments (0)

Next comments
This website uses cookies and other technology to provide you a more personalized experience. By continuing the view of our web-pages you accept the terms of using these files. If you don't want your personal data to be processed, please, leave this site.
Learn More →
Accept