Andrey Karpov

Nov 13 2017

Tags:

#Cpp

Andrey Karpov considers that code of the Manticore project is better than code of the Sphinx project

Nov 13 2017

Author: Andrey Karpov

Sphinx and Manticore
Comparative Analysis
Common Errors
Unique Errors
Let's Sum up

My readers asked me to compare the projects 'Manticore' and 'Sphinx' in terms of code quality. I can do it only with my proven method by testing projects using PVS-Studio static analyzer and figure out the error density in code. Therefore, I checked the C and C++ code in these projects and, in my opinion, the quality of code in Manticore is higher than the quality of Sphinx code. Surely, this is a very narrow view and I do not claim to be authentic in my research. However, I was asked to do this work, and I did a comparison as I could.

Sphinx and Manticore

First, let's consider Manticore and Sphinx projects.

Sphinx is a full-text search system, developed by Andrew Aksyonoff and distributed under the GNU GPL license. A distinctive feature is the high speed of indexing and search, as well as the integration with existing DBMS and APIs for the common web programming languages.

I took the source code from here. The size of the project, if you take the code in C and C++ and do not include third-party libraries is 156 KLOC. Comments constitute 10.2%. This means that the "clean code" is 144 KLOC.

Manticore Search is a fork of Sphinx. Started by core members of the original Sphinx team, Manticore's goal is to deliver fast, stable and powerful open source full-text search solution.

I took the source code from here. The size of the project, if you take the code in C and C++ and do not include third-party libraries is 170 KLOC. Comments constitute 10.1%. This means that the "clean code" is 152 KLOC.

The number of lines of code in the Manticore project is a bit greater, and I will consider it when estimating the density of found errors.

Comparative Analysis

The code of these projects is very similar, and very often the same error is present in both projects. I must say at once that this time I performed the analysis superficially and studied only general warnings of the High level issued by PVS-Studio analyzer.

Why am I being too lazy to compare projects more carefully? As I have already said, the projects are very similar. I got bored as I started viewing the High level warnings. Anyway, the whole picture is clear. The analyzer has issued a very similar list of warnings, but only in the Sphinx project, there have been slightly more of them. I think, with warnings of other levels, the situation will be exactly the same.

In the article, I will look at just some of the code fragments with errors, which for some reason seemed to be interesting for me. Their developers can run a more detailed analysis of the projects. I am willing to provide them with temporary license keys.

I also would like to suggest readers to download the demo version of PVS-Studio and check the code of your projects. I am sure you will find many interesting things.

Common Errors

I will begin with errors that were found in both Sphinx and Manticore projects.

CWE-476: NULL Pointer Dereference

Expr_StrIn_c ( const CSphAttrLocator & tLoc, int iLocator,
               ConstList_c * pConsts, UservarIntSet_c * pUservar,
               ESphCollation eCollation )
  : Expr_ArgVsConstSet_c<int64_t> ( NULL, pConsts )
  , ExprLocatorTraits_t ( tLoc, iLocator )
  , m_pStrings ( NULL )
  , m_pUservar ( pUservar )
{
  assert ( tLoc.m_iBitOffset>=0 && tLoc.m_iBitCount>0 );
  assert ( !pConsts || !pUservar );

  m_fnStrCmp = GetCollationFn ( eCollation );

  const char * sExpr = pConsts->m_sExpr.cstr();      // <=
  ....
}

I gave a quite large piece of code, but do not worry, everything is simple here. Note the formal argument pConsts. This pointer is used in the constructor to initialize the sExpr variable. Besides that, in the constructor there is no check for this case, if the NULL value is passed as an argument, i.e. there is no protection from the null pointer. pConsts variable just gets dereferenced.

Note. There is a check in the form of an assert, but it will not help in the Release-version, so this check cannot be considered as a sufficient one.

Now let's take a look at the code of the function CreateInNode, where an instance of the Expr_StrIn_c class is created:

ISphExpr * ExprParser_t::CreateInNode ( int iNode )
{
  ....
  case TOK_ATTR_STRING:
    return new Expr_StrIn_c ( tLeft.m_tLocator,
                              tLeft.m_iLocator,
                              NULL,                   // <=
                              pUservar,
                              m_eCollation );
  ....
}

The third actual argument is NULL. Accordingly, if this code fragment is executed, null pointer dereference will occur.

The analyzer signals about this error by issuing a warning: V522 Dereferencing of the null pointer 'pConsts' might take place. The null pointer is passed into 'Expr_StrIn_c' function. Inspect the third argument. Check lines: 5407, 5946. sphinxexpr.cpp 5407

This error is interesting because the PVS-Studio analyzer performs data-flow analysis, considering the bodies of two different functions. However, it is able to perform a far more complicated nested analysis. Let's consider such a case.

We will start with the function SendBytes, in which the null pointer dereference will be performed.

void ISphOutputBuffer::SendBytes ( const void * pBuf, int iLen )
{
  int iOff = m_dBuf.GetLength();
  m_dBuf.Resize ( iOff + iLen );
  memcpy ( m_dBuf.Begin() + iOff, pBuf, iLen );
}

Have a look at the pointer pBuf. It is checked nowhere and is immediately passed as an actual argument to the function memcpy. Accordingly, if the pBuf pointer is null, the data will be read from memory by null pointer inside the memcpy function call.

Why did PVS-Studio decide that there was a mistake? To answer this question, we will go higher along the control flow graph and consider the function SendMysqlOkPacket.

void SendMysqlOkPacket ( ISphOutputBuffer & tOut, BYTE uPacketID,
                         int iAffectedRows=0, int iWarns=0,
                         const char * sMessage=NULL,
                         bool bMoreResults=false )
{
  DWORD iInsert_id = 0;
  char sVarLen[20] = {0};
  void * pBuf = sVarLen;
  pBuf = MysqlPack ( pBuf, iAffectedRows );
  pBuf = MysqlPack ( pBuf, iInsert_id );
  int iLen = (char *) pBuf - sVarLen;

  int iMsgLen = 0;
  if ( sMessage )
    iMsgLen = strlen(sMessage) + 1;

  tOut.SendLSBDword ( (uPacketID<<24) + iLen + iMsgLen + 5);
  tOut.SendByte ( 0 );
  tOut.SendBytes ( sVarLen, iLen );
  if ( iWarns<0 ) iWarns = 0;
  if ( iWarns>65535 ) iWarns = 65535;
  DWORD uWarnStatus = iWarns<<16;
  if ( bMoreResults )
    uWarnStatus |= ( SPH_MYSQL_FLAG_MORE_RESULTS );
  tOut.SendLSBDword ( uWarnStatus );
  tOut.SendBytes ( sMessage, iMsgLen );
}

I am sorry that I had to give the whole body of the function. I just wanted to show that the function does not have any protection in case if the argument sMessage turns out to be equal to NULL. sMessage pointer is simply passed into the function SendBytes.

I also would like to draw your attention to the fact that the value of the formal argument sMessage is NULL by default:

const char * sMessage=NULL,

It is dangerous by itself. However, the fact that the argument is NULL by default means nothing. Perhaps, the correct arguments are always passed to function. Therefore, we will go ahead:

inline void Ok( int iAffectedRows=0, int iWarns=0,
                const char * sMessage=NULL,
                bool bMoreResults=false )
{
  SendMysqlOkPacket ( m_tOut, m_uPacketID, iAffectedRows,
                      iWarns, sMessage, bMoreResults );
  if ( bMoreResults )
    m_uPacketID++;
}

In the Ok function, the argument sMessage is simply passed to the function SendMysqlOkPacket. Let's continue.

void HandleMysqlMultiStmt (....)
{
  ....
  dRows.Ok ( 0, 0, NULL, bMoreResultsFollow );
  ....
}

At this point, we are finishing our trip. Only four actual arguments are passed to the function. The rest of the arguments take value by default. This means that the fifth argument sMessage will be equal to NULL and a null pointer dereference will occur.

PVS-Studio analyzer warning, which points at this error: V522 Dereferencing of the null pointer 'pBuf' might take place. The null pointer is passed into 'Ok' function. Inspect the third argument. Check lines: 2567, 12267, 12424, 14979. searchd.cpp 2567

CWE-570: Expression is Always False

Let's start with consideration of the ESphBinRead enumeration.

enum ESphBinRead
{
  BIN_READ_OK,        ///< bin read ok
  BIN_READ_EOF,       ///< bin end
  BIN_READ_ERROR,     ///< bin read error
  BIN_PRECACHE_OK,    ///< precache ok
  BIN_PRECACHE_ERROR  ///< precache failed
};

As you can see, there are no named constants with negative values.

Just in case, let's look at the function ReadBytes and verify that it truly returns the values without any tricks.

ESphBinRead CSphBin::ReadBytes ( void * pDest, int iBytes )
{
  ....
    return BIN_READ_EOF;
  ....
    return BIN_READ_ERROR;
  ....
  return BIN_READ_OK;
}

As you can see, all returned functions values are greater than or equal to 0. Now it is the time for code with an error:

static void DictReadEntry (....)
{
  ....
  if ( pBin->ReadBytes ( pKeyword, iKeywordLen )<0 )
  {
    assert ( pBin->IsError() );
    return;
  }
  ....
}

PVS-Studio warning: V547 Expression is always false. sphinx.cpp 22416

Such a check has no sense. The condition is always false, and, as a result, the incorrect situations when reading data are not processed. Most likely, the code here is supposed to be as follows:

if ( pBin->ReadBytes ( pKeyword, iKeywordLen ) != BIN_READ_OK)

This code demonstrates that it only seems to the author that the program will handle inappropriate situations. Actually, I meet defects in code that are responsible for the processing of incorrect/non-standard situations very often. Therefore, programs often crash when something goes wrong. The error handlers are just written incorrectly.

Sure, there is no trick why it happens so. It is difficult and uninteresting to test such parts of program. This is one of those cases when a static analyzer might be a great helper, because it checks the code regardless of how often it is executed.

CWE-14: Compiler Removal of Code to Clear Buffers

static bool GetFileStats (....)
{
  ....
  struct_stat tStat;
  memset ( &tStat, 0, sizeof ( tStat ) );
  if ( stat ( szFilename, &tStat ) < 0 )
  {
    if ( pError )
      *pError = strerror ( errno );
    memset ( &tStat, 0, sizeof ( tStat ) );   // <=
    return false;
  }
  ....
}

PVS-Studio warning: V597 The compiler could delete the 'memset' function call, which is used to flush 'tStat' object. The memset_s() function should be used to erase the private data. sphinx.cpp 19987

The compiler may delete the call of memset function, which in case of error occurring in the program, must clear private data in tStat.

I wrote many times why the compiler behaves like this, so I will not repeat myself. For those who has not faced with such situations, I suggest reading the description of the diagnostic V597 or see the description of CWE-14.

CWE-762: Mismatched Memory Management Routines

To begin with, we need to look at the implementation of two macros:

#define SafeDelete(_x) \
  { if (_x) { delete (_x); (_x) = nullptr; } }
#define SafeDeleteArray(_x) \
  { if (_x) { delete [] (_x); (_x) = nullptr; } }

Now, I think you can easily detect the error yourself in this code:

int CSphIndex_VLN::DebugCheck ( FILE * fp )
{
  ....
  CSphRowitem * pInlineStorage = NULL;
  if ( pQword->m_iInlineAttrs )
    pInlineStorage = new CSphRowitem [ pQword->m_iInlineAttrs ];
  ....
  // cleanup
  SafeDelete ( pInlineStorage );
  ....
}

PVS-Studio warning: V611 The memory was allocated using 'new T[]' operator but was released using the 'delete' operator. Consider inspecting this code. It's probably better to use 'delete [] pInlineStorage;'. sphinx.cpp 19178

As you can see, the memory is allocated as for an array, and is deallocated, as if only one item was created. Instead of macro, SafeDelete the macro SafeDeleteArray should be used here.

Unique Errors

I considered a few errors above that reveal themselves both in Manticore and Sphinx code. Of course, there are errors inherent to only one project. For example, let's consider one case.

Both projects have a function RotateIndexMT. However, it is implemented differently. In Sphinx project implementation, this function contains a CWE-690 defect (Unchecked Return Value to NULL Pointer Dereference).

First let's look at the declaration of the function CheckServedEntry:

static bool CheckServedEntry(const ServedIndex_c * pEntry, // <=
                             const char * sIndex,
                             CSphString & sError );

The first argument is a pointer to a constant object. Therefore, the function cannot modify this object and the pointer itself.

Now here is the function, containing an error:

static bool RotateIndexMT ( .... )
{
  ....
  ServedIndex_c * pServed =
    g_pLocalIndexes->GetWlockedEntry ( sIndex );
  pServed->m_sNewPath = "";                            // <=
  if ( !CheckServedEntry ( pServed, sIndex.cstr(), sError ) )
  {
    if ( pServed )                                     // <=
      pServed->Unlock();
    return false;
  }
  ....
}

PVS-Studio warning: V595 The 'pServed' pointer was utilized before it was verified against nullptr. Check lines: 17334, 17337. searchd.cpp 17334

First, the pServed pointer gets dereferenced. Secondly, the function CheckServedEntry is called, which, as we have found out, cannot change the pServed pointer, being passed as the first actual argument.

Then the pServed pointer is verified against NULL. Gotcha! The pointer actually can be null. Therefore, as we can see above, before the first dereferencing, the verification should be added.

Another option: if (pServed) check is not needed, if the pointer is never equal to NULL. In any case, the code must be fixed.

Let's Sum up

Sphinx project is smaller than a Manticore project by size. At the same time, in the Sphinx project I noticed more errors and 'code smell', than in Manticore project.

Taking into account the size of the projects and the number of noticed defects, I got the following result. Let's take the density of errors in Manticore for 1. Then the error density of Sphinx project by my rough estimation is 1.5.

My conclusions. The error density of the Sphinx project in one and a half times higher comparing with the Manticore project. Therefore, code quality of Manticore is better than the Sphinx project's one. Fork turned out better than the original.

Again, that is my subjective opinion based on a very small amount of information. Error density in code of some components does not define the quality and reliability of a project as a whole.

Download and try PVS-Studio. It is simple. In the end, even if you write the perfect code, you can always search for errors in the code of your colleagues :).

Thank you for your attention. Subscribe to Twitter or RSS to be informed about our new publications.

#Cpp