NetXMS is a software product designed to monitor computer systems and networks. It can be used to monitor an entire IT infrastructure, from SNMP-compatible devices to server software. Naturally, I am going to monitor the code of this project with the PVS-Studio analyzer.
NetXMS is an open-source project distributed under the GNU General Public License v2. The code is written in C, C++ and Java.
The project depends on a number of third-party libraries. To be honest, I felt too lazy to download some of them to get the project built, so it was only partially checked. That doesn't prevent me from writing this post, though: my analysis is superficial anyway. It will be much better if the project's authors check it themselves. They are welcome to write to our support service: I will generate a temporary registration key for the PVS-Studio analyzer so that they can analyze the project more thoroughly.
In the articles describing checks of open-source projects, I tend to get carried away with citing general errors. But 64-bit errors have not disappeared; they can be found everywhere. They are just not that interesting to discuss. When you show a null pointer dereference, the bug is obvious. When you say that a 32-bit variable can overflow in a 64-bit application, it's not that interesting. A particular combination of circumstances must occur for such an error to show up, so you have to speak of it as a "potential error".
Moreover, 64-bit bugs are much more difficult to detect. The rule set designed for 64-bit error detection produces a whole lot of false positives. The analyzer doesn't know the permissible range of input values and flags everything it finds at least a bit suspicious. To find the really dangerous fragments, you have to review a lot of messages; this is the only way to make sure that the program has been correctly ported to the 64-bit platform. This is especially true for applications that use more than 4 Gbytes of memory.
So, to be brief, writing articles about catching common bugs is much easier than writing about catching 64-bit ones. But this time I overcame my laziness and found several dangerous fragments of that kind. Let's start with them.
BOOL SortItems(...., _In_ DWORD_PTR dwData);
void CLastValuesView::OnListViewColumnClick(....)
{
  ....
  m_wndListCtrl.SortItems(CompareItems, (DWORD)this);
  ....
}
V220 Suspicious sequence of types castings: memsize -> 32-bit integer -> memsize. The value being casted: 'this'. lastvaluesview.cpp 716
Earlier, in 32-bit systems, the pointer's size was 4 bytes. When you needed to save or pass a pointer as an integer type, you used the types DWORD, UINT and so on. In 64-bit systems the pointer's size has grown to 8 bytes. To store them in integer variables the types DWORD_PTR, UINT_PTR and some others were created. Function interfaces have changed accordingly. Note the way the SortItems() function is declared in the first line of the sample.
Unfortunately, the program still contains a conversion of a pointer to the 32-bit DWORD type. The program compiles successfully. The pointer is explicitly cast to the 32-bit DWORD type and then implicitly extended to DWORD_PTR. The worst thing is that the program works well in most cases.
It will work as long as the instances of the CLastValuesView class are created within the low 4 Gbytes of memory, that is, almost always. But it might happen that the program needs more memory, or that memory gets fragmented after a long run. The object will then be created above the 4-Gbyte boundary, and the error will reveal itself. The pointer will lose its 32 high-order bits, and the program's behavior will become undefined.
The bug is very easy to fix:
m_wndListCtrl.SortItems(CompareItems, (DWORD_PTR)this);
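To make the loss of the high-order bits tangible, here is a minimal sketch that does not depend on Windows headers. The aliases Dword and DwordPtr and both helper functions are my own stand-ins (assumptions, not NetXMS code): they model DWORD as a 32-bit integer and DWORD_PTR as it looks in a Win64 build.

```cpp
#include <cassert>
#include <cstdint>

// Portable stand-ins for the Windows typedefs (assumptions for this sketch):
// DWORD is 32 bits everywhere, DWORD_PTR is 64 bits in a Win64 build.
using Dword    = std::uint32_t;
using DwordPtr = std::uint64_t;

// The buggy path: address -> 32-bit integer -> pointer-sized integer.
inline DwordPtr throughDword(DwordPtr address)
{
    return static_cast<DwordPtr>(static_cast<Dword>(address));
}

// True when the address comes back unchanged from the 32-bit detour.
inline bool survivesDwordCast(DwordPtr address)
{
    return throughDword(address) == address;
}
```

An address below the 4-Gbyte boundary passes through the detour unchanged, which is exactly why the bug stays hidden for so long; an address above the boundary comes back with its upper half zeroed.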
There are some other fragments with similar type conversions:
Each of these is a very sly bug; they are often very hard to reproduce. As a result, you get VERY RARE crashes after a long run.
The next error seems to be not that critical. A poorly calculated hash code, however, can cause search algorithms to slow down.
static int hash_void_ptr(void *ptr)
{
  int hash;
  int i;
  /* I took this hash function just off the top of my head,
     I have no idea whether it is bad or very bad. */
  hash = 0;
  for (i = 0; i < (int)sizeof(ptr)*8 / TABLE_BITS; i++)
  {
    hash ^= (unsigned long)ptr >> i*8;
    hash += i * 17;
    hash &= TABLE_MASK;
  }
  return hash;
}
V205 Explicit conversion of pointer type to 32-bit integer type: (unsigned long) ptr xmalloc.c 85
The author writes in the comment that he is not sure if the function works well. And he's right. At the very least, there is a bug in casting the pointer to the 'unsigned long' type.
The data models used in 64-bit Windows and Linux systems are different. Linux uses the LP64 data model, in which the 'long' type is 64 bits wide. Thus, this code will work as intended under Linux.
Win64 uses the LLP64 data model, in which the 'unsigned long' type is 32 bits wide. As a result, the high-order part of the pointer gets lost, and the hash is calculated not that well.
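Here is a sketch of how the same folding scheme could be written so that it behaves identically under both data models: the pointer is converted to uintptr_t, which is pointer-sized by definition. The constants kTableBits and kTableMask are hypothetical values of my own; the real TABLE_BITS and TABLE_MASK from xmalloc.c are not shown here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical table parameters chosen for this sketch only.
constexpr unsigned kTableBits = 8;
constexpr std::uintptr_t kTableMask = (std::uintptr_t{1} << kTableBits) - 1;

// Folds all bytes of a pointer-sized integer into a table index.
inline unsigned hashPointerBits(std::uintptr_t bits)
{
    std::uintptr_t hash = 0;
    const unsigned steps =
        static_cast<unsigned>(sizeof(bits) * 8 / kTableBits);
    for (unsigned i = 0; i < steps; ++i)
    {
        hash ^= bits >> (i * kTableBits);  // shift stays below the type width
        hash += i * 17u;
        hash &= kTableMask;
    }
    return static_cast<unsigned>(hash);
}

// uintptr_t is pointer-sized under both LP64 and LLP64, so no bits are lost.
inline unsigned hashPointer(const void *ptr)
{
    return hashPointerBits(reinterpret_cast<std::uintptr_t>(ptr));
}
```

Whether the hash itself is good is a separate question; the sketch only makes sure that every bit of the pointer takes part in it.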
Explicit type conversions are not the only cause of 64-bit errors. But errors of this type are much easier to detect, for me as well. So let's have a look at one more poor type conversion.
static int ipfix_print_newmsg(....)
{
  ....
  strftime(timebuf, 40, "%Y-%m-%d %T %Z",
    localtime( (const time_t *) &(hdr->u.nf9.unixtime) ));
  ....
}
V114 Dangerous explicit type pointer conversion: (const time_t *) & (hdr->u.nf9.unixtime) ipfix_print.c 68
This is how the structure member 'unixtime' is declared:
uint32_t unixtime; /* seconds since 1970 */
And this is how the type 'time_t' is declared:
#ifdef _USE_32BIT_TIME_T
typedef __time32_t time_t;
#else
typedef __time64_t time_t;
#endif
As far as I can tell, the _USE_32BIT_TIME_T macro is not defined anywhere in the project; at least, I didn't manage to find it. It means that the localtime() function expects a pointer to a 64-bit time value, while our sample passes it the address of a 32-bit variable. That's no good: localtime() will read 8 bytes where only 4 hold the timestamp, so it will be handling garbage.
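A safe way to hand a 32-bit timestamp to localtime() is to copy it into a real time_t object first and pass the address of that object. The sketch below shows the idea; the helper names toTimeT and formatUnixTime are hypothetical, and I use "%H:%M:%S" instead of "%T" only to stay on the safe side with older C runtimes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ctime>

// Widens the 32-bit value into a genuine time_t (hypothetical helper).
inline std::time_t toTimeT(std::uint32_t unixSeconds)
{
    return static_cast<std::time_t>(unixSeconds);
}

// Formats the timestamp into buf; returns the number of characters written
// or 0 on failure.
inline std::size_t formatUnixTime(std::uint32_t unixSeconds,
                                  char *buf, std::size_t bufSize)
{
    const std::time_t t = toTimeT(unixSeconds);  // properly sized local copy
    const std::tm *local = std::localtime(&t);   // pointer to a real time_t
    if (local == nullptr)
        return 0;
    return std::strftime(buf, bufSize, "%Y-%m-%d %H:%M:%S", local);
}
```

The copy costs nothing, and the code no longer depends on how wide time_t happens to be on a given platform.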
I suppose the readers can see now why I'm not fond of writing about 64-bit errors. They are too plain and unconvincing. I don't feel like going on to search for other samples to show you at all. Let's instead study some general bugs. They look much more impressive and dangerous.
Nevertheless, 64-bit errors still exist, and if you care about the quality of your 64-bit code, I advise you to keep the viva64 diagnostic rule set at hand. These errors will stay hidden for a longer time than common bugs. To scare you properly, I recommend the following reading for the night:
In Linux, the SOCKET type is declared as a signed type. In Windows, this type is unsigned:
typedef UINT_PTR SOCKET;
This difference often causes bugs in Windows programs.
static int DoRadiusAuth(....)
{
  SOCKET sockfd;
  ....
  // Open a socket.
  sockfd = socket(AF_INET, SOCK_DGRAM, 0);
  if (sockfd < 0)
  {
    DbgPrintf(3, _T("RADIUS: Cannot create socket"));
    pairfree(req);
    return 5;
  }
  ....
}
V547 Expression 'sockfd < 0' is always false. Unsigned type value is never < 0. radius.cpp 682
The 'sockfd' variable is of the UINT_PTR type. As a result, the 'sockfd < 0' condition never holds when the program runs under Windows, and the program will try in vain to work with a socket that has not been opened.
You should fight your laziness and use the special constants. The socket() function reports failure by returning INVALID_SOCKET, so this is what the code should look like:
if (sockfd == INVALID_SOCKET)
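If the same code has to build on both platforms, the check can be hidden behind a tiny helper. This is only a sketch under my own assumptions: the names socket_handle and isValidSocket are hypothetical and are not taken from NetXMS. On Windows it relies on INVALID_SOCKET, the value Winsock documents as the failure result of socket(); elsewhere it relies on socket() returning -1.

```cpp
#include <cassert>

#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET socket_handle;              // unsigned, UINT_PTR underneath
inline bool isValidSocket(socket_handle s)
{
    return s != INVALID_SOCKET;            // documented failure value
}
#else
typedef int socket_handle;                 // plain signed descriptor
inline bool isValidSocket(socket_handle s)
{
    return s >= 0;                         // socket() returns -1 on failure
}
#endif
```

With such a helper the call site reads `if (!isValidSocket(sockfd))` and no longer cares about the signedness of the handle.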
Similar incorrect checks can be found in the following fragments:
int ipfix_snprint_string(....)
{
  size_t i;
  uint8_t *in = (uint8_t*) data;
  for( i=len-1; i>=0; i-- ) {
    if ( in[i] == '\0' ) {
      return snprintf( str, size, "%s", in );
    }
  }
  ....
}
V547 Expression 'i >= 0' is always true. Unsigned type value is always >= 0. ipfix.c 488
The 'i' variable has the size_t type, so the check "i>=0" is pointless. If no zero byte is found in the buffer, 'i' wraps around from 0 to the largest size_t value, and the function starts reading memory far outside the array's boundaries. The consequences may be very diverse.
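For the record, a backward loop over an unsigned index can be written so that it terminates correctly. The sketch below uses the common "i-- > 0" idiom; the function findLastZero is a hypothetical illustration of mine, not a proposed patch for ipfix.c.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scans the buffer from the end and returns the index of the last zero byte,
// or len if the buffer contains no zero byte at all.
inline std::size_t findLastZero(const std::uint8_t *in, std::size_t len)
{
    // The comparison happens before the decrement, so the body sees
    // len-1, len-2, ..., 0 and the loop stops without relying on i < 0.
    for (std::size_t i = len; i-- > 0; )
    {
        if (in[i] == 0)
            return i;
    }
    return len;
}
```

The loop also behaves sensibly for an empty buffer, while the original "i=len-1" would already wrap around when len is zero.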
bool CatalystDriver::isDeviceSupported(....)
{
  DWORD value = 0;
  if (SnmpGet(snmp->getSnmpVersion(), snmp,
              _T(".1.3.6.1.4.1.9.5.1.2.14.0"),
              NULL, 0, &value, sizeof(DWORD), 0)
      != SNMP_ERR_SUCCESS)
    return false;
  // Catalyst 3550 can return 0 as number of slots
  return value >= 0;
}
V547 Expression 'value >= 0' is always true. Unsigned type value is always >= 0. catalyst.cpp 71
One of the most common error patterns is confusion of WCHAR strings' sizes. You can find quite a number of examples in our bug database.
typedef WCHAR TCHAR, *PTCHAR;
static BOOL MatchProcess(....)
{
  ....
  TCHAR commandLine[MAX_PATH];
  ....
  memset(commandLine, 0, MAX_PATH);
  ....
}
V512 A call of the 'memset' function will lead to underflow of the buffer 'commandLine'. procinfo.cpp 278
The TCHAR type is expanded into the WCHAR type. The number of characters in the array 'commandLine' equals MAX_PATH, so the size of the array is 'MAX_PATH * sizeof(TCHAR)' bytes. The 'memset' function works with bytes, so only part of the buffer gets cleared. The correct call should look like this:
memset(commandLine, 0, MAX_PATH * sizeof(TCHAR));
An even better way is to make it like this:
memset(commandLine, 0, sizeof(commandLine));
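The difference between the two byte counts is easy to measure. In the sketch below I use plain wchar_t instead of TCHAR and a hypothetical constant kMaxTitle, so that it builds on any platform; the function fills a wide buffer, clears it one way or the other, and counts the elements that stayed dirty.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

constexpr std::size_t kMaxTitle = 64;  // element count, hypothetical value

// Returns how many elements are still non-zero after the clearing call.
inline std::size_t nonZeroAfterClear(bool useSizeof)
{
    wchar_t title[kMaxTitle];
    for (wchar_t &ch : title)
        ch = L'x';

    if (useSizeof)
        std::memset(title, 0, sizeof(title));  // size of the array in bytes
    else
        std::memset(title, 0, kMaxTitle);      // bug: element count as bytes

    std::size_t left = 0;
    for (wchar_t ch : title)
        if (ch != 0)
            ++left;
    return left;
}
```

With a 2-byte wchar_t the buggy call leaves half of the buffer untouched; with a 4-byte wchar_t it leaves three quarters.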
The CToolBox class is sick in the same way:
typedef WCHAR TCHAR, *PTCHAR;
#define MAX_TOOLBOX_TITLE 64
TCHAR m_szTitle[MAX_TOOLBOX_TITLE];
CToolBox::CToolBox()
{
  memset(m_szTitle, 0, MAX_TOOLBOX_TITLE);
}
V512 A call of the 'memset' function will lead to underflow of the buffer 'm_szTitle'. toolbox.cpp 28
In the findIpAddress() function, a null pointer may get dereferenced. The reason is a copied-and-pasted line.
void ClientSession::findIpAddress(CSCPMessage *request)
{
  ....
  if (subnet != NULL)
  {
    debugPrintf(5, _T("findIpAddress(%s): found subnet %s"),
                ipAddrText, subnet->Name());
    found = subnet->findMacAddress(ipAddr, macAddr);
  }
  else
  {
    debugPrintf(5, _T("findIpAddress(%s): subnet not found"),
                ipAddrText, subnet->Name());
  }
  ....
}
V522 Dereferencing of the null pointer 'subnet' might take place. session.cpp 10823
The call of the debugPrintf() function was obviously copied. But the call in the 'else' branch is incorrect. The pointer 'subnet' equals NULL. It means that you cannot write "subnet->Name()".
#define CF_AUTO_UNBIND 0x00000002
bool isAutoUnbindEnabled()
{
  return ((m_flags & (CF_AUTO_UNBIND | CF_AUTO_UNBIND)) ==
          (CF_AUTO_UNBIND | CF_AUTO_UNBIND)) ? true : false;
}
V578 An odd bitwise operation detected: m_flags & (0x00000002 | 0x00000002). Consider verifying it. nms_objects.h 1410
The expression (CF_AUTO_UNBIND | CF_AUTO_UNBIND) is very strange. It seems that two different constants should be used here.
void I_SHA1Final(....)
{
  unsigned char finalcount[8];
  ....
  memset(finalcount, 0, 8);
  SHA1Transform(context->state, context->buffer);
}
V597 The compiler could delete the 'memset' function call, which is used to flush 'finalcount' buffer. The RtlSecureZeroMemory() function should be used to erase the private data. sha1.cpp 233
In functions related to cryptography, it is an accepted practice to clear temporary buffers. If you don't do that, consequences may be interesting: for instance, a fragment of classified information may be unintentionally sent to the network. Read the article "Overwriting memory - why?" to find out the details.
The memset() function is often used to clear memory. That is unreliable here: if the array is not used after being cleared, the compiler may remove the memset() call as an optimization. To prevent this, you should use the RtlSecureZeroMemory() function.
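RtlSecureZeroMemory() is available only on Windows. Where it is missing, a widely used approximation is to write the zeros through a pointer to volatile, which the compiler has to treat as observable accesses. The sketch below shows that technique; the name secureZero is hypothetical, and I present it as a common workaround rather than something the language standard formally guarantees for every compiler.

```cpp
#include <cassert>
#include <cstddef>

// Clears the buffer byte by byte through a volatile pointer so that the
// stores are not treated as dead code by the optimizer.
inline void secureZero(void *buffer, std::size_t size)
{
    volatile unsigned char *p = static_cast<volatile unsigned char *>(buffer);
    while (size--)
        *p++ = 0;
}
```

In I_SHA1Final() such a helper would replace the memset(finalcount, 0, 8) call.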
Many programmers are convinced that the use of uninitialized variables is the most annoying and frequent bug. Judging by my experience of checking various projects, I don't believe it's true. This bug is discussed a lot in books and articles. Thanks to that, everybody knows what uninitialized variables are, why they are dangerous, how to avoid them and how to find them. Personally, I feel that many more errors are caused by, say, Copy-Paste. But, of course, it doesn't mean that uninitialized variables are defeated. Here they are.
int OdbcDisconnect(void* pvSqlCtx)
{
  ....
  SQLRETURN nSqlRet;
  ....
  if (nRet == SUCCESS)
  {
    ....
    nSqlRet = SQLDisconnect(pSqlCtx->hDbc);
    ....
  }
  if (SQLRET_FAIL(nSqlRet))
  ....
}
V614 Potentially uninitialized variable 'nSqlRet' used. odbcsapi.cpp 220
The nSqlRet variable is initialized only if we get into the body of the 'if' operator, but it is checked afterwards unconditionally. As a result, when nRet is not SUCCESS, the check reads an indeterminate value.
Here are some other places where variables may not always be initialized:
Very often, refactoring leaves a pointer check after the point in the program text where the pointer has already been dereferenced. A lot of examples can be found here.
To detect this error pattern the V595 diagnostic is used. The number of such defects found in code often reaches many dozens. To NetXMS's credit, however, I noticed only one code fragment of that kind:
DWORD SNMP_PDU::encodeV3SecurityParameters(....,
  SNMP_SecurityContext *securityContext)
{
  ....
  DWORD engineBoots =
    securityContext->getAuthoritativeEngine().getBoots();
  DWORD engineTime =
    securityContext->getAuthoritativeEngine().getTime();
  if ((securityContext != NULL) &&
      (securityContext->getSecurityModel() ==
       SNMP_SECURITY_MODEL_USM))
  {
    ....
  }
V595 The 'securityContext' pointer was utilized before it was verified against nullptr. Check lines: 1159, 1162. pdu.cpp 1159
There were some other V595 warnings, but I found them too unconvincing to mention in the article. Those are probably just unnecessary checks.
Errors occurring when using printf() and other similar functions are classic ones. The reason is that variadic functions don't check the types of the arguments being passed.
#define _ftprintf fwprintf
static __inline char * __CRTDECL ctime(const time_t * _Time);
BOOL LIBNETXMS_EXPORTABLE SEHServiceExceptionHandler(....)
{
  ....
  _ftprintf(m_pExInfoFile,
            _T("%s CRASH DUMP\n%s\n"),
            szProcNameUppercase,
            ctime(&t));
  ....
}
V576 Incorrect format. Consider checking the fourth actual argument of the 'fwprintf' function. The pointer to string of wchar_t type symbols is expected. seh.cpp 292
The _ftprintf() macro is expanded into the function fwprintf(). The format string specifies that strings of the 'wchar_t *' type must be passed into the function. But the ctime() function returns a string consisting of 'char' characters. This bug has probably gone unnoticed because it sits inside the error handler.
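One way out is to convert the narrow string to a wide one before it reaches the wide printf-family function. The sketch below does that with the standard mbstowcs() function and prints both strings with "%ls", which denotes a wide string in standard C. The helpers widen and crashHeader are hypothetical names of mine, and the conversion follows the current C locale, so take it as an illustration and not as the fix chosen by the NetXMS authors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <cwchar>
#include <string>

// Converts a narrow string to a wide one using the current C locale.
// Returns an empty string if the input is empty or cannot be converted.
inline std::wstring widen(const char *narrow)
{
    const std::size_t capacity = std::strlen(narrow);
    if (capacity == 0)
        return std::wstring();
    std::wstring wide(capacity, L'\0');
    const std::size_t n = std::mbstowcs(&wide[0], narrow, capacity);
    if (n == static_cast<std::size_t>(-1))
        return std::wstring();
    wide.resize(n);
    return wide;
}

// Builds the header line with both arguments passed as wide strings.
inline std::wstring crashHeader(const wchar_t *procName, const char *narrowStamp)
{
    wchar_t line[256];
    const std::wstring stamp = widen(narrowStamp);
    const int n = std::swprintf(line, sizeof(line) / sizeof(line[0]),
                                L"%ls CRASH DUMP\n%ls\n",
                                procName, stamp.c_str());
    return n < 0 ? std::wstring()
                 : std::wstring(line, static_cast<std::size_t>(n));
}
```

The result of ctime() can be passed as the second argument of crashHeader() directly.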
Here are two more errors of that kind:
The 'new' operator used to return 'NULL' when it failed to allocate memory. Now it throws an exception. Many programs don't take this change into account. Sometimes it doesn't matter, but in some cases it may cause failures. Take a look at the following code fragment from the NetXMS project:
PRectangle CallTip::CallTipStart(....)
{
  ....
  val = new char[strlen(defn) + 1];
  if (!val)
    return PRectangle();
  ....
}
V668 There is no sense in testing the 'val' pointer against null, as the memory was allocated using the 'new' operator. The exception will be generated in the case of memory allocation error. calltip.cpp 260
Earlier, the empty object 'PRectangle' was returned if memory couldn't be allocated. Now an exception is thrown when there is a memory shortage. I don't know whether or not this change of behavior is critical. Anyway, checking the pointer for null doesn't seem reasonable anymore.
We should either remove the checks or use the 'new' operator that doesn't throw exceptions and returns zero:
val = new (std::nothrow) char[strlen(defn) + 1];
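Here is the nothrow variant in a complete form, as a minimal sketch. The function duplicateOrNull is a hypothetical helper of mine; the point is only that the null check becomes meaningful again once std::nothrow from the <new> header is used.

```cpp
#include <cassert>
#include <cstring>
#include <new>

// Returns a heap copy of the text, or nullptr if the allocation failed.
// The caller releases the result with delete[].
inline char *duplicateOrNull(const char *text)
{
    char *copy = new (std::nothrow) char[std::strlen(text) + 1];
    if (copy == nullptr)  // reachable only with the nothrow form
        return nullptr;
    std::strcpy(copy, text);
    return copy;
}
```

With the plain 'new', the same 'if' would be dead code, because a failed allocation leaves the function through std::bad_alloc instead.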
The PVS-Studio analyzer generates too many V668 warnings on the NetXMS project. Therefore I won't overload the article with examples. Let's leave it up to the authors to check the project.
static bool MatchStringEngine(....)
{
  ....
  // Handle "*?" case
  while(*MPtr == _T('?'))
  {
    if (*SPtr != 0)
      SPtr++;
    else
      return false;
    MPtr++;
    break;
  }
  ....
}
V612 An unconditional 'break' within a loop. tools.cpp 280
The loop body is executed at most once. The 'break' keyword inside it is probably unnecessary.
I haven't drawn any new conclusions from the check of the NetXMS project. Errors are everywhere; some of them can be found with static analysis - the earlier, the better.
I'll just give you some interesting and useful links instead of the conclusion: