Webinar: Parsing C++ - 10.10
In this article, I will tell and show you how to carry out static analysis of C/C++ program code by PVS-Studio by the example of the open-source project Wireshark. We'll start with a brief description of the Wireshark network traffic analyzer and the PVS-Studio product. Then I will tell you about the pitfalls you may encounter when building the project and preparing for the analysis. After that, I'll try to draw a general overview of the PVS-Studio product, its strengths and usability by means of examples of its warnings, the corresponding code samples, and my own comments.
To demonstrate PVS-Studio's capabilities, I needed a well-known, useful, and interesting open-source project that hadn't been analyzed yet. I settled on Wireshark because I personally have a liking for it, and if you are not familiar with this product yet, perhaps you too will start sharing my feelings towards it after reading this article.
The Internet's rapid progress and plenty of movies about hackers had long ago attracted my attention to computer networks. And now I am convinced that every competent system administrator and programmer dealing with security have to know their way around network technologies.
Networks are based on the mechanism of data transmission and receiving via certain protocols. To make it possible to investigate network applications and protocols and to detect problems with network functioning and, most importantly, find out the causes behind them, we need special network traffic capture and analysis tools (aka sniffers).
Wireshark is quite a well-known sniffer with a GUI. The program is based on the Pcap library designed for capturing network traffic and allows parsing packages of most of the popular protocols, displaying values for each field of a protocol, regardless of its level.
Wireshark is a cross-platform tool released under the terms of the GNU GPL. It runs both on Windows and Linux and makes use of the GTK+ and Qt libraries to implement its user interface.
The corresponding documentation and source files of the program can be found at the official site.
Static code analysis enables software bug detection without actually running the application and regardless of its working environment. By using static analysis, you can improve your software product's quality, reduce its development and testing time, and ensure its security.
PVS-Studio is a static analyzer for C/C++/C++11 code and supports such compilers as MS Visual C++, GNU GCC (MinGW), Clang, Borland C++.
PVS-Studio ships with the following diagnostic rule sets:
To find out more about PVS-Studio, welcome to the official site.
To carry out the analysis, we need to download the source files of the latest stable version of Wireshark 1.12.4. I ran the building process on Windows 7 with Win64 as the target platform, using Visual Studio 2013's native compiler. We may also need to install the libraries Qt SDK 5.4.1 and WinPcap 4.1.3.
I managed the building process from the command line using nmake. To ensure correct work of the build scripts, we need to install Cygwin and Python 2.7.9.
For more details on the building process, check the information at the site.
Despite that I was doing everything in total compliance with the instructions, I still encountered a few errors on the way. To eliminate them, I had to do the following:
I have a licensed PVS-Studio 5.25 version installed on my computer, but you may as well use the demo version available for download to get started with the tool.
In the demo version, you can only work with the first-level warnings and have only 50 clicks on the diagnostic messages to quickly get to the corresponding code, plus 50 more clicks after filling in a form at the site. Once you've used up your 100 clicks, you'll need to buy a license. To find out the details about the licensing terms, see the corresponding section at the site. Of course, these 100 clicks are not enough for regular use and are granted to you just to get started with the tool. If you want to study it closer, you can write to the support service and ask them for a free registration key for a few days.
Since the Wireshark project is built through nmake from the command line, we'll need a monitoring system that comes with the PVS-Studio package. It is designed to monitor compiler launches and gather information about their environment: the working folder, the command line, the full path to the file being compiled, and the process' environment variables.
To start monitoring, open "Start\PVS-Studio\PVS-Studio Standalone", select the "Tools\Analyze Your Files ..." menu item, and click on the "Start Monitoring" button. After that, launch the project building process from the command line "nmake -f Makefile.nmake all", as described above. Make sure the build has finished successfully and stop the monitoring process by clicking on the "Stop monitoring" button.
Now we have to be patient because static analysis will start automatically right after that. Once it's finished, save the report plog file so that you don't have to repeat the build and analysis operations multiple times.
You can already start searching for bugs with PVS-Studio Standalone at this stage. But, in order to use IntelliSense's advanced code navigation features, I recommend opening the previously saved report file in Microsoft Visual Studio.
To do this, we need to do the following:
We are finally approaching the most interesting stage - bug search.
Let's start our bug hunt by viewing PVS-Studio's diagnostic messages and using IntelliSense's navigation.
From the very start, my attention was caught by the following comments in the code:
void decode_ex_CosNaming_NamingContext_NotFound(....)
{
....
(void)item; /* Avoid coverity param_set_but_unused
parse warning */
....
/* coverity[returned_pointer] */
item = proto_tree_add_uint(....);
....
}
The Wireshark project seems to be regularly checked by the Coverity static analyzer already. This analyzer is used in projects with high security requirements such as software for medical equipment, nuclear plants, aviation, and, since recently, embedded systems. Now that we've found this out, I'm curious to see if we can find any bugs overlooked by Coverity.
To draw a general overview of PVS-Studio's capabilities, we'll discuss errors of different patterns which are difficult to detect because of the program's undefined behavior during the testing stage or require advanced knowledge of the C/C++ languages or are simply interesting. Investigating the first-level warnings and a quick scan through the second-level ones will be enough for our task.
Example:
typedef struct AIRPDCAP_SEC_ASSOCIATION {
....
AIRPDCAP_KEY_ITEM *key;
....
};
void AirPDcapWepMng(....,AIRPDCAP_KEY_ITEM* key,
AIRPDCAP_SEC_ASSOCIATION *sa, ....)
{
....
memcpy(key, &sa->key, sizeof(AIRPDCAP_KEY_ITEM));
....
}
PVS-Studio's diagnostic message: V512 A call of the 'memcpy' function will lead to the '& sa->key' buffer becoming out of range. airpdcap.c 1192
The C/C++ languages provide for efficient low-level memory handling due to the absence of integrated array limit checks while reading and writing. Buffer filling, copying, and comparison errors can cause undefined behavior or segmentation errors which are hard to detect.
To fill the 'AIRPDCAP_KEY_ITEM' structure found at the address 'key', the address 'sa->key' to the same structure should have been used. The programmer, however, used the address of the pointer to it instead. To fix this error, we just need to remove a superfluous address taking operation '&'.
Example:
typedef struct _h323_calls_info {
e_guid_t *guid;
....
} h323_calls_info_t;
static const e_guid_t guid_allzero = {0, 0, 0,
{ 0, 0, 0, 0, 0, 0, 0, 0 } };
void q931_calls_packet(....)
{
h323_calls_info_t *tmp2_h323info;
....
memcmp(&tmp2_h323info->guid, &guid_allzero, 16) == 0;
....
}
PVS-Studio's diagnostic message: V512 A call of the 'memcmp' function will lead to overflow of the buffer '& tmp2_h323info->guid'. voip_calls.c 1570
Another example of an incorrect use of a buffer. In one of the 'memcmp()' function's arguments, a pointer to the pointer to the 'e_guid_t' structure is passed, instead of the pointer to this structure.
Example:
#define ETHERCAT_MBOX_HEADER_LEN ((int) sizeof(ETHERCAT_MBOX_HEADER))
void dissect_ecat_datagram(....)
{
if (len >= sizeof(ETHERCAT_MBOX_HEADER_LEN) && ....)
{
....
}
}
PVS-Studio's diagnostic message: V568 It's odd that the argument of sizeof() operator is the '(int) sizeof (ETHERCAT_MBOX_HEADER)' expression. packet-ethercat-datagram.c 519
When handling memory in C++, the 'sizeof()' operator is used to return the size of an object or buffer in bytes. In our case, 'sizeof()' will return the size of the 'int' type, instead of the 'ETHERCAT_MBOX_HEADER' structure's size. To fix the error, we need to remove a superfluous 'sizeof()' operation.
Example:
void Proto_new(....) {
....
if (!name[0] || !desc[0])
luaL_argerror(L,WSLUA_ARG_Proto_new_NAME,
"must not be an empty string");
....
if ( name ) {
....
loname_a = g_ascii_strdown(name, -1);
....
}
....
}
PVS-Studio's diagnostic message: V595 The 'name' pointer was utilized before it was verified against nullptr. Check lines: 1499, 1502. wslua_proto.c 1499
To show that a pointer doesn't refer to an object, programmers usually write a special zero value into it and implement additional checks before using it. With the help of static analysis, you can find missing checks that may cause security breaches, and extraneous checks that clutter the code.
The 'name' pointer is checked after using 'name[0]'. On the one hand, this check is superfluous if the pointer is not null; on the other, an error will occur anyway if it is.
Example:
void create_byte_graph(....)
{
....
u_data->assoc=(sctp_assoc_info_t*)g_malloc(
sizeof(sctp_assoc_info_t));
u_data->assoc=userdata->assoc;
....
}
PVS-Studio's diagnostic message: V519 The 'u_data->assoc' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 1526, 1527. sctp_byte_graph_dlg.c 1527
In C/C++, memory allocation and freeing is done manually. Memory freeing errors can cause memory leaks.
The 'g_malloc()' function allocates an area of dynamic memory of the size of 'sizeof(sctp_assoc_info_t)' bytes and returns a pointer to it. But after changing the variable storing this pointer, we will be able neither to access this area nor free it, which will result in a memory leak.
Example:
PacketList::PacketList(QWidget *parent)
{
QMenu *submenu;
....
submenu = new QMenu(tr("Colorize with Filter"));
/*ctx_menu_.addMenu(submenu);*/
submenu = new QMenu(tr("Copy"));
ctx_menu_.addMenu(submenu);
....
}
PVS-Studio's diagnostic message: V519 The 'submenu' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 287, 363. packet_list.cpp 363
In the constructor, visual interface elements are dynamically created and added into Qt's object hierarchy. It enables the programmer to carry out recursive destruction of created objects when deleting a high-level object. However, one of the menu items has not been added into the object hierarchy, which will cause a memory leak.
Example:
void dissect_display_switch(gint offset, guint msg_len, ....)
{
....
if((address_byte&DISPLAY_WRITE_ADDRESS_LINE_FLAG)
!=DISPLAY_WRITE_ADDRESS_LINE_FLAG)
offset+=1;msg_len-=1;
....
}
PVS-Studio's diagnostic message: V640 The code's operational logic does not correspond with its formatting. The second statement will always be executed. It is possible that curly brackets are missing. packet-unistim.c 1134
Incorrect use of braces '{}' when setting off blocks of conditional operators 'if' can lead to errors, too.
In this code, the body of the conditional operator 'if' consists of one statement, though the program's formatting and logic require that there should be more than one statement. To fix the error, we need to enclose a number of statements in braces '{}'.
Example:
void dissect_ssc_readposition (....)
{
....
switch (service_action) {
....
case LONG_FORM:
if (!(flags & MPU)) {
....
} else
/*offset += 16;*/
break;
....
}
....
}
PVS-Studio's diagnostic message: V705 It is possible that 'else' block was forgotten or commented out, thus altering the program's operation logics. packet-scsi-ssc.c 831
It's a funny thing, but sometimes one single comment can change the program's execution logic. The program will leave the 'case LONG_FORM' block only when 'else' is triggered, which will inevitably cause an error.
Example:
void set_has_console(gboolean set_has_console)
{
has_console = has_console;
}
PVS-Studio's diagnostic message: V570 The 'has_console' variable is assigned to itself. console_win32.c 235
There are also errors caused by inattention in the Wireshark project. In the code above, the programmer assumes that the 'set_has_console()' function changes the 'has_console' value to 'set_has_console', which is wrong. To fix the error, the 'has_console' variable must be assigned the value passed through the 'set_has_console' argument.
Example:
void dissect_dcc(tvbuff_t *tvb, packet_info *pinfo,
proto_tree *tree, void *data _U_)
{
client_is_le = ( (tvb_get_guint8(tvb, offset+4)
| tvb_get_guint8(tvb, offset+4))
&&(tvb_get_guint8(tvb, offset+8)
| tvb_get_guint8(tvb, offset+9))
&& (tvb_get_guint8(tvb, offset+12)
| tvb_get_guint8(tvb, offset+13)) );
}
PVS-Studio's diagnostic message: V501 There are identical sub-expressions 'tvb_get_guint8(tvb, offset + 4)' to the left and to the right of the '|' operator. packet-dcc.c 272
The tvb_get_guint8(tvb, offset+4) expression is used twice. By analogy with the rest of the code, we can assume that the programmer actually meant to write tvb_get_guint8(tvb, offset+5).
There also were some other errors I haven't mentioned here in order not to clutter the article. I believe the examples discussed above are quite enough for you to grasp the general idea about static analysis' capabilities, and for me to attract your attention to PVS-Studio. If you need to fully investigate PVS-Studio's features, visit the site to see the complete list of its diagnostic messages. A more thorough analysis of the Wireshark project is a task to be carried out by its authors, for they will naturally find it much easier to figure out which of the warnings refer to real bugs and which don't.
The total number of suspicious code fragments found in this project is relatively small. This is probably thanks to the use of the Coverity static analyzer mentioned in the comments. So my advice is to use static analyzers regularly to detect bugs at the coding stage, before they get to the testing stage.
Good luck in programming and may you have as few bugs as possible!
0