The analysis of the TDengine project code using PVS-Studio reveals classic errors and typos. Developers could've avoided many of them if they had carefully designed the code in the first place, kept the logic simple, and avoided macros. Let's take a look at these errors and figure out how refactoring can help eliminate them completely.
This meme is the easiest way to illustrate what I mean.
Unfortunately, Java programmers aren't the only ones writing "sausage code". Long, hard-to-read lines of code appear when a programmer doesn't want to break them up into multiple lines and nicely format them. Since such code is difficult to read, the programmer or their team members are less likely to catch errors during the code review.
It's safe to say that long lines of code attract bugs and typos. I don't have any statistics, but I frequently find bugs using PVS-Studio in exactly this type of long, monotonous, or poorly formatted code.
Let's look at classic "sausage code" from the TDengine project.
TDengine is an open source, high-performance, cloud native time-series database optimized for Internet of Things (IoT), Connected Cars, and Industrial IoT. It enables efficient, real-time data ingestion, processing, and monitoring of TB and even PB scale data per day, generated by billions of sensors and data collectors.
Here it is, in the dmTransport.c (398) file.
The PVS-Studio warning: V501 There are identical sub-expressions 'msgType == TDMT_VND_S3MIGRATE' to the left and to the right of the '||' operator. dmTransport.c 398
Indeed, if we look closely, we can see that the same constant is checked twice. However, seeing this error is easy only when you know it's there. In reality, it's quite difficult to read this code and notice the repeated check.
Let's enhance the code. First, we'll break it down so that each line contains only one condition—this will make the typo much more noticeable.
static bool rpcNoDelayMsg(tmsg_t msgType) {
if (msgType == TDMT_VND_FETCH_TTL_EXPIRED_TBS ||
msgType == TDMT_VND_S3MIGRATE ||
msgType == TDMT_VND_S3MIGRATE ||
msgType == TDMT_VND_QUERY_COMPACT_PROGRESS ||
msgType == TDMT_VND_DROP_TTL_TABLE) {
return true;
}
return false;
}
Let's go further—we'll use table-style formatting and align everything in columns.
static bool rpcNoDelayMsg(tmsg_t msgType) {
if ( msgType == TDMT_VND_FETCH_TTL_EXPIRED_TBS
|| msgType == TDMT_VND_S3MIGRATE
|| msgType == TDMT_VND_S3MIGRATE
|| msgType == TDMT_VND_QUERY_COMPACT_PROGRESS
|| msgType == TDMT_VND_DROP_TTL_TABLE) {
return true;
}
return false;
}
Now the anomaly is clearly visible, so it would take a lot of effort to miss it.
Since we don't know the specifics of the project being checked, let's assume that the double check is just redundant, and msgType
shouldn't be compared to some other constant instead. We delete it and get some more refactoring.
static bool rpcNoDelayMsg(tmsg_t msgType) {
return msgType == TDMT_VND_FETCH_TTL_EXPIRED_TBS
|| msgType == TDMT_VND_S3MIGRATE
|| msgType == TDMT_VND_QUERY_COMPACT_PROGRESS
|| msgType == TDMT_VND_DROP_TTL_TABLE;
}
Lovely! We also shortened the function body from 5 lines (the very first version) to 4.
Summary: by formatting code in a beautiful way, we:
0