PVS-Studio is a static code analyzer detecting errors and potential vulnerabilities in the code of applications written in C, C++, and C#. We've been entertaining the readers of our blog for a long time by checking various open-source projects and commenting on the bugs found. Now our articles have every chance to become even more interesting because PVS-Studio can now check the code of embedded devices. We have added support for a few ARM compilers, which I talk about in detail below. Bugs in embedded devices and robots could be more of a spectacle than bugs in regular applications. When showing up in such devices, bugs don't simply cause a program to crash or hang, or display an image incorrectly; they make Wi-Fi kettles go mad and boil the water until it's all gone and the thermostat trips. That is, bugs tend to be more interesting and creepy in the world of embedded systems.
I've made a lot of coding mistakes over my programming career. Those errors were, however, somewhat dull. They just made programs misbehave or dereference a null pointer, and so on. Yes, those were true bugs that needed fixing. But the most impressive error I've ever made was when I was tinkering with my homemade robots.
I'm just a layman in robotics, and I've built all my machines just for the sake of fun and experiment. Among others, I've built four small remotely controlled robots that could play robot football and "catch the mouse". Without going into detail, they could do the following: move on their wheels, hit the ball, grasp with their claws, make sounds, and flash their LEDs. Here's a photo of one of those things as proof (click on the image to enlarge):
This bot is based on the ATmega8A microcontroller (8 Kbyte Flash, 512 byte EEPROM, 1 Kbyte RAM). In the first version of the firmware, one of the microcontroller timers would generate an interrupt whose handler was reading the commands transmitted from the remote control. A command received was written to a FIFO buffer from which it would then be retrieved and executed in the main program loop. Those commands were: move forward/backward; turn left/right; move forward turning slightly to the left; grasp the mouse; hit the ball; etc.
My implementation was actually too complicated. I got rid of the FIFO buffer later and rewrote the entire program in a simpler and neater form.
Now imagine me uploading the new firmware to the microcontroller, switching the robot on, and... The bot suddenly starts living on its own!
Here it is, moving across the floor chaotically, snapping its claw, hitting an invisible ball, flashing its lights. The funniest thing is that I don't even know why. There's simply no code in its firmware, I believe, that would make it do all these things.
It was the strongest impression of a software bug I've ever had over all my years as a programmer. It's one thing having a program crash because of a stack overflow, and quite another seeing a crazy robot rushing about, a robot built by you, without your having a slightest idea of how that could become possible. I wish I was smart enough at the time to take a video of the happening, with my facial expression in the background :).
It didn't take me long to figure out that I had made one of the most classic programming mistakes, namely forgetting to initialize the variable storing the number of unprocessed commands in the FIFO buffer. The robot started executing a random sequence of commands, reading the data from the buffer, as well as from beyond it.
Why am I telling you this? Well, I just wanted to show you that bugs in microcontroller firmware could be more dramatic than those in regular applications, and I hope you will also enjoy my other articles to come. Now, let's get back to the subject of this one, which is the release of the new PVS-Studio version.
In the new version of the analyzer, PVS-Studio 6.22, our team has improved its mechanics to enable it to check projects built with the following compilers:
I needed an open-source project to demonstrate PVS-Studio's new capabilities, so I went for RT-Thread. This project can be built in the gcc/keil/iar modes. For the sake of additional testing, we checked it in both Keil and IAR modes. The logs were almost identical, so I don't even remember which I was working with when writing this article.
Now I should say a few words about the RT-Thread project itself.
RT-Thread is an open source IoT operating system from China, which has strong scalability: from a tiny kernel running on a tiny core, for example ARM Cortex-M0, or Cortex-M3/4/7, to a rich feature system running on MIPS32, ARM Cortex-A8, ARM Cortex-A9 DualCore etc.
Official website: rt-thread.org.
Source code: rt-thread.
I find the RT-Thread operating system a perfect candidate to be the first embedded system checked with PVS-Studio.
I glanced through the analysis report by PVS-Studio and picked 95 warnings that I thought to be the most interesting ones. Too see all those messages for yourself, download the rt-thread-html-log.zip archive with the full HTML report. We introduced this format not so long ago, and some users may not know about it. So, I'd like to use this opportunity to explain it once again. This is what this report looks like when opened in Firefox (click on the image to enlarge):
Its layout resembles that of HTML reports generated by the Clang analyzer. It stores snippets of the source-code so that you can know right away what places in the code the warnings refer to. This is what you see when select one warning (click on the image to enlarge):
There's no point discussing all the 95 warnings in this article since many of them look alike. I'll discuss only 14 code fragments that I found worth mentioning for one reason or another.
Note. I could have well missed some warnings pointing at critical bugs. That's why the RT-Thread developers should check the project themselves rather than rely solely on my report with those 95 warnings. I also suspect that we failed to figure out all the intricacies of RT-Thread and checked only a part of it.
void SEMC_GetDefaultConfig(semc_config_t *config)
{
assert(config);
semc_axi_queueweight_t queueWeight; /*!< AXI queue weight. */
semc_queuea_weight_t queueaWeight;
semc_queueb_weight_t queuebWeight;
....
config->queueWeight.queueaWeight = &queueaWeight;
config->queueWeight.queuebWeight = &queuebWeight;
}
PVS-Studio diagnostic message: V506 CWE-562 Pointer to local variable 'queuebWeight' is stored outside the scope of this variable. Such a pointer will become invalid. fsl_semc.c 257
The function writes the addresses of two local variables (queueaWeight and queuebWeight) to an external structure. When control leaves the function, the variables will cease to exist but the structure will still be keeping and using the pointers to those no longer existing objects. In fact, the pointers refer to some area on the stack that may store just anything. This is a highly unpleasant security issue.
PVS-Studio reports only the last suspicious assignment, which has to do with some specifics of its inner algorithms. However, if you remove or fix the last assignment, the analyzer will be reporting the first one.
#define CAN_FIFO0 ((uint8_t)0x00U) /*!< receive FIFO0 */
#define CAN_FIFO1 ((uint8_t)0x01U) /*!< receive FIFO1 */
uint8_t can_receive_message_length(uint32_t can_periph,
uint8_t fifo_number)
{
uint8_t val = 0U;
if(CAN_FIFO0 == fifo_number){
val = (uint8_t)(CAN_RFIFO0(can_periph) & CAN_RFIFO_RFL0_MASK);
}else if(CAN_FIFO0 == fifo_number){
val = (uint8_t)(CAN_RFIFO1(can_periph) & CAN_RFIFO_RFL0_MASK);
}else{
/* illegal parameter */
}
return val;
}
PVS-Studio diagnostic message: V517 CWE-570 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 525, 527. gd32f4xx_can.c 525
If the fifo_number argument is not equal to CAN_FIFO0, the function returns 0 all the time. It looks like the code was written using Copy-Paste and the programmer forgot to change the CAN_FIFO0 constant to CAN_FIFO1 in the cloned fragment.
#define PECI_M0D0C_HITHR_M 0xFFFF0000 // High Threshold
#define PECI_M0D0C_LOTHR_M 0x0000FFFF // Low Threshold
#define PECI_M0D0C_HITHR_S 16
#define PECI_M0D0C_LOTHR_S 0
void
PECIDomainConfigGet(....)
{
unsigned long ulTemp;
....
ulTemp = HWREG(ulBase + PECI_O_M0D0C + (ulDomain * 4));
*pulHigh =
((ulTemp && PECI_M0D0C_HITHR_M) >> PECI_M0D0C_HITHR_S);
*pulLow =
((ulTemp && PECI_M0D0C_LOTHR_M) >> PECI_M0D0C_LOTHR_S);
}
PVS-Studio diagnostic messages:
Here we have two disappointing typos: the programmer used the && operator instead of & twice.
Because of this, the pulHigh variable will always be assigned the value 0, while the pulLow variable will be assigned 0 or 1, which is obviously not what the programmer meant this code to do.
Note for those new to the C language. The (ulTemp && PECI_M0D0C_xxxxx_M) expression always evaluates either to 0 or 1. This value, 0 or 1, is then shifted to the right. Right-shifting the value 0/1 by 16 bits will always produce 0; shifting by 0 bits will still produce 0 or 1.
typedef enum _aipstz_peripheral_access_control {
kAIPSTZ_PeripheralAllowUntrustedMaster = 1U,
kAIPSTZ_PeripheralWriteProtected = (1U < 1),
kAIPSTZ_PeripheralRequireSupervisor = (1U < 2),
kAIPSTZ_PeripheralAllowBufferedWrite = (1U < 2)
} aipstz_peripheral_access_control_t;
PVS-Studio diagnostic messages:
The named constants were meant to be the powers of two and store the following values: 1, 2, 4, 4. But the programmer wrote the < operator instead of << by mistake, which resulted in the following values:
static int ft5x06_dump(void)
{
uint8_t i;
uint8_t reg_value;
DEBUG_PRINTF("[FTS] Touch Chip\r\n");
for (i = 0; i <= 255; i++)
{
_ft5x06_read(i, ®_value, 1);
if (i % 8 == 7)
DEBUG_PRINTF("0x%02X = 0x%02X\r\n", i, reg_value);
else
DEBUG_PRINTF("0x%02X = 0x%02X ", i, reg_value);
}
DEBUG_PRINTF("\n");
return 0;
}
PVS-Studio diagnostic message: V654 CWE-834 The condition 'i <= 255' of loop is always true. drv_ft5x06.c 160
Variables of type uint8_t can store values within the range [0..255], so the i <= 255 condition is always true. This will make the loop constantly print the debugging data.
#define RT_CAN_MODE_NORMAL 0
#define RT_CAN_MODE_LISEN 1
#define RT_CAN_MODE_LOOPBACK 2
#define RT_CAN_MODE_LOOPBACKANLISEN 3
static rt_err_t control(struct rt_can_device *can,
int cmd, void *arg)
{
....
case RT_CAN_CMD_SET_MODE:
argval = (rt_uint32_t) arg;
if (argval != RT_CAN_MODE_NORMAL ||
argval != RT_CAN_MODE_LISEN ||
argval != RT_CAN_MODE_LOOPBACK ||
argval != RT_CAN_MODE_LOOPBACKANLISEN)
{
return RT_ERROR;
}
if (argval != can->config.mode)
{
can->config.mode = argval;
return bxcan_set_mode(pbxcan->reg, argval);
}
break;
....
}
PVS-Studio diagnostic message: V547 CWE-571 Expression is always true. Probably the '&&' operator should be used here. bxcan.c 1171
The RT_CAN_CMD_SET_MODE case is never processed properly because a condition of the (x !=0 || x != 1 || x != 2 || x != 3) pattern is always true. We must be dealing with just another typo and the programmer actually meant the following:
if (argval != RT_CAN_MODE_NORMAL &&
argval != RT_CAN_MODE_LISEN &&
argval != RT_CAN_MODE_LOOPBACK &&
argval != RT_CAN_MODE_LOOPBACKANLISEN)
void MCAN_SetSTDFilterElement(CAN_Type *base,
const mcan_frame_filter_config_t *config,
const mcan_std_filter_element_config_t *filter,
uint8_t idx)
{
uint8_t *elementAddress = 0;
elementAddress = (uint8_t *)(MCAN_GetMsgRAMBase(base) +
config->address + idx * 4U);
memcpy(elementAddress, filter, sizeof(filter));
}
The analyzer reports the error with two warnings at once:
Rather than copying the whole structure of type mcan_std_filter_element_config_t, the memcpy function copies just a part of it the size of a pointer.
There are also errors dealing with pointer dereferencing before null checks to be found in the code of RT-Thread. This is a very common bug.
static rt_size_t rt_sdcard_read(rt_device_t dev,
rt_off_t pos,
void *buffer,
rt_size_t size)
{
int i, addr;
struct dfs_partition *part =
(struct dfs_partition *)dev->user_data;
if (dev == RT_NULL)
{
rt_set_errno(-EINVAL);
return 0;
}
....
}
PVS-Studio diagnostic message: V595 CWE-476 The 'dev' pointer was utilized before it was verified against nullptr. Check lines: 497, 499. sdcard.c 497
static void enet_default_init(void)
{
....
reg_value = ENET_DMA_BCTL;
reg_value &= DMA_BCTL_MASK;
reg_value = ENET_ADDRESS_ALIGN_ENABLE
|ENET_ARBITRATION_RXTX_2_1
|ENET_RXDP_32BEAT |ENET_PGBL_32BEAT
|ENET_RXTX_DIFFERENT_PGBL
|ENET_FIXED_BURST_ENABLE |ENET_MIXED_BURST_DISABLE
|ENET_NORMAL_DESCRIPTOR;
ENET_DMA_BCTL = reg_value;
....
}
PVS-Studio diagnostic message: V519 CWE-563 The 'reg_value' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 3427, 3428. gd32f4xx_enet.c 3428
The reg_value = ENET_ADDRESS_ALIGN_ENABLE|.... assignment overwrites the previous value of the reg_value variable, which is strange because the variable stores the results of meaningful calculations. The code should probably look as follows:
reg_value = ENET_DMA_BCTL;
reg_value &= DMA_BCTL_MASK;
reg_value |= ENET_ADDRESS_ALIGN_ENABLE
|ENET_ARBITRATION_RXTX_2_1
|ENET_RXDP_32BEAT |ENET_PGBL_32BEAT
|ENET_RXTX_DIFFERENT_PGBL
|ENET_FIXED_BURST_ENABLE |ENET_MIXED_BURST_DISABLE
|ENET_NORMAL_DESCRIPTOR;
typedef union _dcp_hash_block
{
uint32_t w[DCP_HASH_BLOCK_SIZE / 4];
uint8_t b[DCP_HASH_BLOCK_SIZE];
} dcp_hash_block_t;
typedef struct _dcp_hash_ctx_internal
{
dcp_hash_block_t blk;
....
} dcp_hash_ctx_internal_t;
status_t DCP_HASH_Init(DCP_Type *base, dcp_handle_t *handle,
dcp_hash_ctx_t *ctx, dcp_hash_algo_t algo)
{
....
dcp_hash_ctx_internal_t *ctxInternal;
....
for (i = 0; i < sizeof(ctxInternal->blk.w) /
sizeof(ctxInternal->blk.w[0]); i++)
{
ctxInternal->blk.w[0] = 0u;
}
....
}
PVS-Studio diagnostic message: V767 Suspicious access to element of 'w' array by a constant index inside a loop. fsl_dcp.c 946
The analyzer failed to associate this warning with any CWE ID, but it is, in fact, CWE-665: Improper Initialization.
In the loop, the value 0 is written to the 0-th element of the array all the time, while all the rest elements remain uninitialized.
static void at91_mci_init_dma_read(struct at91_mci *mci)
{
rt_uint8_t i;
....
for (i = 0; i < 1; i++)
{
/* Check to see if this needs filling */
if (i == 0)
{
if (at91_mci_read(AT91_PDC_RCR) != 0)
{
mci_dbg("Transfer active in current\n");
continue;
}
}
else {
if (at91_mci_read(AT91_PDC_RNCR) != 0)
{
mci_dbg("Transfer active in next\n");
continue;
}
}
length = data->blksize * data->blks;
mci_dbg("dma address = %08X, length = %d\n",
data->buf, length);
if (i == 0)
{
at91_mci_write(AT91_PDC_RPR, (rt_uint32_t)(data->buf));
at91_mci_write(AT91_PDC_RCR, .....);
}
else
{
at91_mci_write(AT91_PDC_RNPR, (rt_uint32_t)(data->buf));
at91_mci_write(AT91_PDC_RNCR, .....);
}
}
....
}
PVS-Studio diagnostic messages:
The loop body is executed exactly once, which doesn't make sense. Why use a loop at all then?
Besides, since the i variable in the loop body remains equal to 0, some of the conditions are always true, while the rest part is never executed.
I guess the programmer actually wanted the loop body to execute twice but made a typo. The loop condition should probably look like this:
for (i = 0; i <= 1; i++)
This would make the function code meaningful.
Sorry for the large fragment of the function body cited below: I have to include it to prove that the k variable is really not initialized anywhere before the program reads from it.
void LCD_PutPixel (LCD_PANEL panel, uint32_t X_Left,
uint32_t Y_Up, LcdPixel_t color)
{
uint32_t k;
uint32_t * pWordData = NULL;
uint8_t* pByteData = NULL;
uint32_t bitOffset;
uint8_t* pByteSrc = (uint8_t*)&color;
uint8_t bpp = bits_per_pixel[lcd_config.lcd_bpp];
uint8_t bytes_per_pixel = bpp/8;
uint32_t start_bit;
if((X_Left >= lcd_hsize)||(Y_Up >= lcd_vsize))
return;
if(panel == LCD_PANEL_UPPER)
pWordData = (uint32_t*) LPC_LCD->UPBASE +
LCD_GetWordOffset(X_Left,Y_Up);
else
pWordData = (uint32_t*) LPC_LCD->LPBASE +
LCD_GetWordOffset(X_Left,Y_Up);
bitOffset = LCD_GetBitOffset(X_Left,Y_Up);
pByteData = (uint8_t*) pWordData;
pByteData += bitOffset/8;
start_bit = bitOffset%8;
if(bpp < 8)
{
uint8_t bit_pos = start_bit;
uint8_t bit_ofs = 0;
for(bit_ofs = 0;bit_ofs <bpp; bit_ofs++,bit_pos++)
{
*pByteData &= ~ (0x01 << bit_pos);
*pByteData |=
((*pByteSrc >> (k+bit_ofs)) & 0x01) << bit_pos; // <=
}
}
....
}
PVS-Studio diagnostic message: V614 CWE-457 Uninitialized variable 'k' used. lpc_lcd.c 510
The k variable is not initialized anywhere before being used in the expression:
*pByteData |= ((*pByteSrc >> (k+bit_ofs)) & 0x01) << bit_pos;
HAL_StatusTypeDef FMC_SDRAM_SendCommand(....)
{
....
/* wait until command is send */
while(HAL_IS_BIT_SET(Device->SDSR, FMC_SDSR_BUSY))
{
/* Check for the Timeout */
if(Timeout != HAL_MAX_DELAY)
{
if((Timeout == 0)||((HAL_GetTick() - tickstart) > Timeout))
{
return HAL_TIMEOUT;
}
}
return HAL_ERROR;
}
return HAL_OK;
}
PVS-Studio diagnostic message: V612 CWE-670 An unconditional 'return' within a loop. stm32f7xx_ll_fmc.c 1029
The loop body executes only once at most, which looks strange since it would make more sense to use an if statement to get the same behavior. There must be some logic error here.
As I already mentioned, this article covers only some of the bugs found. To see the complete list of the warnings I selected, see the HTML report (stored in the rt-thread-html-log.zip archive).
In addition to the issues that are bugs for sure, I also included the warnings pointing at suspicious code. These are the cases where I'm not sure if they are real bugs, but the RT-Thread developers should check that code out anyway. Here's just one example.
typedef unsigned long rt_uint32_t;
static rt_err_t lpc17xx_emac_init(rt_device_t dev)
{
....
rt_uint32_t regv, tout, id1, id2;
....
LPC_EMAC->MCFG = MCFG_CLK_DIV20 | MCFG_RES_MII;
for (tout = 100; tout; tout--);
LPC_EMAC->MCFG = MCFG_CLK_DIV20;
....
}
PVS-Studio diagnostic message: V529 CWE-670 Odd semicolon ';' after 'for' operator. emac.c 182
The programmer used the loop to introduce a small delay, which the analyzer, though indirectly, points out to us.
In the world of optimizing compilers that I'm used to, this would definitely be a bug. Compilers would simply delete this loop away to cut out any delay since tout is an ordinary, non-volatile variable. I don't know, however, if this is true for the world of embedded systems, but I still suspect that this code is incorrect or at least unreliable. Even if the compiler doesn't optimize such loops away, there's no knowing how long the delay will last and if it will be long enough.
As far as I know, such systems use functions like sleep_us, and it is them that one should use for small delays. The compiler could well turn a call to sleep_us into a regular simple loop, but these are just the specifics of the implementation. When written manually, however, such delaying loops may be dangerous, not to mention the bad style.
I encourage you to check projects for embedded systems that you develop. It's the first time that we have added support for the ARM compilers, so there might be some problems. So, please don't hesitate to contact our support if you have any questions or want to report an issue.
The demo version of PVS-Studio can be downloaded here.
We understand that many projects for embedded systems are too tiny to make it worth buying a license, so we provide a free license, which is explained in the article "How to use PVS-Studio for Free". The great advantage of our version of the free license is that you can use it not only in open-source projects but in proprietary ones as well.
Thanks for reading, and may your robots stay bugless!
This article will draw a new audience, so if you haven't heard of the PVS-Studio analyzer before, you may want to check out the following articles: