In this article, we'll discuss how the PVS-Studio static analyzer encourages developers to refactor their code. After all, the shorter, simpler, and clearer the code is, the fewer errors it contains.
In the previous article — "Let's check the qdEngine game engine, part one: top 10 warnings issued by PVS-Studio" — we've already looked at an interesting example of code refactoring that was encouraged by the PVS-Studio warning message (see the tenth code fragment). Let's look at some more examples of how we can enhance the code using a static analyzer.
class qdCamera
{
....
sGridCell *Grid;
....
};
bool qdCamera::restore(sGridCell *grid,
int sx, int sy,
int csx, int csy)
{
if (Grid)
delete[] Grid;
Grid = new sGridCell[sx * sy];
if (!Grid)
return false;
memcpy(Grid, grid, sizeof(sGridCell) * sx * sy);
GSX = sx;
GSY = sy;
cellSX = csx;
cellSY = csy;
return true;
}
Here we have a function that reallocates a buffer of the required size. The analyzer spots this code because it detects a meaningless check of the pointer returned by the operator new[]. I'm talking about the following lines:
Grid = new sGridCell[sx * sy];
if (!Grid)
return false;
The PVS-Studio warning: V668 There is no sense in testing the 'Grid' pointer against null, as the memory was allocated using the 'new' operator. The exception will be generated in the case of memory allocation error. qd_camera.cpp 910
Indeed, if there isn't enough memory to create an array, the operator new[] throws an exception of the std::bad_alloc type. You can get a null pointer only using new (std::nothrow) [], but this isn't our case.
Here's a function where an unnecessary check has been removed:
bool qdCamera::restore(sGridCell *grid,
int sx, int sy,
int csx, int csy)
{
if (Grid)
delete[] Grid;
Grid = new sGridCell[sx * sy];
memcpy(Grid, grid, sizeof(sGridCell) * sx * sy);
GSX = sx;
GSY = sy;
cellSX = csx;
cellSY = csy;
return true;
}
Now the restore function always returns true. It either throws an exception or works properly. Returning a success status (true or false) no longer makes sense. We can simplify the function by removing the return value:
void qdCamera::restore(sGridCell *grid,
int sx, int sy,
int csx, int csy)
{
if (Grid)
delete[] Grid;
Grid = new sGridCell[sx * sy];
memcpy(Grid, grid, sizeof(sGridCell) * sx * sy);
GSX = sx;
GSY = sy;
cellSX = csx;
cellSY = csy;
}
As a bonus, we no longer need to check what the function returned in other code fragments.
What else can we enhance? There's no point in checking the pointer before calling the operator delete[]. The operator handles null pointers correctly (in this case, it does nothing).
Someone may argue that such a check is a micro-optimization, because if the pointer is null, you don't need to call the operator. That's nonsense. The function call time can be neglected when there are heavy operations such as memory allocation and deallocation. I've covered the topic in the following article: "Shall we check pointer for NULL before calling free function?" As you can see from the title, the article is about the free function, but it's the same with the operator delete.
The code without unnecessary checks:
void qdCamera::restore(sGridCell* grid,
int sx, int sy,
int csx, int csy)
{
delete[] Grid;
Grid = new sGridCell[sx*sy];
memcpy(Grid, grid, sizeof(sGridCell) * sx * sy);
GSX = sx;
GSY = sy;
cellSX = csx;
cellSY = csy;
}
We've significantly shortened the code without losing anything. Unfortunately, both the old and the new code still have a weakness:
delete[] Grid;
Grid = new sGridCell[sx * sy];
....
qdCamera::~qdCamera()
{
if (GSX)
{
delete[] Grid;
}
}
If an exception occurs when the operator new is called, the stack unwinding and destruction of various objects begin. The destructor of the qdCamera object is also called. And the Grid pointer may not be null, but it's no longer valid. The array is destroyed just before the operator new is called. This results in undefined behavior.
The simplest fix is to add pointer zeroing:
delete[] Grid;
Grid = nullptr;
Grid = new sGridCell[sx * sy];
That's a bad fix, though. We're wasting our energy on a lost cause. The main issue in the code is manual memory management. It causes programmers to write long, buggy code.
The correct solution is to use a smart pointer or std::vector. Let's start with the std::unique_ptr smart pointer, as std::vector isn't always practical and efficient.
class qdCamera
{
....
std::unique_ptr<sGridCell[]> Grid;
~qdCamera() = default;
....
};
void qdCamera::restore(sGridCell *grid,
int sx, int sy,
int csx, int csy)
{
Grid = std::make_unique<sGridCell[]>(sx * sy);
std::copy(grid, grid + sx * sy, Grid.get());
GSX = sx;
GSY = sy;
cellSX = csx;
cellSY = csy;
}
This is much better:
Finally, here's an example of using std::vector.
class qdCamera
{
....
std::vector<sGridCell> Grid;
~qdCamera() = default;
....
};
void qdCamera::restore(sGridCell *grid,
int sx, int sy,
int csx, int csy)
{
Grid.clear();
Grid.reserve(sx * sy);
std::copy(grid, grid + sx * sy, std::back_inserter(Grid));
GSX = sx;
GSY = sy;
cellSX = csx;
cellSY = csy;
}
However, if you switch to containers during refactoring, the interface of many functions changes too. This is what the restore function is likely to look like:
void qdCamera::restore(const std::vector<sGridCell> &grid,
int sx, int sy,
int csx, int csy)
{
Grid = grid;
GSX = sx;
GSY = sy;
cellSX = csx;
cellSY = csy;
}
Although, I'm getting way off the point of the story. Obviously, during extensive refactoring, the code may become completely different. The restore function, for example, may disappear or transform into something else.
The main idea is that the PVS-Studio analyzer warnings help you write shorter and safer code.
In the arButton class, manual memory management is even more apparent than in the previous case. The class creates copies of strings and can return pointers to them.
There are string copy functions:
class arButton
{
....
void set_obj(const char* p)
{ if(obj_name) free(obj_name);
obj_name = strdup(p); }
void set_obj_regvalue(const char* p)
{ if(obj_name_regvalue) free(obj_name_regvalue);
obj_name_regvalue = strdup(p); }
void set_cmdline(const char* p)
{ if(cmd_line) free(cmd_line); cmd_line = strdup(p); }
void set_regkey(const char* p)
{ if(reg_key) free(reg_key); reg_key = strdup(p); }
void set_reg_exec_path(const char* p)
{ if(reg_exec_path_value) free(reg_exec_path_value);
reg_exec_path_value = strdup(p); }
void set_checkstr(const char* p)
{ if(check_after_exec) free(check_after_exec);
check_after_exec = strdup(p); }
....
char *obj_name;
char *obj_name_regvalue;
char *cmd_line;
char *reg_key;
char *reg_exec_path_value;
char *check_after_exec;
};
Each string has a its own getter, but we don't care about that for now. Here's just one example:
char* get_obj(void) { return obj_name; }
The analyzer issues six similar warnings about potential memory leaks for this class. Here's just one such warning from PVS-Studio, so as not to make the article too long: V773 The 'obj_name' pointer was not released in destructor. A memory leak is possible. ar_button.cpp 50
The thing is, the class destructor doesn't do anything.
arButton::~arButton(void)
{
}
Strings created using strdup aren't destroyed in set_* functions. Memory leaks are possible.
We could fix the code by refining the destructor, but would we really want to take that intermediate step? The point is that we not only want to fix the code, we'd like to enhance it right away. Let's look at one of the set functions again:
void set_obj(const char* p)
{ if(obj_name) free(obj_name);
obj_name = strdup(p); }
Here are the weaknesses in the code:
We can rewrite the code using smart pointers. However, this is pointless, as the best way is to use std::string. After all, the program is written in C++, not in C.
class arButton
{
....
void set_obj(const char* p) { obj_name = p; }
void set_obj_regvalue(const char* p) { obj_name_regvalue = p; }
void set_cmdline(const char* p) { cmd_line = p; }
void set_regkey(const char* p) { reg_key = p; }
void set_reg_exec_path(const char* p) { reg_exec_path_value = p; }
void set_checkstr(const char* p) { _after_exec = p; }
....
std::string obj_name;
std::string obj_name_regvalue;
std::string cmd_line;
std::string reg_key;
std::string reg_exec_path_value;
std::string check_after_exec;
};
Depending on the refactoring extent, the getter functions can look like this:
const char* get_obj(void) const { return obj_name.c_str(); }
Or like this:
const std::string& get_obj(void) const { return obj_name; }
Let's get back to the set functions. They're not "good" yet, but without external context, it's hard to say what's the best way to refine them. The functions have the following flaws so far:
Here's an abstract option for a possible enhancement:
....
void set_cmdline(std::string s) noexcept { cmd_line = std::move(s); }
void set_regkey(std::string s) noexcept { reg_key = std::move(s); }
....
Let's leave it at that, otherwise the question will follow: do we really need a layer of get/set functions? The code may need even more extensive refactoring, so we may end up with a completely different class structure.
Some readers may be a bit puzzled and thinking:
"What was that all about? We already know that manual memory management is bad, and it causes errors. Why would you mention the analyzer? We don't need it to see that the shown code needs to be rewritten".
I agree, but please note the following points:
Thank you for reading! Download PVS-Studio to enhance your code.