In this article, I'm going to tell you about my experience of analyzing the Octave project. It is quite a popular one, especially among students who need to scan their math task solutions yet don't feel like buying a Matlab license.
Octave is a numerical computations system and the major open-source alternative to Matlab. It is capable of solving a variety of tasks such as matrix handling, differential equation solving, or function integration over infinite intervals. Octave's functionality is pretty wide and can be extended even more through dynamically linking plugins written in C, C++, or Fortran. Octave itself is written in C++ and has a high-level C++-like language of its own whose syntax resembles that of MATLAB, which makes properly written scripts compatible not only with Octave but MATLAB as well. Octave can be used in the GUI mode provided by the Qt library.
The system was built with MinGW. Since it was my first experience of working with the tool, I faced some problems when compiling the project. Octave has quite a number of dependencies, so it took me a lot of time to install the necessary third-party libraries. While building the project, I had to rebuild and update some of the libraries I had already had by the moment. For example, the BLAS library I had found on my computer turned out to be incompatible with the settings of the gfortran compiler coming with MinGW.
The analysis was done with the PVS-Studio static code analyzer (or more exactly, with its Standalone unit) which provides a special monitoring feature allowing you to "intercept" compiler calls and generate preprocessed files for further analysis. To launch the analysis, you just need to click on the "Analyze your files" button in PVS-Studio and run the make command in the console to start the building process.
The major part of the warnings generated by the analyzer dealt with constructs aiming at making the code crossplatform. PVS-Studio provides a special feature to filter diagnostic messages by error types or severity levels and mark a group of warnings (triggered by crossplatform solutions in my case) as false positives, so it enabled me to view and handle relevant errors only. For the sake of convenience, I grouped all the warnings into three categories: incorrect pointer handling, incorrect conditions, and all the others. Now let's take a look at some examples of the errors found in the Octave project.
PVS-Studio's diagnostic message: V507 Pointer to local array 'dirbuf' is stored outside the scope of this array. Such a pointer will become invalid. tmpdir.c 128
#define PATH_MAX 260
int path_search(const char *dir, ....)
{
....
if (....)
{
char dirbuf[PATH_MAX];
....
dir = dirbuf;
}
....
dlen = strlen (dir);
}
In this example, the address of the local buffer "dirbuf", whose lifetime corresponds to the "if" block body, is saved into a pointer passed into the function. After leaving the "if" block, the pointer to the destroyed array is used to get a string length in the "strlen" function. Since the memory area pointed to by the "dir" pointer is not valid anymore, these actions are illegal.
PVS-Studio's diagnostic message: V595 The 'Pinv' pointer was utilized before it was verified against nullptr. Check lines: 66, 79. colamd.cc 66
static void symetree(const octave_idx_type *ridx,
octave_idx_type *P, ....)
{
....
for (octave_idx_type k = 0 ; k < n ; k++)
Pinv[P[k]] = k;
....
octave_idx_type i = (Pinv) ? (Pinv[ridx[p]]) : (ridx[p]);
....
}
In this fragment, the programmer forgot to check the "Pinv" pointer for being null before using it for the first time. However, it is checked in the body of the ternary operator. Since the pointer hasn't been changed in any way, it will naturally stay non-null. If it originally were null, we'd get an error message the very first time we tried to use it. I don't see why the programmer would need to handle the pointer in a way like that.
PVS-Studio's diagnostic message: V668 There is no sense in testing the 'instance' pointer against null, as the memory was allocated using the 'new' operator. The exception will be generated in the case of memory allocation error. oct-spparms.cc 45
octave_sparse_params *octave_sparse_params::instance = 0;
bool octave_sparse_params::instance_ok(void)
{
....
instance = new octave_sparse_params();
if (instance)
....
}
This code contains an excessive check. This check can be found after each use of the "new" operator throughout the code for a few dozens of times. As you know, all modern compilers are forced by the standard to generate a "bad_alloc" exception if the "new" operator fails to allocate memory. But it hasn't always been like that. For example, an already obsolete compiler VC6 would return "NULL" instead of throwing the exception, which contradicts the standard. Nowadays, however, such checks are just vestige of the past and are no more necessary.
PVS-Studio's diagnostic message: V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 1956, 1962. cellfun.cc 1956
DEFUN(....)
{
....
octave_value array = args(0);
....
if (....)
else if (array.is_object())
retval = do_object2cell(array, dimv);
else if (array.is_map())
retval = do_num2cell(array.map_value (), dimv);
else if (array.is_cell())
retval = do_num2cell(array.cell_value (), dimv);
else if (array.is_object())
retval = do_num2cell(array.cell_value (), dimv);
....
}
In this code, the analyzer has detected a duplicated condition in the if {} else if {} construct. I can't say for sure which method should be used instead of the second call "array.is_object" as there are a lot of methods like that in the octave_value class. The fact that the same function call is used in the body of the duplicated condition as in if (array.is_cell()) looks pretty suspicious too.
PVS-Studio's diagnostic message: V560 A part of conditional expression is always true: brace_level == 0. kpse.cc 490
class kpse_path_iterator
{
....
private:
size_t e;
size_t len;
void set_end(void)
{
....
int brace_level = 0;
while (e < len && !(brace_level == 0 && ...))
e++;
....
}
....
}
The "while" condition in the class method contains an excessive check brace_level == 0. This condition is executed all the time since the "brace_level" variable was initialized to zero before the loop and hasn't changed during its execution. Perhaps the loop body used to contain some operations over the "brace_level" variable once, which were removed later, while the programmer forgot to fix the condition accordingly. But it's just my guess.
PVS-Studio's diagnostic message: V560 A part of conditional expression is always true: !error_state. load-save.cc 403
octave_value do_load(std::istream& stream, ....)
{
....
std::string name;
....
if (error_state || stream.eof() || name.empty())
break;
else if (!error_state && !name.empty())
{
....
}
....
}
In this construct, we can only get into the "else" branch when each of the conditions "error_state", "stream.eof()", and "name.empty()" is false. If at least one of them is true, the "if" block will be executed. So, getting into the "else" block ensures that the conditions "error_state" and "name.empty()" will be false, which means the second check is not necessary.
PVS-Studio's diagnostic message: V571 Recurring check. The 'nargin > 0' condition was already verified in line 51. __dispatch__.cc 53
DEFUN(....)
{
int nargin = args.length();
....
if (nargin > 0 && nargin < 4)
{
if (nargin > 0)
....
}
....
}
In this example, we are dealing with a similar issue with an excessive check of the "nargin" variable. Excessive checks are not errors and don't affect the program performance too much, especially when located outside loop bodies, but they do make the code a bit more cumbersome and less comprehensible.
PVS-Studio's diagnostic message: V663 Infinite loop is possible. The 'cin.eof()' condition is insufficient to break from the loop. Consider adding the 'cin.fail()' function call to the conditional expression. ls-mat-ascii.cc 75
static std::string get_mat_data_input_line(std::istream& is)
{
....
do
{
while (is.get(c))
....
}
while (!(have_data || is.eof()));
....
}
In this case, the loop termination condition may never be executed. If data are improperly read from the "is" stream, the "is.fail()" flag will be set, while the "is.eof()" flag will remain unchanged and the function will keep working with incorrect data. A correct version of the loop termination condition should look as follows:
while (!(have_data || is.eof() || is.fail()));
PVS-Studio's diagnostic message: V519 The 'x_normrender' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 5521, 5621. graphics.cc 5621
void axes::properties::update_camera(void)
{
....
Matrix x_normrender = xform_matrix();
....
x_normrender = x_viewport * x_projection * x_view;
....
}
It's really pretty strange that the result of the first assignment of the "x_normrender" variable is not used anywhere and is later replaced by a multiplication of two parameters. In the body of the "xform_matrix()" function, a constructor is used to create a matrix object and a small loop to fill it. These operations may slow down the program performance at this code fragment. The compiler might notice that the function result is not used anywhere and remove the call, but, as they say, the compiler helps those who help themselves.
PVS-Studio's diagnostic message: V523 The 'then' statement is equivalent to the 'else' statement. matrix_type.cc 312
DEFUN(....)
{
....
if (str_typ == "upper")
mattyp.mark_as_permuted(len, p);
else
mattyp.mark_as_permuted(len, p);
....
}
I don't think I need to remind you that one should be very careful using the copy-paste technique when writing similarly looking constructs to avoid errors like this. I strongly doubt there should be two identical statements in the if-else block; a much more likely thing is that the "mark_as_unpermuted" method should be called when the else branch is triggered. Moreover, the programmer copied a part of code containing this buggy block to use in one more function later in the code, thus duplicating the error.
This second construct can be found in the following fragment:
V523 The 'then' statement is equivalent to the 'else' statement. matrix_type.cc 485
The next warning.
PVS-Studio's diagnostic message: V570 The 'i' variable is assigned to itself. sparse.cc 144
template <class T>
void Sparse<T>::SparseRep::maybe_compress(bool remove_zeros)
{
....
octave_idx_type i = 0;
for (octave_idx_type j = 1; j <= ncols; j++)
{
octave_idx_type u = c[j];
for (i = i; i < u; i++)
if (d[i] != T())
{
d[k] = d[i];
r[k++] = r[i];
}
c[j] = k;
}
....
}
An error like this is pretty hard to detect, especially when just quickly glancing through the code as these loops use a lot of single-letter variables. I intentionally didn't single out the code fragment with the error so that you could see for yourself how difficult it is to figure out anything in the code with variable names like those. Attentive readers have already spotted a strange variable assignment in the initializing part of the for (i = i; i < u; i++) loop. I can't say for sure if the programmer wanted to assign value "j" or 1 to the "i" variable as these characters look pretty much alike.
Interestingly, this construct is repeated 800 lines later with the "d" and "k" variables having different names and with a bit different conditions yet with the same error.
To sum it up, I'd like to say that I found Octave's code pretty high-quality. Most of the suspicious fragments detected by the analyzer deal with crossplatform solutions. I didn't mention in this article certain warnings such as using classes without an overloaded assignment operator, using global variables with short names, and so on. These are low-level warnings, which are not really errors and should only be treated as recommendations for developers. So, I'd like to compliment Octave's authors, for it has pretty few errors for a project of a size like that. It is perhaps due to its considerable age. However, PVS-Studio still has managed to find some interesting defects. So welcome to try it for free on your own project: http://www.viva64.com/en/pvs-studio/download/
0