Nowadays a lot of projects are opening their source code and letting those who are interested in the development of it edit the code. We'll check one such project - OpenJDK and help the developers improve the code.
OpenJDK (Open Java Development Kit) - a project for the creation and implementation of Java (Java SE) platform, which is now free and open source. The project was started in 2006, by the Sun company. The project uses multiple languages- C, C++, and Java. We are interested in the source code written in C and C++. Let's take the 9th version of OpenJDK. The code of this implementation of Java platform is available at the Mercurial repository.
The project was scanned with PVS-Studio static code analyzer. It has plenty of diagnostic rules, helping it to find a large number of errors in the code, and is also able to find those that are hard to detect during simple code reviews. Some of these errors do not affect the logic of the program, and some can lead to sad consequences during the program execution. There are various examples of errors on the analyzers website, which were found in other open-source projects. This tool is able to analyze projects that are written in C, C++, and C#. The trial version of the analyzer can be downloaded through this link.
First let's take a look at errors in logical expressions:
int StubAssembler::call_RT(....) {
#ifdef _LP64
// if there is any conflict use the stack
if (arg1 == c_rarg2 || arg1 == c_rarg3 ||
arg2 == c_rarg1 || arg1 == c_rarg3 ||
arg3 == c_rarg1 || arg1 == c_rarg2) {
....
}
PVS-Studio warning: V501 There are identical sub-expressions 'arg1 == c_rarg3' to the left and to the right of the '||' operator. c1_Runtime1_x86.cpp 174
The analyzer tells us about the duplication of arg1 == c_rarg3 check. There is a redundant check here, or worse, a logical error. Perhaps something else should be checked instead of the duplicated condition. This code is worth revising for sure.
There is one more recurring expression arg1 == c_rarg2: in the same condition.
PVS-Studio warning: V501 There are identical sub-expressions 'arg1 == c_rarg2' to the left and to the right of the '||' operator. c1_Runtime1_x86.cpp 174
These warnings are great proof of the usefulness of the analyzer. It's very simple to make a mistake in a large number of similar expressions; and they're hard to notice during the visual code review.
In the next fragment we have a "nonideal" check in the condition of the Ideal method:
Node *AddLNode::Ideal(PhaseGVN *phase, bool can_reshape) {
....
if( op2 == Op_AddL &&
in2->in(1) == in1 &&
op1 != Op_ConL &&
0 ) {
....
}
PVS-Studio warning: V560 A part of conditional expression is always false: 0. addnode.cpp 435
It's quite odd to use 0 in a logical expression. Most likely, this code is still in the process of development, and to debug it, this condition was not made executable. The necessary comments are missing in the code, and chances are, in the future will be forgotten. This bug can result in everything inside of this condition l never being executed, and as a result of the logical expression, evaluation is always false.
Quite often, programmers put too much faith in their knowledge of precedence, and don't enclose the component parts of the parentheses of a complex expression:
int method_size() const
{ return sizeof(Method)/wordSize + is_native() ? 2 : 0; }
PVS-Studio warning: V502 Perhaps the '?:' operator works in a different way than it was expected. The '?:' operator has a lower priority than the '+' operator. method.hpp 249
In this case I don't know the specifics of the code, but I have a suspicion, that it was intended to choose a value '2' or '0' depending on the result of the is_native() function call, but the expression has a different evaluation order. First there will be addition - sizeof(Method)/wordSize + is_native(), and then we'll have the result 0 or 2 returned, i.e. the code was probably meant to be like this:
{ return sizeof(Method)/wordSize + (is_native() ? 2 : 0); }
This is a very common error with operation precedence. In the error base of the analyzer we have found the most popular ones, and put them in an article: Logical Expressions in C/C++. Mistakes Made by Professionals.
The next group of errors is caused by copying the code. There is no way to work around this favorite method of programmers, so let's investigate the fragments where we have it:
static int
setImageHints(....)
{
....
if (dstCMP->isDefaultCompatCM) {
hintP->allocDefaultDst = FALSE;
hintP->cvtToDst = FALSE;
}
else if (dstCMP->isDefaultCompatCM) {
hintP->allocDefaultDst = FALSE;
hintP->cvtToDst = FALSE;
}
....
}
PVS-Studio warning: V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 1873, 1877. awt_ImagingLib.c 1873
In this fragment the conditions are the same in if and else if, as well as the code that should be executed. The second condition is completely pointless, as it will never be executed.
Another similar case:
static int expandPackedBCR(JNIEnv *env, RasterS_t *rasterP,
int component,
unsigned char *outDataP)
{
....
/* Convert the all bands */
if (rasterP->numBands < 4) {
/* Need to put in alpha */
for (y=0; y < rasterP->height; y++) {
inP = lineInP;
for (x=0; x < rasterP->width; x++) {
for (c=0; c < rasterP->numBands; c++) {
*outP++ = (unsigned char)
(((*inP&rasterP->sppsm.maskArray[c]) >> roff[c])
<<loff[c]);
}
inP++;
}
lineInP += rasterP->scanlineStride;
}
}
else {
for (y=0; y < rasterP->height; y++) {
inP = lineInP;
for (x=0; x < rasterP->width; x++) {
for (c=0; c < rasterP->numBands; c++) {
*outP++ = (unsigned char)
(((*inP&rasterP->sppsm.maskArray[c]) >> roff[c])
<<loff[c]);
}
inP++;
}
lineInP += rasterP->scanlineStride;
}
}
....
}
PVS-Studio warning: V523 The 'then' statement is equivalent to the 'else' statement. awt_ImagingLib.c 2927
The executable code is identical in both blocks, so there is no difference, in what is evaluated in the condition. It makes sense to look at this fragment, and remove the unnecessary branch, or correct the code if a different logic was intended to be here, to avoid duplication.
Two more fragments with identical duplication. I'll just show them here without citing the code:
And the last interesting case, caused by the copy paste error:
Node* GraphKit::record_profiled_receiver_for_speculation(Node* n)
{
....
ciKlass* exact_kls = profile_has_unique_klass();
bool maybe_null = true;
if (java_bc() == Bytecodes::_checkcast ||
java_bc() == Bytecodes::_instanceof ||
java_bc() == Bytecodes::_aastore) {
ciProfileData* data =
method()->method_data()->bci_to_data(bci());
bool maybe_null = data == NULL ? true : <==
data->as_BitData()->null_seen();
}
return record_profile_for_speculation(n,
exact_kls, maybe_null);
return n;
}
PVS-Studio warning: V561 It's probably better to assign value to 'maybe_null' variable than to declare it anew. Previous declaration: graphKit.cpp, line 2170. graphKit.cpp 2175
What is going on in this code? A variable bool maybe_null = true; is declared before the if block. Then, when the code in the if clock is executed, a variable with the same name is declared. After the block is exited, the value of this variable will be lost, and the function call, using this variable, will always be true. It's good, if the variable was duplicated for the sake of debugging. Otherwise, this code is executed incorrectly and requires modification:
maybe_null = data == NULL ? true :
data->as_BitData()->null_seen();
A programmer should be very careful, and especially attentive, when working with pointers; because during the pointer usage you may get errors that will be hard to detect later on. As a rule, the main danger is in using not valid pointers, or using pointers without verifying them against null.
Firstly, let's have a look at a case of an explicit use of a null pointer:
static jint JNICALL
cbObjectTagInstance(....)
{
ClassInstancesData *data;
/* Check data structure */
data = (ClassInstancesData*)user_data;
if (data == NULL) {
data->error = AGENT_ERROR_ILLEGAL_ARGUMENT;
return JVMTI_VISIT_ABORT;
}
....
}
PVS-Studio warning: V522 Dereferencing of the null pointer 'data' might take place. util.c 2424
Completely unclear code with a null pointer, can lead to a program crash. Perhaps this branch was never executed, that's why some problems were avoided. There were three more similar fragments in the same file:
But in the following cases the possibility of using a null pointer isn't so evident. This is a very common situation, such warnings are found in almost every project we check.
static jboolean
visibleClasses(PacketInputStream *in, PacketOutputStream *out)
{
....
else {
(void)outStream_writeInt(out, count);
for (i = 0; i < count; i++) {
jbyte tag;
jclass clazz;
clazz = classes[i]; <==
tag = referenceTypeTag(clazz);
(void)outStream_writeByte(out, tag);
(void)outStream_writeObjectRef(env, out, clazz);
}
}
if ( classes != NULL ) <==
jvmtiDeallocate(classes);
....
return JNI_TRUE;
}
PVS-Studio warning: V595 The 'classes' pointer was utilized before it was verified against nullptr. Check lines: 58, 66. ClassLoaderReferenceImpl.c 58
In the lower block the pointer is verified against null, so the programmer assumes that a pointer value may be null. But in the block above we see that the pointer is used without a check. Thus, if the pointer value is zero, this check won't help us, and the program will terminate. To fix this error, we should check the pointer that is above two blocks.
I'll give a similar example:
int InstructForm::needs_base_oop_edge(FormDict &globals) const {
if( is_simple_chain_rule(globals) ) {
const char *src = _matrule->_rChild->_opType;
OperandForm *src_op = globals[src]->is_operand();
assert( src_op, "Not operand class of chain rule" );
return src_op->_matrule ?
src_op->_matrule->needs_base_oop_edge() : 0;
} // Else check instruction
return _matrule ? _matrule->needs_base_oop_edge() : 0;
}
PVS-Studio warning: V595 The '_matrule' pointer was utilized before it was verified against nullptr. Check lines: 3534, 3540. formssel.cpp 3534
Here the pointer check is carried out below in the ternary operator - _matrule ? _matrule->needs_base_oop_edge() : 0;. Earlier in the code there is addressing the pointer - const char *src = _matrule->_rChild->_opType;. The recipe to correct it is the same: the pointer should be checked before it is used. There were quite a few of such spots, so I'll give them as a list here:
Sometimes programmers check the pointers, but do it in a wrong way.
FileBuff::FileBuff( BufferedFile *fptr, ArchDesc& archDesc) :
_fp(fptr), _AD(archDesc) {
....
_bigbuf = new char[_bufferSize];
if( !_bigbuf ) {
file_error(SEMERR, 0, "Buffer allocation failed\n");
exit(1);
....
}
PVS-Studio warning: V668 There is no sense in testing the '_bigbuf' pointer against null, as the memory was allocated using the 'new' operator. The exception will be generated in the case of memory allocation error. filebuff.cpp 47
In this case the check of the verification of _bigbuf pointer against null after using the new operator is pointless. In the case of system not being able to allocate the memory, an exception will be thrown, and the function execution will be interrupted. We can use several approaches to fix this issue. We could allocate the memory in the try catch block, or use new(std::nothrow) construction, which won't throw exceptions in case of a failure. There are several more incorrect checks.
The last error that is related to the pointer handling occurred during the explicit casting of one type of pointer to another.
mlib_status mlib_convMxNext_f32(...)
{
mlib_d64 dspace[1024], *dsa = dspace;
....
mlib_f32 *fsa;
....
if (3 * wid_e + m > 1024) {
dsa = mlib_malloc((3 * wid_e + m) * sizeof(mlib_d64));
if (dsa == NULL)
return MLIB_FAILURE;
}
fsa = (mlib_f32 *) dsa; <==
....
}
PVS-Studio warning: V615 An odd explicit conversion from 'double *' type to 'float *' type. mlib_ImageConvMxN_Fp.c 294
A programmer tries to assign a pointer to float mlib_f32 *fsa with a pointer mlib_d64 dspace[1024], *dsa = dspace. But the types float and double have different sizes, so this type cast is most likely erroneous. The mismatch of the casted types causes the fsa to point to a digit number that is incorrect for the float type.
There are two more similar castings in another file, it would be a good thing to check this code, and use correct type castings.
At this point, let's stop looking at errors related to incorrect pointer handling, and let's move on to the other analyzer warnings.
The following bug is probably the result of an improper copy-paste:
static bool
parse_bool (const char **pp, const char *end, unsigned int *pv)
{
....
/* CSS allows on/off as aliases 1/0. */
if (*pp - p == 2 || 0 == strncmp (p, "on", 2))
*pv = 1;
else if (*pp - p == 3 || 0 == strncmp (p, "off", 2))
*pv = 0;
else
return false;
return true;
}
PVS-Studio warning: V666 Consider inspecting third argument of the function 'strncmp'. It is possible that the value does not correspond with the length of a string which was passed with the second argument. hb-shape.cc 104
Here is a case where a bug doesn't affect the working of the program. Instead of comparing three symbols, only the first two symbols get compared, but I do not conclude that the author of the code did not do this check deliberately. As the value in the p buffer can be on or off, then it's enough to compare first two symbols. But to make it clearer, we can correct the code:
else if (*pp - p == 3 || 0 == strncmp (p, "off", 3))
There were several more places
class ProductionState {
....
private:
// Disable public use of constructor, copy-ctor, ...
ProductionState( ) :
_production(cmpstr, hashstr, Form::arena)
{ assert( false, "NotImplemented"); };
ProductionState( const ProductionState & ) :
_production(cmpstr, hashstr, Form::arena)
{ assert( false, "NotImplemented"); }; // Deep-copy
};
PVS-Studio warning: V690 Copy constructor is declared as private in the 'ProductionState' class, but the default '=' operator will still be generated by compiler. It is dangerous to use such a class. dfa.cpp 76
In this class the programmer made an attempt to prohibit copying the code, but forgot to add a copy assignment operator to a private area. It will be generated by default, and will be available for use. Even if this operator isn't used anywhere in the code, there is a guarantee that it won't be called accidentally in the future. During the call of such an operator we'll have memberwise copying for a class that should not be copied. This can cause various effects, even a program crash. In this case we should add the declaration of the "=" operator to the private area.
There are two more classes where we see the same problems; it would be great to fix them in such a way so that the "The Law of The Big Two" isn't violated.
The latter looks like a simple typo.
bool os::start_debugging(char *buf, int buflen) {
int len = (int)strlen(buf);
char *p = &buf[len];
....
if (yes) {
// yes, user asked VM to launch debugger
jio_snprintf(buf, sizeof(buf), "gdb /proc/%d/exe %d",
os::current_process_id(), os::current_process_id());
os::fork_and_exec(buf);
yes = false;
}
return yes;
}
PVS-Studio warning: V579 The jio_snprintf function receives the pointer and its size as arguments. It is possibly a mistake. Inspect the second argument. os_linux.cpp 6094
A programmer wanted to pass a buffer size, but didn't take into account that it is not a locally declared array, but a pointer that comes in the function argument. In the result of evaluation of sizeof(buf) we'll get not the buffer size, but the pointer size, which will be either 4 or 8 bytes. This bug can be easily fixed, as the buffer length was already received earlier in the code: int len = (int)strlen(buf);. The correct variant will be as follows:
jio_snprintf(buf, len ....
It's always amusing to check a project which is used and maintained by a large number of people. We found a considerable number of errors; in this article we have described only a part of them, the rest require more thorough investigation. Those bugs we found, are further proof of the usefulness of an analyzer, as it allows the detection of such errors as would otherwise be hard to detect during simple code review. The most effective way is to use an analyzer on a regular basis, as it will save a lot of time that could be spent debugging the program. And again, I remind you, that you can try the analyzer on your project by downloading the trial version.
0