Andrey Karpov

Aug 31 2016

Tags:

#Cpp #Compiler

Bugs found in GCC with the help of PVS-Studio

Aug 31 2016

Author: Andrey Karpov

GCC
Foreseeing a discussion
PVS-Studio
Analysis results
Conclusion

I regularly check various open-source projects to demonstrate the abilities of the PVS-Studio static code analyzer (C, C++, C#). Now it is time for the GCC compiler to get checked. Unquestionably, GCC is a very qualitative and well-tested project, that's why it's already a great achievement for a tool to find any errors in it. Fortunately, PVS-Studio coped with this task. No one is immune to typos or carelessness. This is why the PVS-Studio can become an additional line of defense for you, on the front of the endless war against bugs.

GCC

GNU Compiler Collection (usually shortened to GCC) - is a set of compilers for different programming languages developed in the scope of the GNU project. GCC is free software, distributed by the free software foundation under the terms of the GNU GPL and GNU LGPL and is a key component of the GNU toolchain. The project is written in C and C++.

The GCC compiler has great built-in diagnostics, which help in the detection of many issues at the compilation stage. Of course, GCC is built with GCC and, thus, is able to find errors in its own code. Additionally, the GCC source code is checked by the Coverity analyzer. In general, I think that a lot of enthusiasts have also checked it with other analyzers and other tools. This makes it a hard task for PVS-Studio to find errors in GCC.

We used the trunk version from the git-repository:

git://gcc.gnu.org/git/gcc.git

(git) commit:

00a7fcca6a4657b6cf203824beda1e89f751354b svn+ssh://gcc.gnu.org/svn/gcc/trunk@238976

Note. The article publication is slightly late, and perhaps some bugs are already fixed. However, it's not a big deal: new errors constantly get in the code and the old ones disappear. The main thing is that the article shows that static analysis can help programmers to detect errors after they get into the code.

Foreseeing a discussion

As I said in the beginning, I consider GCC project to be of high quality. I am sure a lot of people will want to argue with that. As an example, I'll give a quote from Wikipedia in Russian (translated):

Some OpenBSD developers, Theo de Raadt and Otto Moerbeek criticiz GCC, saying that "gcc gets about 5-6% slower every release, has new bugs, generates crappy code, and drives us nuts".

To my mind, these statements are unjustified. Yes, perhaps, GCC code has too many macros which make reading it a bit difficult. But I can't agree with the statement about it being buggy. If GCC were buggy, nothing would work at all. Just think about the amount of programs that get successfully compiled by it and thus, work well. The creators of GCC do a great, complicated job, with professionalism. We should really thank them. I'm glad I can test the work of PVS-Studio on such a high-quality project.

For those who say that the code of Clang is still much better, I can remind you: PVS-Studio also found bugs in it: 1, 2.

PVS-Studio

I have checked the GCC code with the help of the alpha-version of PVS-Studio for Linux. We are planning to give the beta-version of the analyzer in the middle of September 2016 to those programmers who will find it useful. You may find the instruction of how to become the first person to try the Beta-version of PVS-Studio for Linux on your project in the article "PVS-Studio confesses its love for Linux"

If you are reading this article much later than September 2016, and want to try PVS-Studio for Linux, then I suggest visiting the product page.

Analysis results

We have got to the most interesting part of the article, which our regular readers are looking forward to. Let's have a look at those code fragments where the analyzer found bugs or really suspicious code.

Unfortunately, I can't give the developers the full analysis report. There is too much garbage (false alarms) at this point, because of the fact that the analyzer isn't ready to face the Linux world yet. We still have a lot of work to do regarding the reduction of the number of false positives for typical constructions. I'll try to explain using a simple example. Many diagnostics should not issue warnings for the expressions related to the assert macros. These macros are sometimes written very creatively, so we should teach the analyzer not to pay attention to them. The thing is that the assert macro can be defined in many various ways, thus we should teach PVS-Studio all the typical variants.

That's why I am asking GCC developers to wait until the Beta-version is released. I wouldn't like to spoil the impression by a report, generated by a half-finished version.

Classics (Copy-Paste)

We'll start with the most common and classical error that is detected by V501 diagnostic. Typically, these errors appear because of carelessness when Copy-Pasting the code, or are just typos during the creation of new code.

static bool
dw_val_equal_p (dw_val_node *a, dw_val_node *b)
{
  ....
  case dw_val_class_vms_delta:
    return (!strcmp (a->v.val_vms_delta.lbl1,
                     b->v.val_vms_delta.lbl1)
            && !strcmp (a->v.val_vms_delta.lbl1,
                        b->v.val_vms_delta.lbl1));
  ....
}

PVS-Studio warning: V501 There are identical sub-expressions '!strcmp(a->v.val_vms_delta.lbl1, b->v.val_vms_delta.lbl1)' to the left and to the right of the '&&' operator. dwarf2out.c 1428

It's hard to see the errors right away, we should take a closer look here. This is why the error was not detected during code reviews and refactoring.

The function strcmp compares the same strings twice. It seems to me that we should have compared not the members of lbl1 c.lass, but of lbl2. Then the correct code could look like this:

return (!strcmp (a->v.val_vms_delta.lbl1,
                 b->v.val_vms_delta.lbl1)
        && !strcmp (a->v.val_vms_delta.lbl2,
                    b->v.val_vms_delta.lbl2));

It should be noted that the code, provided in the article, is slightly aligned, so that it doesn't take too much space on the x-axis. In fact, the code looks like this:

This error could be avoided by using "table" code alignment. For example, an error would be easier to notice if you format the code like this:

I have spoken about this approach in details in the e-book "The Ultimate Question of Programming, Refactoring, and Everything" (see chapter N13: Table-style formatting"). I recommend that everyone who cares about code quality have a look at this book.

Let's look at one more mistake, which I'm sure, appeared because of Copy-Paste:

const char *host_detect_local_cpu (int argc, const char **argv)
{
  unsigned int has_avx512vl = 0;
  unsigned int has_avx512ifma = 0;
  ....
  has_avx512dq = ebx & bit_AVX512DQ;
  has_avx512bw = ebx & bit_AVX512BW;
  has_avx512vl = ebx & bit_AVX512VL;       // <=
  has_avx512vl = ebx & bit_AVX512IFMA;     // <=
  ....
}

PVS-Studio warning: V519 The 'has_avx512vl' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 500, 501. driver-i386.c 501

Different values are written to the variable has_avx512vl twice in a row. It does not make sense. I have reviewed the code and found a variable has_avx512ifma. Most likely, it should be initialized by the expression ebx & bit_AVX512IFMA. Then the correct code should be as follows:

has_avx512vl   = ebx & bit_AVX512VL;    
has_avx512ifma = ebx & bit_AVX512IFMA;

A typo

I'll continue testing your attentiveness. Look at the code and try to find the error, without looking at the warning below.

static bool
ubsan_use_new_style_p (location_t loc)
{
  if (loc == UNKNOWN_LOCATION)
    return false;

  expanded_location xloc = expand_location (loc);
  if (xloc.file == NULL || strncmp (xloc.file, "\1", 2) == 0
      || xloc.file == '\0' || xloc.file[0] == '\xff'
      || xloc.file[1] == '\xff')
    return false;

  return true;
}

PVS-Studio warning: V528 It is odd that pointer to 'char' type is compared with the '\0' value. Probably meant: *xloc.file == '\0'. ubsan.c 1472

The programmer accidentally forgot to dereference the pointer in the expression xloc.file == '\0'. As a result the pointer is just compared with 0, i.e. with NULL. It doesn't have any effect, because such a check was already done earlier: xloc.file == NULL.

The good thing is that the programmer wrote the terminal null as '\0'. This helps us to understand quicker, that there is a bug here and how it should be fixed. I have also written about this in the book (see chapter N9: Use the '\0' literal for the terminal null character).

Correct variant of the code:

if (xloc.file == NULL || strncmp (xloc.file, "\1", 2) == 0
    || xloc.file[0] == '\0' || xloc.file[0] == '\xff'
    || xloc.file[1] == '\xff')
  return false;

Although, let's improve the code even more. I recommend formatting the expression like this:

if (   xloc.file == NULL
    || strncmp (xloc.file, "\1", 2) == 0
    || xloc.file[0] == '\0'
    || xloc.file[0] == '\xff'
    || xloc.file[1] == '\xff')
  return false;

Pay attention: now if you make the same mistake, the chance of noticing it is slightly higher:

if (   xloc.file == NULL
    || strncmp (xloc.file, "\1", 2) == 0
    || xloc.file == '\0'
    || xloc.file[0] == '\xff'
    || xloc.file[1] == '\xff')
  return false;

Potential null pointer dereference

This part could also be called "Example number one thousand, why macros are bad". I really don't like macros and always urge people to avoid using them if possible. Macros make it difficult to read the code, provoke errors, and make the work of static analyzers harder. As best I can tell, from a brief interaction with the GCC code, the authors are big fans of macros. I was really tired looking at what the macros are expanded to, and perhaps missed quite a number of interesting errors. I should confess that I was lazy at times. But still, I will demonstrate a couple of errors, connected with macros.

odr_type
get_odr_type (tree type, bool insert)
{
  ....
  odr_types[val->id] = 0;
  gcc_assert (val->derived_types.length() == 0);
  if (odr_types_ptr)
    val->id = odr_types.length ();
  ....
}

PVS-Studio warning: V595 The 'odr_types_ptr' pointer was utilized before it was verified against nullptr. Check lines: 2135, 2139. ipa-devirt.c 2135

Do you see an error here? I guess not, and the analyzer warning isn't of much help. The matter of the fact is that odr_types isn't a name of a variable, but a macro that is declared in the following way:

#define odr_types (*odr_types_ptr)

If we expand the macro and remove everything that isn't really related to the code, we'll get the following:

(*odr_types_ptr)[val->id] = 0;
if (odr_types_ptr)

First, the pointer is dereferenced and then checked. It's hard to say whether this will lead to trouble or not. It all depends on the situation, whether the pointer will really be equal to nullptr. If this situation is impossible, then this redundant check should be removed, otherwise it will mislead people supporting the code, and the code analyzer as well. If a pointer can be null, then it is a serious mistake that requires even more attention and should be fixed.

Let's consider one more similar case:

static inline bool
sd_iterator_cond (sd_iterator_def *it_ptr, dep_t *dep_ptr)
{
  ....
  it_ptr->linkp = &DEPS_LIST_FIRST (list);
  if (list)
    continue;
  ....
}

PVS-Studio warning: V595 The 'list' pointer was utilized before it was verified against nullptr. Check lines: 1627, 1629. sched-int.h 1627

We should show the macro again to see the error:

#define DEPS_LIST_FIRST(L) ((L)->first)

Let's expand the macro and get:

it_ptr->linkp = &((list)->first);
if (list)
  continue;

Some of you may say: "Hey, wait! There is no error here. We just get a pointer to the class member. There is no null pointer dereference. Yes, perhaps the code isn't really accurate, but there is no error!"

Yet, it's not as simple as it may seem. We have undefined behavior here. It's just pure luck that such code works. Actually, we cannot write like this. For example, the optimizing compiler can remove the check if (list), if it sees list->first. If we execute the -> operator, then it is assumed that the pointer isn't equal to nullptr. If it so, then we shouldn't check the pointer.

I have written a whole article on this topic: "Null Pointer Dereferencing Causes Undefined Behavior" I am examining a similar case there. Before starting any arguments, please carefully read this article.

However, this situation is really complicated, and is not really obvious. I can assume that I may be wrong and there is no error here. However, until now no one could prove it to me. It will be interesting to see the comments of GCC developers, if they read this article. They should know how the compiler works, and if this code should be interpreted as erroneous or not.

Using a destroyed array

static void
dump_hsa_symbol (FILE *f, hsa_symbol *symbol)
{
  const char *name;
  if (symbol->m_name)
    name = symbol->m_name;
  else
  {
    char buf[64];
    sprintf (buf, "__%s_%i", hsa_seg_name (symbol->m_segment),
       symbol->m_name_number);
     name = buf;
  }
  fprintf (f, "align(%u) %s_%s %s",
           hsa_byte_alignment (symbol->m_align),
           hsa_seg_name(symbol->m_segment),
           hsa_type_name(symbol->m_type & ~BRIG_TYPE_ARRAY_MASK),
           name);
  ....
}

PVS-Studio warning: V507 Pointer to local array 'buf' is stored outside the scope of this array. Such a pointer will become invalid. hsa-dump.c 704

The string is formed in the temporary buffer buf. The address of this temporary buffer is stored in the variable name, and is used further on in the body of the function. The error is that after the buffer is written to the variable name, the buffer itself will be destroyed.

We cannot use a pointer to a destroyed buffer. Formally, we are dealing with undefined behavior. In practice, this code may work quite successfully. Correct working of the program is one of the ways in which undefined behavior shows itself.

In any case, this code has an error and it must be fixed. The code can work due to the fact that the compiler may think it is unnecessary to use a temporary buffer for storing other variables and arrays further on. Then, although the array, created on the stack is considered to be destroyed, it will not be used, and the function will work correctly. But this luck can end any time, and the code that used to work for 10 years may suddenly start acting weirdly when upgrading to a new version of the compiler.

To fix this error, we should declare the buf array in the same scope as the name pointer:

static void
dump_hsa_symbol (FILE *f, hsa_symbol *symbol)
{
  const char *name;
  char buf[64];
  ....
}

Execution of similar actions regardless of the condition

The analyzer detected a code fragment that I cannot call erroneous with 100% certainty. However, it is really suspicious to do the check and then, regardless of the result, perform the same actions. Of course, it may work correctly, but this code fragment is worth revising for sure.

bool
thread_through_all_blocks (bool may_peel_loop_headers)
{
  ....
  /* Case 1, threading from outside to inside the loop
     after we'd already threaded through the header.  */
  if ((*path)[0]->e->dest->loop_father
      != path->last ()->e->src->loop_father)
  {
    delete_jump_thread_path (path);
    e->aux = NULL;
    ei_next (&ei);
  }
  else
  {
    delete_jump_thread_path (path);
    e->aux = NULL;
    ei_next (&ei);
  }
  ....
}

PVS-Studio warning: V523 The 'then' statement is equivalent to the 'else' statement. tree-ssa-threadupdate.c 2596

If this code has a bug, it's hard to say how to fix it. This is a case where it is necessary to be familiar with the project, in order to fix it.

Redundant expression of the (A == 1 || A != 2) kind

static const char *
alter_output_for_subst_insn (rtx insn, int alt)
{
  const char *insn_out, *sp ;
  char *old_out, *new_out, *cp;
  int i, j, new_len;

  insn_out = XTMPL (insn, 3);

  if (alt < 2 || *insn_out == '*' || *insn_out != '@')
    return insn_out;
  ....
}

PVS-Studio warning: V590 Consider inspecting this expression. The expression is excessive or contains a misprint. gensupport.c 1640

We are interested in the condition: (alt < 2 || *insn_out == '*' || *insn_out != '@').

It can be shortened to: (alt < 2 || *insn_out != '@').

I would venture to guess that the operator != should be replaced with ==. Then the code will make more sense:

if (alt < 2 || *insn_out == '*' || *insn_out == '@')

Zeroing a wrong pointer

Let us consider a function that frees the resources:

void
free_original_copy_tables (void)
{
  gcc_assert (original_copy_bb_pool);
  delete bb_copy;
  bb_copy = NULL;
  delete bb_original;
  bb_copy = NULL;
  delete loop_copy;
  loop_copy = NULL;
  delete original_copy_bb_pool;
  original_copy_bb_pool = NULL;
}

PVS-Studio warning: V519 The 'bb_copy' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 1076, 1078. cfg.c 1078

Take a look at these 4 lines of code:

delete bb_copy;
bb_copy = NULL;
delete bb_original;
bb_copy = NULL;

Accidentally, the pointer bb_copy is zeroed twice. Here is the correct version:

delete bb_copy;
bb_copy = NULL;
delete bb_original;
bb_original = NULL;

Assert that does not check anything

Invalid condition, being an argument of the macro gcc_assert, won't affect how correctly the program works, but will make the bug search more complicated, if there is such. Let us consider the code:

static void
output_loc_operands (dw_loc_descr_ref loc, int for_eh_or_skip)
{
  unsigned long die_offset
    = get_ref_die_offset (val1->v.val_die_ref.die);
  ....
  gcc_assert (die_offset > 0
        && die_offset <= (loc->dw_loc_opc == DW_OP_call2)
             ? 0xffff
             : 0xffffffff);
  ....
}

PVS-Studio warning: V502 Perhaps the '?:' operator works in a different way than it was expected. The '?:' operator has a lower priority than the '<=' operator. dwarf2out.c 2053

The priority of the ternary operator ?: is lower than of the <= operator and && operator. This means that we are dealing with a condition like this:

(die_offset > 0 &&
  die_offset <= (loc->dw_loc_opc == DW_OP_call2)) ?
    0xffff : 0xffffffff;

Both of values (0xffff and 0xffffffff) are true, so this expression can be simplified to:

(true)

This is clearly not what the programmer intended to get. To fix this, you should add a pair of parentheses:

gcc_assert (die_offset > 0
      && die_offset <= ((loc->dw_loc_opc == DW_OP_call2)
           ? 0xffff
           : 0xffffffff));

The ?: operator is very treacherous, and it's better not to use it in complex expressions. It is very easy to make a mistake. We have collected a large number of examples of such errors, which were found by PVS-Studio in various open source projects. I have also written in details about the ?: operator in the book that I've mentioned earlier (see chapter N4: Beware of the ?: operator and enclose it in parentheses).

Forgotten "cost"

The structure alg_hash_entry is declared in the following way:

struct alg_hash_entry {
  unsigned HOST_WIDE_INT t;
  machine_mode mode;
  enum alg_code alg;
  struct mult_cost cost;
  bool speed;
};

The programmer decided to check if in the synth_mult function there is an object that is needed. To do this he needed to compare the structure fields. However, it seems like there is an error here:

static void synth_mult (....)
{
  ....
  struct alg_hash_entry *entry_ptr;
  ....
  if (entry_ptr->t == t
      && entry_ptr->mode == mode
      && entry_ptr->mode == mode
      && entry_ptr->speed == speed
      && entry_ptr->alg != alg_unknown)
  {
  ....
}

PVS-Studio warning: V501 There are identical sub-expressions 'entry_ptr->mode == mode' to the left and to the right of the '&&' operator. expmed.c 2573

Mode is checked twice, but cost isn't checked in any way. Perhaps, one of these comparisons should be removed, but there is a chance, that we should compare cost. It's difficult for me to say, but the code should be fixed.

Duplicated assignments

To my mind, the following code fragments don't pose any hazard to the life of the program, and it seems that the duplicated assignment can just be removed.

Fragment N1

type_p
find_structure (const char *name, enum typekind kind)
{
  ....
  structures = s;                   // <=
  s->kind = kind;
  s->u.s.tag = name;
  structures = s;                   // <=
  return s;
}

PVS-Studio warning: V519 The 'structures' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 842, 845. gengtype.c 845

Fragment N2

static rtx
ix86_expand_sse_pcmpistr (....)
{
  unsigned int i, nargs;
  ....
    case V8DI_FTYPE_V8DI_V8DI_V8DI_INT_UQI:
    case V16SI_FTYPE_V16SI_V16SI_V16SI_INT_UHI:
    case V2DF_FTYPE_V2DF_V2DF_V2DI_INT_UQI:
    case V4SF_FTYPE_V4SF_V4SF_V4SI_INT_UQI:
    case V8SF_FTYPE_V8SF_V8SF_V8SI_INT_UQI:
    case V8SI_FTYPE_V8SI_V8SI_V8SI_INT_UQI:
    case V4DF_FTYPE_V4DF_V4DF_V4DI_INT_UQI:
    case V4DI_FTYPE_V4DI_V4DI_V4DI_INT_UQI:
    case V4SI_FTYPE_V4SI_V4SI_V4SI_INT_UQI:
    case V2DI_FTYPE_V2DI_V2DI_V2DI_INT_UQI:
      nargs = 5;         // <=
      nargs = 5;         // <=
      mask_pos = 1;
      nargs_constant = 1;
      break;
  ....
}

PVS-Studio warning: V519 The 'nargs' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 39951, 39952. i386.c 39952

Fragment N3

The latter fragment seems to be stranger than the other ones. Perhaps there is some mistake here. The variable steptype is assigned with a value 2 or 3 times. It is very suspicious.

static void
cand_value_at (....)
{
  aff_tree step, delta, nit;
  struct iv *iv = cand->iv;
  tree type = TREE_TYPE (iv->base);
  tree steptype = type;                 // <=
  if (POINTER_TYPE_P (type))
    steptype = sizetype;                // <=
  steptype = unsigned_type_for (type);  // <=
  ....
}

PVS-Studio warning: V519 The 'steptype' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 5173, 5174. tree-ssa-loop-ivopts.c 5174

Conclusion

I'm glad I could write this article. Now I will have something to say in response to the comments like "PVS-Studio isn't necessary, because GCC issues the same warnings". As you can see, PVS-Studio is a very powerful tool and excels GCC in the diagnostic capabilities. I do not deny that the GCC has excellent diagnostics. This compiler, if properly set up, really brings out a lot of problems in the code. But PVS-Studio is a specialized and rapidly developing tool, which means that it will also be better at detecting errors in the code than the compilers.

I suggest taking a look at the analysis of other open-source projects, and visiting this section of our web-site. Also, those who use Twitter can follow me @Code_Analysis. I regularly post links to interesting articles about programming on C and C++, and also talk about the achievements of our analyzer.

#Cpp #Compiler