>
>
>
A Boring Article About a Check of the O…

Andrey Karpov
Articles: 674

A Boring Article About a Check of the OpenSSL Project

Some time ago, a vulnerability was revealed in OpenSSL, and I guess there's no programmer who hasn't been talking about it since then. I knew that PVS-Studio could not catch the bug leading to this particular vulnerability, so I saw no reason for writing about OpenSSL. Besides, quite a lot of articles have been published on the subject recently. However, I received a pile of e-mails, people wanting to know if PVS-Studio could detect that bug. So I had to give in and write this article.

Checking OpenSSL

I guess everyone knows by now about a serious vulnerability found in OpenSSL. But in case you have missed the news for some reason and want to find out more on the subject, see the following links:

To put it short, the vulnerability that could be exploited to access users' private data has existed for about 2 years. During all this time, it has been staying unnoticed by all code analyzers, though everyone probably tested the library more than once.

So did we. Here's a post about that check: "A few words about OpenSSL". We found a few bugs in the code, but none were too serious. The authors fixed them after that, so our check had not been in vain.

I haven't investigated if the Heartbleed bug was already there when we were checking OpenSSL. But anyway, I know for sure that PVS-Studio cannot detect such bugs. It's that they are just hard to detect in themselves. The OpenSSL project has been analyzed with many different tools, and none of them ever noticed the bug. For example, Coverity Scan, a leader among code analyzers, failed too. Here's a posts about that: "Heartbleed and Static Analysis", "Heartbleed and static analysis (2)".

The reason is that bugs of this kind are very difficult to diagnose with the means of static analysis: the code is too complicated, and the analyzer needs to take into account the values stored in memory, figure out what is hidden behind explicit type conversions, and so on. Even a human can't easily figure out what the error is about; and static analyzers give up immediately. It's not a flaw of the static analysis methodology though - it's just that the error is really complicated. There's probably no tool that could catch such a bug without preliminary training.

Note that there are also static analysis tools, both popular and unknown, designed specifically for detecting vulnerabilities. Perhaps they could detect the Heartbleed, but I strongly doubt it. If they had, the authors would have made use of the fact for advertisement. Of course, there is also a version that such a tool does exist, developed by some intelligence services who will never tell us anything. But it looks too much like a conspiracy theory, and I think we'd better not go on with it.

My personal opinion is, it is just an error, not a tab (backdoor). Static analysis tools cannot detect it because it is very complicated. That's it.

I could have finished with the article here, but you would have found it too boring then. So I decided to check OpenSSL with PVS-Studio once again. I haven't found anything of interest, but let's have a look at what we've got, anyway.

Why are there so few bugs? Because OpenSSL is a high-quality project. A serious vulnerability caught in it doesn't mean the code is awful. I suspect many projects have much more serious security holes but they are not of much importance to anyone. Besides, the OpenSSL project is regularly checked by various tools.

Analysis results

I repeat it once again: I haven't found any serious bugs. So you'd better treat the text below as comments on untidy code rather than error descriptions. I just don't want you to leave comments blaming me for making a big deal of trifles.

Suspicious comparison

typedef struct ok_struct
{
  ....
  size_t buf_len_save;
  size_t buf_off_save;
  ....
} BIO_OK_CTX;

static int ok_read(BIO *b, char *out, int outl)
{ 
  .... 
  BIO_OK_CTX *ctx;
  ....
  /* copy start of the next block into proper place */
  if(ctx->buf_len_save - ctx->buf_off_save > 0)
  ....
}

PVS-Studio's diagnostic message: V555 The expression of the 'A - B > 0' kind will work as 'A != B'. bio_ok.c 243

The (ctx->buf_len_save - ctx->buf_off_save > 0) expression works in a different way than it seems at first.

It looks like the programmer wanting to check the (ctx->buf_len_save > ctx->buf_off_save) condition here. It's not so. You see, the variables being compared are unsigned. Subtracting an unsigned variable from another unsigned variable gives an unsigned value.

The (ctx->buf_len_save - ctx->buf_off_save > 0) condition will be true whenever the variables are not equal. In other words, the following two expressions are equivalent:

  • (ctx->buf_len_save - ctx->buf_off_save > 0)
  • (ctx->buf_len_save != ctx->buf_off_save)

A note for those not well familiar with the C language. Experienced developers may skip the text blow.

Assume we have two 32-bit unsigned variables:

unsigned A = 10;

unsigned B = 20;

Let's check if the (A - B > 0) condition will be true.

The subtraction (A - B) evaluates to 10u - 20u = 0xFFFFFFF6u = 4294967286u.

Now we compare the unsigned number 4294967286u to zero. Zero is cast to the unsigned type too, but it doesn't matter.

The (4294967286u > 0u) expression evaluates to true.

That is, the (A - B > 0) condition will be false in one case only - when A == B.

Is it an error? I can't say for sure as I'm not familiar with the project design, but I think it's not.

It's more likely that we are dealing with the following logic. The 'buf_len_save' variable is usually larger than the 'buf_off_save' variable, and only in rare cases may they be equal. And it is for these rare cases that the check was implemented. The case when (buf_len_save < buf_off_save) is probably impossible.

A harmless uninitialized variable

There is a fragment in the code where an uninitialized variable may be used. It won't lead to any bad consequences though. Here is this code:

int PEM_do_header(....)
{
  int i,j,o,klen;
  ....
  if (o)
    o = EVP_DecryptUpdate(&ctx,data,&i,data,j);
  if (o)
    o = EVP_DecryptFinal_ex(&ctx,&(data[i]),&j);
  ....
  j+=i;
  if (!o)
  {
    PEMerr(PEM_F_PEM_DO_HEADER,PEM_R_BAD_DECRYPT);
    return(0);
  }
  ....  
}

PVS-Studio's diagnostic message: V614 Potentially uninitialized variable 'i' used. pem_lib.c 480

The 'i' variable may appear uninitialized if (o == false). It will result in adding god knows what to 'j'. But there is nothing to worry about because when (o == false), an error handler is called and the function terminates.

The code is correct but untidy. It's better to check the 'o' variable first and only then use 'i':

if (!o)
{
  PEMerr(PEM_F_PEM_DO_HEADER,PEM_R_BAD_DECRYPT);
  return(0);
}
j+=i;

Strange assignments

#define SSL_TLSEXT_ERR_ALERT_FATAL 2
int ssl3_accept(SSL *s)
{
  ....
  if (ret != SSL_ERROR_NONE)
  {
    ssl3_send_alert(s,SSL3_AL_FATAL,al);  
    if (al != TLS1_AD_UNKNOWN_PSK_IDENTITY)   
      SSLerr(SSL_F_SSL3_ACCEPT,SSL_R_CLIENTHELLO_TLSEXT);      
    ret = SSL_TLSEXT_ERR_ALERT_FATAL;      
    ret= -1;
    goto end;  
  }
  ....
}

PVS-Studio's diagnostic message: V519 The 'ret' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 376, 377. s3_srvr.c 377

At first, the 'ret' variable is assigned value 2, then value -1. I suspect the first assignment is unnecessary and was left in the code by mistake.

Another case:

int
dtls1_retransmit_message(....)
{
  ....
  /* save current state */
  saved_state.enc_write_ctx = s->enc_write_ctx;
  saved_state.write_hash = s->write_hash;
  saved_state.compress = s->compress;
  saved_state.session = s->session;
  saved_state.epoch = s->d1->w_epoch;
  saved_state.epoch = s->d1->w_epoch;
  ....
}

PVS-Studio's diagnostic message: V519 The 'saved_state.epoch' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 1277, 1278. d1_both.c 1278

Potential null pointer dereferencing

Dereferencing a null pointer before checking it is the most common mistake in programs (judging by my experience). It's not always an error as there are many cases when the pointer just can't be null. However, such code is potentially dangerous, especially if the project is rapidly changing.

OpenSSL also has such mistakes:

int SSL_shutdown(SSL *s)
{
  if (s->handshake_func == 0)
  {
    SSLerr(SSL_F_SSL_SHUTDOWN, SSL_R_UNINITIALIZED);
    return -1;
  }

  if ((s != NULL) && !SSL_in_init(s))
    return(s->method->ssl_shutdown(s));
  else
    return(1);
  }
  ....
}

PVS-Studio's diagnostic message: V595 The 's' pointer was utilized before it was verified against nullptr. Check lines: 1013, 1019. ssl_lib.c 1013

The 's' pointer is first used: (s->handshake_func == 0),

and only then checked: (s != NULL).

Here's another, more complicated, case:

#define bn_wexpand(a,words) \
  (((words) <= (a)->dmax)?(a):bn_expand2((a),(words)))

static int ubsec_dh_generate_key(DH *dh)
{
  ....
  if(bn_wexpand(pub_key, dh->p->top) == NULL) goto err;
  if(pub_key == NULL) goto err;
  ....
}

PVS-Studio's diagnostic message: V595 The 'pub_key' pointer was utilized before it was verified against nullptr. Check lines: 951, 952. e_ubsec.c 951

To notice the error, we need to expand the macros. After that, we'll get the following code:

if((((dh->p->top) <= (pub_key)->dmax)?
    (pub_key):bn_expand2((pub_key),
    (dh->p->top))) == ((void *)0)) goto err;
if(pub_key == ((void *)0)) goto err;

Notice the pointer 'pub_key'.

It is first dereferenced: (pub_key)->dmax.

Then it is checked for being null: (pub_key == ((void *)0)).

Unnecessary checks

There are several code fragments where a variable is compared twice to one and the same value. I don't think them to be errors; it's just that the second check was written by mistake and can be removed.

Unnecessary check No.1

int ASN1_PRINTABLE_type(const unsigned char *s, int len)
{
  ....
  if (!(  ((c >= 'a') && (c <= 'z')) ||
      ((c >= 'A') && (c <= 'Z')) ||
      (c == ' ') ||                       <<<<====
      ((c >= '0') && (c <= '9')) ||
      (c == ' ') || (c == '\'') ||        <<<<====
      (c == '(') || (c == ')') ||
      (c == '+') || (c == ',') ||
      (c == '-') || (c == '.') ||
      (c == '/') || (c == ':') ||
      (c == '=') || (c == '?')))
      ia5=1;
  ....
}

PVS-Studio's diagnostic message: V501 There are identical sub-expressions '(c == ' ')' to the left and to the right of the '||' operator. a_print.c 76

I marked the identical checks with "<<<<====". I mentioned this duplicated check in the previous article, but it is still unfixed, which means it's surely not a defect.

Unnecessary checks No.2, No.3

int ssl3_read_bytes(SSL *s, int type,
  unsigned char *buf, int len, int peek)
{
  ....
  if ((type && (type != SSL3_RT_APPLICATION_DATA) &&
       (type != SSL3_RT_HANDSHAKE) && type) ||
      (peek && (type != SSL3_RT_APPLICATION_DATA)))
  ....
}

PVS-Studio's diagnostic message: V501 There are identical sub-expressions 'type' to the left and to the right of the '&&' operator. s3_pkt.c 952

The 'type' variable is checked twice for being non-null.

This code fragment was copied into another file, so that file also includes this comparison: d1_pkt.c 760.

Incorrect string lengths

It's not a good idea to use magic constants to specify string lengths because you may easily make a mistake. The PVS-Studio analyzer has found three fragments of this kind in OpenSSL.

The first unfortunate magic number

To prove that this is an error, let's examine a few examples of the BIO_write function's calls:

  • BIO_write(bp,"Error in encoding\n",18)
  • BIO_write(bp,"\n",1)
  • BIO_write(bp,":",1)
  • BIO_write(bp,":BAD OBJECT",11)
  • BIO_write(bp,"Bad boolean\n",12)

As you can see from these examples, the last number specifies the string length.

And here's an incorrect code now:

static int asn1_parse2(....)
{
  ....
  if (BIO_write(bp,"BAD ENUMERATED",11) <= 0)
    goto end;
  ....
}

PVS-Studio's diagnostic message: V666 Consider inspecting third argument of the function 'BIO_write'. It is possible that the value does not correspond with the length of a string which was passed with the second argument. asn1_par.c 378

The length of the "BAD ENUMERATED" string is 11 characters, not 14.

The second unfortunate magic number

static int www_body(char *hostname, int s, unsigned char *context)
{
  ....
  if ( ((www == 1) && (strncmp("GET ",buf,4) == 0)) ||
       ((www == 2) && (strncmp("GET /stats ",buf,10) == 0)))
  ....
}

PVS-Studio's diagnostic message: V666 Consider inspecting third argument of the function 'strncmp'. It is possible that the value does not correspond with the length of a string which was passed with the first argument. s_server.c 2703

The length of the "GET /stats " string is 10 characters, not 11. The last space is forgotten. It's a small defect, but it is still a defect.

The third unfortunate magic number

static int asn1_cb(const char *elem, int len, void *bitstr)
{
  ....
  if (!strncmp(vstart, "ASCII", 5))
    arg->format = ASN1_GEN_FORMAT_ASCII;
  else if (!strncmp(vstart, "UTF8", 4))
    arg->format = ASN1_GEN_FORMAT_UTF8;
  else if (!strncmp(vstart, "HEX", 3))
    arg->format = ASN1_GEN_FORMAT_HEX;
  else if (!strncmp(vstart, "BITLIST", 3))
    arg->format = ASN1_GEN_FORMAT_BITLIST;
  else
  ....
}

PVS-Studio's diagnostic message: V666 Consider inspecting third argument of the function 'strncmp'. It is possible that the value does not correspond with the length of a string which was passed with the second argument. asn1_gen.c 371

The problem is in the following line:

if (!strncmp(vstart, "BITLIST", 3))

The length of the "BITLIST" string is 7 characters.

Let me take a step away from the subject for a while. Some readers may wonder how PVS-Studio diagnoses errors of this kind. Let me explain. The analyzer collects information about function calls (in this particular case - about the calls of the strncmp() function) and draws a data matrix:

  • vstart, "ASCII", 5
  • vstart, "UTF8", 4
  • vstart, "HEX", 3
  • vstart, "BITLIST", 3

The function has a string argument and a numerical one. The string length coincides with the number in most cases, therefore the number is used to specify the string length. But these arguments are different in one call, so the analyzer generates the V666 warning.

Not a good idea

It's not a good idea to use "%08lX" to print a pointer value; it's better to use "%p" designed specifically for this purpose.

typedef struct mem_st
{
  void *addr;
  ....
} MEM;

static void print_leak_doall_arg(const MEM *m, MEM_LEAK *l)
{
  ....
  BIO_snprintf(bufp, BUF_REMAIN, "number=%d, address=%08lX\n",
               m->num,(unsigned long)m->addr);
  ....
}

It's not a pointer that is passed into the function, but a type value (unsigned long). That's why the compiler and some analyzers will keep silent about that.

PVS-Studio, however, detected this defect in an indirect way. It didn't like the pointer being explicitly cast to unsigned long; doing so is incorrect as nobody can guarantee that the pointer will fit into the 'long' type. For example, you can't do that in Win64.

The fixed, and shorter, code looks as follows:

BIO_snprintf(bufp, BUF_REMAIN, "number=%d, address=%p\n",
  m->num, m->addr);

There are three fragments where a pointer value is printed incorrectly:

  • mem_dbg.c 699
  • bio_cb.c 78
  • asn1_lib.c 467

Conclusion

Although static analyzers didn't reveal the error we were talking about in the beginning and it has successfully survived in the code for a long time, I still strongly recommend that every programmer use static analysis in their everyday work. Just don't try to find a silver bullet to kill all the problems at one shot and clear your code of every single bug. The best result can only be achieved with a comprehensive approach - combining unit-tests, static and dynamic analysis, regression tests, etc. Static analysis, in particular, will help you find and fix numbers of typos and silly mistakes at the coding stage and thus save time on other useful things like implementing a new functionality or writing more meticulous tests.

Welcome to try our code analyzer PVS-Studio.