>
>
>
Analysis of Haiku Operating System (BeO…

Sviatoslav Razmyslov
Articles: 90

Analysis of Haiku Operating System (BeOS Family) by PVS-Studio. Part 2

This is the second and last part of the large article about analysis of the Haiku operating system. In the first article, we discussed a variety of possible errors all of which one way or another deal with conditions. In this article, we will discuss the remaining analyzer warnings I have selected for you. The bug examples are grouped into several categories.

Introduction

Haiku is a free and open-source operating system for PC designed to be binary compatible with the BeOS operating system and embodying the basic ideas of BeOS. It's a modular system with the hybrid-kernel architecture - microkernel architecture capable of dynamical module linking.

The project was analyzed on the Haiku user community's request with the PVS-Studio 5.24 static analyzer.

String handling

V527 It is odd that the '\0' value is assigned to 'char' type pointer. Probably meant: *scratchPtr = '\0'. TextGapBuffer.cpp 228

const char*
TextGapBuffer::Text()
{
  const char* realText = RealText();

  if (fPasswordMode) {
    ....

    char* scratchPtr = fScratchBuffer;
    for (uint32 i = 0; i < numChars; i++) {
      memcpy(scratchPtr, B_UTF8_BULLET, bulletCharLen);
      scratchPtr += bulletCharLen;
    }
    scratchPtr = '\0';      // <=

    return fScratchBuffer;
  }

  return realText;
}

After handling the string, the programmer most likely wanted to add a terminal null character to its end instead of zeroing the pointer. The correct version of this code is as follows: "*scratchPtr = '\0';".

V692 An inappropriate attempt to append a null character to a string. To determine the length of a string by 'strlen' function correctly, a string ending with a null terminator should be used in the first place. PoorManWindow.cpp 254

void
PoorManWindow::MessageReceived(BMessage* message)
{
  ....
  if (inet_ntop(AF_INET, &sin_addr, addr, sizeof(addr)) != NULL){
    addr[strlen(addr)] = '\0';  // <=
    line << '(' << addr << ") ";
  }
  ....
}

To write the terminal null character at the end of the string, the programmer used the strlen() function in this code, but the result of this is unpredictable, for the string must already be null-terminated for the strlen() function to work properly. It is that very cell where 0 is found that the new zero will be written into. At the same time, the strlen() function can reach far beyond the buffer's bounds, which will cause an undefined-behavior issue. To fix this code, we need to use some different means to calculate the string length.

Bad loops

V529 Odd semicolon ';' after 'for' operator. ringqueue.cpp 39

int
compute_order(unsigned long size)
{
  int  order;
  unsigned long tmp;
  for (order = 0, tmp = size; tmp >>= 1; ++order); // <=
    if (size & ~(1 << order))
      ++order;
    return order;
}

Something is wrong with this function - a loop left without its body because of a semicolon at the end. Code formatting suggests that the condition should be included into the loop body. On the other hand, the 'tmp' variable still won't be used anywhere.

Perhaps what the programmer wanted to do is the following:

int
compute_order(unsigned long size)
{
  int  order;
  unsigned long tmp;
  for (order = 0, tmp = size; tmp >>= 1; ++order)
    if (tmp & ~(1 << order))
      ++order;
  return order;
}

However, changing the counter of a for(;;) loop inside the body is not a very good style.

V535 The variable 'k' is being used for this loop and for the outer loop. Check lines: 3598, 3610. rules.c 3610

void
solver_get_unneeded(Solver *solv, Queue *unneededq, int filtered)
{
  ....
  if (dep_possible(solv, *dp, &installedm))
  {
    Queue iq;
    Id iqbuf[16];
    queue_init_buffer(&iq, iqbuf, sizeof(iqbuf)/sizeof(*iqbuf));
    dep_pkgcheck(solv, *dp, 0, &iq);
    for (k = 0; k < iq.count; k++)            // <=
      {
  Id p = iq.elements[k];
  Solvable *sp = pool->solvables + p;
  if (....)
    continue;
  for (j = 0; j < count; j++)
    if (p == unneededq->elements[j])
      break;
  /* now add edge from j + 1 to i + 1 */
  queue_insert(....);
  /* addapt following edge pointers */
  for (k = j + 2; k < count + 2; k++)         // <=
    edges.elements[k]++;
      }
    queue_free(&iq);
  }
  ....
}

The code formatting is so terrible that if there is any error at all here, it surely has been made due to the formatting. It is a bad style to use one counter in nested for(;;) loops.

Another issue of this kind:

  • V535 The variable 'i' is being used for this loop and for the outer loop. Check lines: 2319, 2349. solver.c 2349

V634 The priority of the '*' operation is higher than that of the '<<' operation. It's possible that parentheses should be used in the expression. RAW.cpp 1141

void
DCRaw::_WaveletDenoise()
{
  ....
  for (i = 0; i < (1 << dim * 2); i++) {  // <=
    if (fimg[i] < -fThreshold)
      fimg[i] += fThreshold;
    else if (fimg[i] > fThreshold)
      fimg[i] -= fThreshold;
    else
      fimg[i] = 0;
  }
  ....
}

The multiplication operation has a higher precedence than the shift operation. I don't know what exactly the code's authors wanted to do here, so they need to check the operator sequence and put parentheses to explicitly define the operation execution order and make it more transparent.

Another similar issue:

  • V634 The priority of the '*' operation is higher than that of the '<<' operation. It's possible that parentheses should be used in the expression. RAW.cpp 1099

V696 The 'continue' operator will terminate 'do { ... } while (FALSE)' loop because the condition is always false. Check lines: 1939, 1945. Roster.cpp 1939

status_t
BRoster::_LaunchApp(....) const
{
  ....
  do {
    // find the app
    ....
    if (appType.InitCheck() == B_OK
      && appType.GetAppHint(&hintRef) == B_OK
      && appRef == hintRef) {
      appType.SetAppHint(NULL);
      // try again
      continue;
    }
    ...
  } while (false);
  ....
}

The 'continue' operator in the "do { ... } while( ... )" loop makes a transition to calculating the loop termination condition, but it is always false - in fact it is unconditional loop termination and the "try again" comment will only confuse anyone who reads this code in future.

V706 Suspicious division: sizeof (kBaudrates) / sizeof (char *). Size of every element in 'kBaudrates' array does not equal to divisor. SerialWindow.cpp 162

const int SerialWindow::kBaudrates[] = { 50, 75, 110, .... };

SerialWindow::SerialWindow() : ....
{
  ....
  for(int i = sizeof(kBaudrates) / sizeof(char*); --i >= 0;)// <=
  {
    message = new BMessage(kMsgSettings);
    message->AddInt32("baudrate", kBaudrateConstants[i]);

    char buffer[7];
    sprintf(buffer, "%d", kBaudrates[i]);                   // <=
    BMenuItem* item = new BMenuItem(buffer, message);

    fBaudrateMenu->AddItem(item);
  }
  ....
}

To find out the number of items in the 'kBaudrates' array, the programmer for some reason divides its size by the pointer size, so it turns out that in the 32-bit version, the entire array will be indexed, while in the 64-bit one, only half of it.

Arrays

V548 Consider reviewing type casting. TYPE X[][] in not equivalent to TYPE **X. RAW.cpp 1668

void
DCRaw::_AdobeCoefficients(const char *make, const char *model)
{
  static const struct {
    const char *prefix;
    short black, trans[12];
  } table[] = {
    { "Canon EOS D2000", 0,
      { 24542,-10860,-3401,-1490,11370,-297,2858,-605,3225 }},
    { "Canon EOS D6000", 0,
      { 20482,-7172,-3125,-1033,10410,-285,2542,226,3136 }},
    ....
  };
  double cameraXYZ[4][3];

  for (uint32 i = 0; i < sizeof table / sizeof *table; i++) {
    if (!strncasecmp(model, table[i].prefix, strlen(....))) {
      if (table[i].black)
        fMeta.black = table[i].black;
      for (uint32 j = 0; j < 12; j++) {
        ((double**)cameraXYZ)[0][j] = table[i].trans[j] /10000.0;
      }
      _CameraXYZCoefficients(cameraXYZ);
      break;
    }
  }
}

The 'cameraXYZ' array declared as "double cameraXYZ[4][3]" is cast to the "double **" type. This type conversion, I guess, makes no sense at all and can be a source of some bugs.

The types "type[a][b]" and "type **" are different data structures. Type[a][b] is a single memory area that can be handled as a two-dimensional array, while type ** is an array of pointers to some memory areas.

V554 Incorrect use of auto_ptr. The memory allocated with 'new []' will be cleaned using 'delete'. DefaultCatalog.cpp 208

status_t
DefaultCatalog::ReadFromFile(const char *path)
{
  ....
  auto_ptr<char> buf(new(std::nothrow) char [sz]);
  ....
}

The analyzer has detected an issue when using a smart pointer may cause undefined behavior. The 'auto_ptr' class is not intended for array handling: it uses the 'delete' operator to free memory and if you specify 'delete[]', the code simply won't compile.

The fixed code:

status_t
DefaultCatalog::ReadFromFile(const char *path)
{
  ....
  unique_ptr<char[]> buf(new(std::nothrow) char[sz]);
  ....
}

Another issue of this kind:

  • V554 Incorrect use of auto_ptr. The memory allocated with 'new []' will be cleaned using 'delete'. DefaultCatalog.cpp 249

V557 Array overrun is possible. The '8' index is pointing beyond array bound. floppy_ctrl.c 637

V557 Array overrun is possible. The '9' index is pointing beyond array bound. floppy_ctrl.c 638

typedef struct floppy {
  ....
  uint8 result[8]; /* status of the last finished command */
  ....
};

void
floppy_dump_reg(floppy_t *flp) {
  ....
  //uint8 result[10];           // <= This was correct!
  uint8 *result = flp->result;  // <= Bad fix! :)
  ....
  dprintf(FLO "gap=%d wg=%d eis=%d fifo=%d poll=%d thresh=%d
    pretrk=%d\n", 
    (result[7] & 0x02) >> 1, result[7] & 0x01,
    (result[8] & 0x40) >> 6, 
    (result[8] & 0x20) >> 5, (result[8] & 0x10) >> 4,
     result[8] & 0x0f, result[9]);
  ....
}

Two analyzer warnings refer to array overruns. The comments suggest that the 'result[]' array used to comprise 10 items in the past and after modification, its size was reduced to 8 items. At the same time, the program still tries to address ten items, with indexes from 0 to 9.

Variable names

V672 There is probably no need in creating the new 'path' variable here. One of the function's arguments possesses the same name and this argument is a reference. Check lines: 348, 429. translate.cpp 429

status_t
Translator::FindPath(const translation_format *format,
  BPositionIO &stream, TypeList &typesSeen, TypeList &path, ....)
{
  ....
  TypeList path;
  double quality;
  if (FindPath(...) == B_OK) {
    if (bestQuality < quality * formatQuality) {
      bestQuality = quality * formatQuality;
      bestPath.SetTo(path);
      bestPath.Add(formats[j].type);
      status = B_OK;
    }
  }
  ....
}

Coincidence of the local 'path' variable's name with the function parameter (and not just function parameter but a reference like in this case) may cause a loss of local changes in this variable as well as other logical errors.

V711 It is dangerous to create a local variable within a loop with a same name as a variable controlling this loop. ipv4.cpp 514

static int
dump_ipv4_multicast(int argc, char** argv)
{
  MulticastState::Iterator it = sMulticastState->GetIterator();

  while (it.HasNext()) {
    ....
    int count = 0;
    IPv4GroupInterface::AddressSet::Iterator it
      = state->Sources().GetIterator();
    while (it.HasNext()) {
      ....
    }

    kprintf("}> sock %p\n", state->Parent()->Socket());
  }

  return 0;
}

A declaration of the 'it' variable was detected in the loop body, its name coinciding with that of a variable used as a loop counter. This code may contain certain logical errors, to the extent that you may end up with an infinite loop.

Memory handling

V597 The compiler could delete the 'memset' function call, which is used to flush 'password' buffer. The RtlSecureZeroMemory() function should be used to erase the private data. login.cpp 126

static status_t
login(const char* user, struct passwd** _passwd)
{
  ....
  bool ok = verify_password(passwd, spwd, password);
  memset(password, 0, sizeof(password));
  
  if (!ok)
    return B_PERMISSION_DENIED;

  *_passwd = passwd;
  return B_OK;
}

Unfortunately, the password may remain uncleared in this code. Note that after the 'password' array is cleared at the end, it is not used anymore. Therefore, when building the release version, the compiler is very likely to remove the call of the memset() function. It has the full authority to do so. The analyzer suggests using an analogous function for Windows, but in the Haiku operating system, we need to find some other means to avoid the harmful compiler-driven optimization.

Other dangerous issues of this kind:

  • V597 The compiler could delete the 'memset' function call, which is used to flush 'finalcount' buffer. The RtlSecureZeroMemory() function should be used to erase the private data. sha1.c 228
  • V597 The compiler could delete the 'memset' function call, which is used to flush 'encoded_block' buffer. The RtlSecureZeroMemory() function should be used to erase the private data. dst_api.c 446
  • V597 The compiler could delete the 'memset' function call, which is used to flush 'in_buff' buffer. The RtlSecureZeroMemory() function should be used to erase the private data. dst_api.c 916
  • V597 The compiler could delete the 'memset' function call, which is used to flush 'repeatedPassword' buffer. The RtlSecureZeroMemory() function should be used to erase the private data. passwd.cpp 171

V630 The 'malloc' function is used to allocate memory for an array of objects which are classes containing constructors. PDFWriter.cpp 117

status_t
PDFWriter::PrintPage(int32  pageNumber, int32 pageCount)
{
  ....
  pictures =
    (BPicture **)malloc(pictureCount * sizeof(BPicture *));
  picRects =
    (BRect *)malloc(pictureCount * sizeof(BRect));    // <=
  picPoints =
    (BPoint *)malloc(pictureCount * sizeof(BPoint));  // <=
  picRegion = new BRegion();
  ....
}

When using malloc to allocate memory for an array of objects of some class, neither a constructor is called when creating an object, nor a destructor is called when destroying it. Code like this may result in handing uninitialized variables and other issues.

V512 A call of the 'memset' function will lead to underflow of the buffer 'context'. sha2.c 623

#define MEMSET_BZERO(p,l)  memset((p), 0, (l))

void solv_SHA256_Final(sha2_byte digest[], SHA256_CTX* context) {
  ....
  /* Clean up state data: */
  MEMSET_BZERO(context, sizeof(context));
  usedspace = 0;
}

The size of the memory area to be cleared equals the pointer size, not the structure size.

Other issues of this kind:

  • V512 A call of the 'memset' function will lead to underflow of the buffer 'context'. sha2.c 644
  • V512 A call of the 'memset' function will lead to underflow of the buffer 'context'. sha2.c 953
  • V512 A call of the 'memset' function will lead to underflow of the buffer 'context'. sha2.c 973
  • V512 A call of the 'memset' function will lead to underflow of the buffer 'context'. sha2.c 1028
  • V512 A call of the 'memset' function will lead to underflow of the buffer 'context'. sha2.c 1048

Miscellaneous

V591 Non-void function should return a value. pc.c 1031

ULONG
set_var(char *name, ULONG val)
{
  variable *v;

  v = lookup_var(name);
  if (v != NULL)
    v->value = val;
  else
    add_var(name, val);
}

Most likely, the returned value is not used in any way when calling the set_var() function. But if anyone does use it someday, the result will be undefined behavior.

V671 It is possible that the 'swap' function interchanges the 'std::declval < _Alloc & > ()' variable with itself. alloc_traits.h 191

static constexpr bool _S_nothrow_swap()
{
  using std::swap;
  return !_S_propagate_on_swap()
    || noexcept(
         swap(std::declval<_Alloc&>(), std::declval<_Alloc&>()));
}

Strange use of the swap() function: identical arguments.

V519 The 'data->error' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 222, 223. repo_solv.c 223

static unsigned char *
data_read_idarray(.... , Repodata *data)
{
  ....
  data->error = pool_error(            // <=
    data->repo->pool, SOLV_ERROR_ID_RANGE,
    "data_read_idarray: id too large (%u/%u)", x, max);
  data->error = SOLV_ERROR_ID_RANGE;   // <=
  ....
}

Assigning different values to one and the same variable on end. Looks like a typo.

V568 It's odd that the argument of sizeof() operator is the 'sizeof (struct tlv_header_t)' expression. print-slow.c 255

void
slow_print(register const u_char *pptr, register u_int len) {
  ....
  if (vflag > 1)
    print_unknown_data(tptr+sizeof(sizeof(struct tlv_header_t)),
      "\n\t  ", tlv_len-sizeof(struct tlv_header_t));
  ....
}

The argument of the sizeof() operator is sizeof(), too. This operator calculates the type of an expression and returns the size of this type, while the expression itself is not evaluated, i.e. the structure size doesn't affect anything in this code.

There are lots of fragments like that:

  • V568 It's odd that the argument of sizeof() operator is the 'sizeof (struct lmp_object_header)' expression. print-lmp.c 872
  • V568 It's odd that the argument of sizeof() operator is the 'sizeof (struct tlv_header_t)' expression. print-slow.c 182
  • V568 It's odd that the argument of sizeof() operator is the 'sizeof (struct eigrp_tlv_header)' expression. print-eigrp.c 283
  • V568 It's odd that the argument of sizeof() operator is the 'sizeof (struct eigrp_tlv_header)' expression. print-eigrp.c 471

Conclusion

Haiku is a large and unusual project. I enjoyed analyzing it and making my small contribution to its development. Despite my pretty rich experience of working with open-source projects, I still was encountering some rare warnings when checking this project. In this article, I discussed what I think to be the most suspicious and interesting code samples. All the other fragments I haven't mentioned here or simply overlooked can be investigated by the authors themselves - we will send them a complete analysis log.