>
>
>
PVS-Studio checks the code of Flipper Z…

Andrey Karpov
Articles: 673

PVS-Studio checks the code of Flipper Zero dolphin

Flipper Zero is an open-source multi-tool for geeks and penetration testers. It so happened that the Flipper Zero project and the PVS-Studio analyzer crossed paths. A philosophical question: should we check the project, if the project developers have already started fixing errors? Let's try to do this.

What is Flipper Zero?

I asked the Flipper Zero developers to take part in writing the article. They left different notes and comments in the document where I described the project check. So, this article differs from our usual articles about checking open-source projects.

Flipper Zero is a portable multi-tool used to explore access control systems: on-door speakerphones, radio remote control systems, barrier gates, TVs, contactless cards. It is built on the STM32WB55 microcontroller. The project is open source, it is licensed under GPL. Although, I won't try to describe Flipper Zero in my own words. Great that the developers will tell our readers about their amazing project. So, I give them the floor.

The Flipper firmware is mostly written in C with a few things written in C++.

We wanted Flipper to be available for every language. That's why, we implemented the code in C at low and medium levels (HAL and the system kernel, the functions that applications use). This allows binding these APIs to any language that supports C ABI (literally any language).

Besides, we wrote several applications (this is considered the top level in Flipper) in C++, because this speeds up the code writing. And the speed was crucial for these applications.

Many Flipper Zero developers read our articles. Some of our team members are interested in the project's fate and development. No wonder the moment came when we crossed our paths and started the discussions.

The Flipper Zero team suggested checking their project with the PVS-Studio analyzer. So, why not? Moreover, one of my coworkers said: "These guys are super cool!". Well then, we must check Flipper Zero! :)

My colleague skimmed through the project and said: "Seems that the project has a few errors. However, there's something worthy to discuss". Nice! We are always happy to check an exciting project. This gives us an opportunity to show the analyzer in action, while the developers improve the project quality.

To write or not to write?

One of the suspicious cases that was hastily noted:

if(....) { .... }
else
{
  memcpy(subghz->file_name_tmp, subghz->file_name, strlen(subghz->file_name));
  if(scene_manager_get_scene_state(....) == SubghzCustomEventManagerSet) {
    subghz_get_next_name_file(subghz);
  }
}

The PVS-Studio warning: V575 The 'memcpy' function doesn't copy the whole string. Use 'strcpy / strcpy_s' function to preserve terminal null. subghz_scene_save_name.c 22

Soon you'll understand why I decided to write about this code fragment. While I was preparing to do a comprehensive analysis of the project and write an article, the Flipper Zero developers requested a trial version of PVS-Studio. They informed us that they might check the code and even wrote an article to share their experience.

Then, I opened the recent version of the project and wondered: "Where's the warning described by my colleague?" I took a look at the code – the developers had already fixed it! They added "+1".

By the way, I don't understand why they did this. In my opinion, this is not the best decision. Why not just write strcpy?

I noticed this error, looking through the code again. Although the code perfectly ran with this error, I still decided to fix this.

So, the fixed code fragment made me sad :(. Missed the chance. I couldn't write about fixed errors... Because at that moment, I still didn't know how the developers fixed it.

Then, just in case, I decided to check another previously written error.

static FS_Error storage_process_common_rename(Storage* app, const char* old,
                                              const char* new)
{
  FS_Error ret = FSE_INTERNAL;
  StorageType type_old = storage_get_type_by_path(old);
  StorageType type_new = storage_get_type_by_path(new);

  if(storage_type_is_not_valid(type_old) || storage_type_is_not_valid(type_old))
  {
    ret = FSE_INVALID_NAME;
  }
  else
  ....
}

The PVS-Studio warning: V501 [CWE-570] There are identical sub-expressions 'storage_type_is_not_valid(type_old)' to the left and to the right of the '||' operator. storage-processing.c 380

Awesome! The error hadn't disappeared!

A typo: the type_old variable is checked twice. And the type_new variable is not checked.

Sorry, I know it's strange to be excited about errors in the program. But that's my job to search for interesting errors :).

These errors saved my creative impulse to write this article. So, I kept looking for more. Luckily, a new report contained actual errors. Not so many, but there were some. So, I decided to describe them in this article.

However, I wondered when exactly the Flipper Zero developers started using PVS-Studio to check the project? I asked them to clarify this. My reply depended on one of the following scenarios:

  • We haven't tried PVS-Studio yet. Our team found and fixed the error without the tool. In this case, I would have reply: with PVS-Studio, you can find and fix such errors faster.
  • PVS-Studio helped us to find and fix the error. I would have say: that's why PVS-Studio is so helpful.

In any case, PVS-Studio is useful :).

We haven't tried PVS-Studio yet. The tool setup takes some time in terms of dealing with errors that are not actually errors (for example, everything about mallocs). Our team plans to introduce PVS-Studio soon. We will also use it for other projects we have, for example, for a wi-fi programmer.

Clearly, this is the first scenario. Although their explanation means that this was an incomplete error. The developers added "+1" for accuracy. They could do it in advance.

By the way, PVS-Studio has user-friendly and quick integration! The tool provides mass suppression of warnings (set the baseline). You can delay the current technical debt and handle only new warnings.

You can find a brief description here.

A more detailed description of how to introduce a code analyzer in a large codebase, you can find in the following article: "How to introduce a static code analyzer in a legacy project and not to discourage the team."

More errors that I managed to find

Let's take a look at the most interesting parts of the code that the PVS-Studio analyzer found. If you want to check your projects, download a free trial version.

Extra return

void onewire_cli_search(Cli* cli) {
  ....
  bool done = false;
  ....
  onewire.start();
  furi_hal_power_enable_otg();

  while(!done) {
    if(onewire.search(address, true) != 1) {
      printf("Search finished\r\n");
      onewire.reset_search();
      done = true;
      return;
    } else {
      printf("Found: ");
      for(uint8_t i = 0; i < 8; i++) {
        printf("%02X", address[i]);
      }
    printf("\r\n");
    }
    delay(100);
  }

  furi_hal_power_disable_otg();
  onewire.stop();
}

PVS-Studio found two abnormalities in the above code fragment:

  • V654 [CWE-834] The condition '!done' of loop is always true. ibutton-cli.cpp 253
  • V779 [CWE-561, CERT-MSC12-C] Unreachable code detected. It is possible that an error is present. ibutton-cli.cpp 269

And indeed! Firstly, the loop condition is always true. After the value of the done variable is changed in the loop body, the function immediately ends execution. So, this change doesn't matter.

Secondly, the epilogue of the function is not executed. This code never gets the control:

furi_hal_power_disable_otg();
onewire.stop();

As a result, the program logic is broken.

Checking the pointer returned by malloc functions

The developers treat the result of the malloc function a bit frivolously. In some places, the application stops working, if memory could not be allocated. The example:

void random_permutation(unsigned n)
{
  if (permutation_tab) free(permutation_tab);
  permutation_tab = (unsigned *) malloc(n * sizeof(unsigned));
  if (permutation_tab == NULL) abort();
  ....
}

Note. I think there's no point in deleting code examples here and in other places, demonstrating another code, or changing the narration. I want the article to be as it is happened to be, because I don't know the project structure. Let me show you some fragments of our discussion. This makes the article more enthralling.

The Flipper Zero Team. This is an external library.

I. Then, this library is weird because it calls abort. Moreover, it is used in an embedded device. For example, AUTOSAR (AUTomotive Open System ARchitecture) prohibits this — V3506.

The Flipper Zero Team. This code is part of the benchmark.

The Flipper Zero Team. That's right, this is a header-only library. We don't really care about the quality of its tests.

I. Fair enough. In this case, everything is OK, but I won't cut this from the article. You may probably wonder – what if the libraries that the developers use for their embedded devices, contain abort/exit.

In other places, the null pointer is interpreted more calmly:

ptr = malloc(sizeof(uint8_t) * BlockSize);
if(ptr == NULL) {
  goto error;
}

External code.

Somewhere is a check, implemented only for debug versions:

size_t bench_mlib(unsigned n)
{
  string_t *tab = (string_t*) malloc (n * sizeof (string_t));
  assert (tab != 0);
  ....
}

By the way, to my mind, this is a doubtful solution. In fact, the check does not benefit the users. Only the developers take advantage of it. I think the developers have to fully process the memory allocation error. Or, at least, they should not pretend that the check exists, and delete assert :).

Why did you choose to perform a check that way?

This is an external library.

Here comes the most interesting part. The code contains unchecked places. The allocated memory is used right away. For example:

void storage_ext_init(StorageData* storage) {
  SDData* sd_data = malloc(sizeof(SDData));
  sd_data->fs = &USERFatFS;
  ....
}

The PVS-Studio warning: V522 [CWE-690, CERT-MEM52-CPP] There might be dereferencing of a potential null pointer 'sd_data'. Check lines: 516, 515. storage-ext.c 516

There are other similar warnings:

  • V522 [CWE-690, CERT-MEM52-CPP] There might be dereferencing of a potential null pointer 'app'. Check lines: 8, 7. dialogs.c 8
  • V522 [CWE-690, CERT-MEM52-CPP] There might be dereferencing of a potential null pointer 'app'. Check lines: 162, 161. notification-settings-app.c 162
  • V522 [CWE-690, CERT-MEM52-CPP] There might be dereferencing of a potential null pointer 'bench_data'. Check lines: 81, 79. storage_settings_scene_benchmark.c 81
  • V522 [CWE-690, CERT-MEM52-CPP] There might be dereferencing of a potential null pointer 'app'. Check lines: 18, 16. storage_settings.c 18
  • V575 [CWE-628, CERT-EXP37-C] The potential null pointer is passed into 'strlen' function. Inspect the first argument. Check lines: 174, 168. storage-test-app.c 174

Note. I'm sure someone will say: there's no point in checking such pointers. To prove that it's not quite right, I invite you to read the following article: "Why it is important to check what the malloc function returned".

So, I couldn't but ask the project developers: Why don't you have checks here? Is this a mistake? Or did you plan to do this because you are sure nothing can go wrong?

If the embedded memory has run out, it is dangerous to continue working. The allocator we use stops the execution of the firmware code if the allocation is unsuccessful. It calls for a programmer with a debugger.

More about null pointers

Judging by the furi_record_data_get_or_create function, theoretically, it can return a null pointer:

static FuriRecordData* furi_record_data_get_or_create(string_t name_str) {
  furi_assert(furi_record);
  FuriRecordData* record_data =
    FuriRecordDataDict_get(furi_record->records, name_str);
  if(!record_data) {
    FuriRecordData new_record;
    new_record.flags = osEventFlagsNew(NULL);
    ....
  }
  return record_data;
}

Now let's see how this function is used.

void furi_record_create(const char* name, void* data) {
  ....
  FuriRecordData* record_data = furi_record_data_get_or_create(name_str);
  furi_assert(record_data->data == NULL);
  record_data->data = data;
  ....
}

The PVS-Studio warning: V522 [CWE-476, CERT-EXP34-C] Dereferencing of the null pointer 'record_data' might take place. record.c 65

The function returns the pointer that is used without a prior check.

However, I was wrong here. In fact, this is a false positive. The authors explained that I was not attentive enough, reviewing the furi_record_data_get_or_create function. I won't delete my wrong description of the function. Let's analyze this case in more detail.

Take a look at the entire function:

static FuriRecordData* furi_record_data_get_or_create(string_t name_str) {
  furi_assert(furi_record);
  FuriRecordData* record_data =
    FuriRecordDataDict_get(furi_record->records, name_str);
  if(!record_data) {
    FuriRecordData new_record;
    new_record.flags = osEventFlagsNew(NULL);
    new_record.data = NULL;
    new_record.holders_count = 0;
    FuriRecordDataDict_set_at(furi_record->records, name_str, new_record);
    record_data = FuriRecordDataDict_get(furi_record->records, name_str);
  }
  return record_data;
}

If we get the record right away, then we return it. If we haven't received the record, then we create it and return it. Everything is fine.

But the analyzer was not savvy enough. Since the code contains a check, the pointer can be NULL. If so, the function can return NULL. For some reason, the analyzer did not consider that the pointer is initiated in any case.

Conclusion: The Flipper Zero developers did a better job. Our team should improve the Data-Flow algorithm in PVS-Studio for such cases.

Let's continue talking about null pointers. Something triggered the diagnostic based on a different logic. The V595 diagnostic issues a warning when the pointer is dereferenced, and then is suddenly checked. Very suspicious. This diagnostic often helps to detect many errors. Fortunately, Flipper Zero is not such a project. They/we failed to receive a bunch of pretty V595 :). However, I noticed one helpful warning:

void subghz_scene_receiver_info_on_enter(void* context) {
  ....
  subghz->txrx->protocol_result->to_string(subghz->txrx->protocol_result, text);
  widget_add_string_multiline_element(....);

  string_clear(frequency_str);
  string_clear(modulation_str);
  string_clear(text);

  if(subghz->txrx->protocol_result &&
     subghz->txrx->protocol_result->to_save_file &&
     strcmp(subghz->txrx->protocol_result->name, "KeeLoq")) {
  ....
}

The PVS-Studio warning: V595 [CWE-476, CERT-EXP12-C] The 'subghz->txrx->protocol_result' pointer was utilized before it was verified against nullptr. Check lines: 70, 78. subghz_scene_receiver_info.c 70

Although I consider various errors related to null pointers in the article, I must compliment the project developers for high-quality code. For C code, the density of such errors is low.

What programming and testing practices have you used to reduce the number of errors related to null pointers?

We used MPU to detect hard fault by accessing the null address. Strange that this approach is not common for embedded. We also have the analyzer for the consumption and release of memory by the application. Unfortunately, this is not fully implemented without MMU. But it gives a certain idea of the allocation correctness.

Also -Werror.

However, I can't say that we haven't had sleepless nights debagging a corrupt heap :).

Someone hurried

bool subghz_get_preset_name(SubGhz* subghz, string_t preset) {
  const char* preset_name;
  switch(subghz->txrx->preset) {
  case FuriHalSubGhzPresetOok270Async:
    preset_name = "FuriHalSubGhzPresetOok270Async";
    break;
  case FuriHalSubGhzPresetOok650Async:
    ....
  case FuriHalSubGhzPreset2FSKDev476Async:
    preset_name = "FuriHalSubGhzPreset2FSKDev476Async";
    break;
      FURI_LOG_E(SUBGHZ_PARSER_TAG, "Unknown preset");   // <=
  default:
  ....
}

The PVS-Studio warning: V779 [CWE-561, CERT-MSC12-C] Unreachable code detected. It is possible that an error is present. subghz_i.c 44

The break statement and the logging macro clearly must be swapped. Most likely, someone hurried to edit the code or merged changes from various branches. As a result, the error appeared.

But how did it actually happen? I know that the error is not critical, but I still wonder :).

Moreover, the logging macro must be in the default label. We have lots of merges and not enough programmers — sometimes we do not notice such mistakes.

When everyone is probably wrong

This is the case when you notice something wrong in the code, but you can't understand how critical the case is. And it's unclear whether the PVS-Studio analyzer is accurate, issuing the warnings.

The analyzer issued several warnings similar to the one below. We're going to consider only one case.

void subghz_cli_command_tx(Cli* cli, string_t args, void* context) {
  uint32_t frequency = 433920000;
  uint32_t key = 0x0074BADE;
  size_t repeat = 10;

  if(string_size(args)) {
    int ret = sscanf(string_get_cstr(args),
                     "%lx %lu %u", &key, &frequency, &repeat);
  ....
}

The PVS-Studio warning: V576 [CWE-628, CERT-FIO47-C] Incorrect format. Consider checking the fifth actual argument of the 'sscanf' function. A pointer to the unsigned int type is expected. subghz_cli.c 105

Pay attention to the format string that controls the data during scanning: "%lx %lu %u". It implies that pointers to variables of the following types are expected:

  • %lx - long unsigned int;
  • %lx - long unsigned int;
  • %u - unsigned int.

At the same time, the program, storing the scanned data will use variables of the following type:

  • uint32_t;
  • uint32_t;
  • size_t.

I don't know what data sizes are used to compile the Flipper Zero project. Also, I can't say how unsafe this code is. However, the developers should definitely replace "%u" with "%zu" (see the description of the sscanf function).

I would say more about the code and the analyzer warning if the project developers tell me what type sizes are possible on the build platforms they use. In other words, I'd like to know the possible data models that the developers used when compiling the project.

That's an error. The third variable – repeat must be of the int32_t type.

So, again there's the discrepancy. The control modifier "l" (long) is used to scan the first two 32-bit variables. But for the third variable "l" is not used. Moreover, signed/unsigned are mismatched.

  • %lx (long unsigned int) -> uint32_t;
  • %lx (long unsigned int) -> uint32_t;
  • %u (unsigned int) -> int32_t;

I guess the size of the int type matches the size of the long int type, and it's impossible to enter a negative number. Thus, this and another code run correctly. Nevertheless, I suggest reviewing all the V576 warnings of the PVS-Studio analyzer and write control (format) strings more carefully where they are necessary.

Conclusion

Flipper Zero is a high-quality project, although it is written mainly in the C language. So, the article turned out to be not so long. Let's face it, C code is more vulnerable to errors than C++ code. Fortunately, we have static code analyzers that can detect various errors. I can't prove this statement. However, I have a sense that dozens of checks show the importance of static analysis.

Actually, the project contains a few errors, so I'm not sure I'd write the article if it was some other project. But I really liked this project – I wanted to write about this cool tool and keep in touch with the developers. By the way, now, I give the floor to them to say the final words.

Our main goal when writing the Flipper firmware is to create a convenient and user-friendly project. We want our users to be pleased with the tool. Now it is far from perfect, but we are working hard to improve the code readability and documentation. We believe that the community will help us to make a great product.

Thank you for your attention and welcome to our blog! Here you'll find other articles about embedded and IoT.