To get a trial key
fill out the form below
Team License (standard version)
Enterprise License (extended version)
* By clicking this button you agree to our Privacy Policy statement

** This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Request our prices
New License
License Renewal
--Select currency--
USD
EUR
GBP
RUB
* By clicking this button you agree to our Privacy Policy statement

** This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Message submitted.

Your message has been sent. We will email you at


If you haven't received our response, please do the following:
check your Spam/Junk folder and click the "Not Spam" button for our message.
This way, you won't miss messages from our team in the future.

>
>
>
Sixth Chromium Check, Afterword

Sixth Chromium Check, Afterword

Feb. 4, 2019
Author:

At the beginning of 2018 our blog was complemented with a series of articles on the sixth check of the source code of the Chromium project. The series includes 8 articles on errors and recommendations for their prevention. Two articles sparked heated discussion, and l still occasionally get comments by mail about topics covered in them. Perhaps, I should give additional explanations and as they say, set the record straight.

0607_Chromium-6-comments/image1.png

A year has passed since writing a series of articles on a regular check of the Chromium project source code:

Articles devoted to memset and malloc have caused and continue causing debates, which strike me as strange. Apparently, there was some confusion due to the fact that I had been insufficiently accurate when verbalizing my thoughts. I decided to go back to those articles and to make some clarifications.

memset

Let's start with an article about memset, because here everything is simple. Some arguments appeared about the best way to initialize structures. Quite many programmers wrote that it would be better to give the recommendation not to write:

HDHITTESTINFO hhti = {};

but to write in the following way:

HDHITTESTINFO hhti = { 0 };

Reasons:

  • The construction {0} is easier to notice when reading code, than {}.
  • The construction {0} is more intuitively understandable, than {}. Which means, 0 suggests that the structure is filled with zeros.

Accordingly, readers suggest me to change this initialization example in the article. I do not agree with the arguments and do not plan to make any edits in the article. Now I'm going to explain my opinion and provide some reasons.

As for visibility, I think, it's a matter of taste and habit. I don't think that the presence of 0 inside the parentheses fundamentally changes the situation.

As for the second argument, I totally disagree with it. The record of the type {0} gives a reason to incorrectly perceive the code. For example, you can suppose that if you replace 0 with 1, all fields will be initialized with ones. Therefore, such style of writing is more likely to be harmful rather than helpful.

The PVS-Studio analyzer even has a related diagnostic V1009, the description of which is cited below.

V1009. Check the array initialization. Only the first element is initialized explicitly.

The analyzer has detected a possible error related to the fact that when declaring an array the value is specified only for one element. Thus, the remaining elements will be implicitly initialized by zero or by a default constructor.

Let's consider the example of suspicious code:

int arr[3] = {1};

Perhaps the programmer expected than arr would consist entirely of ones, but it is not. The array will consist of the values 1, 0, 0.

Correct code:

int arr[3] = {1, 1, 1};

Such confusion may occur due to the similarity with the construction arr = {0}, which initializes the whole array with zeros.

If such constructions are actively used in your project, you can disable this diagnostic.

We also recommend not to neglect the clarity of your code.

For example, the code for encoding values of a color is recorded as follows:

int White[3] = { 0xff, 0xff, 0xff };
int Black[3] = { 0x00 };
int Green[3] = { 0x00, 0xff };

Thanks to implicit initialization, all colors are specified correctly, but it's better to rewrite the code more clearly:

int White[3] = { 0xff, 0xff, 0xff };
int Black[3] = { 0x00, 0x00, 0x00 };
int Green[3] = { 0x00, 0xff, 0x00 };

malloc

Before reading further, please recall the contents of the article "Why it is important to check what the malloc function returned ". This article has given rise to much debate and criticism. Here are some of the discussions: reddit.com/r/cpp, reddit.com/r/C_Programming, habr.com (ru). Occasionally readers still e-mail me about this article.

The article is criticized by readers for the following points:

1. If malloc returned NULL, then it is better to immediately terminate the program, than to write a bunch of if-s and try to somehow handle the memory, due to which the program execution is frequently impossible anyway.

I haven't pushed for fighting till the end with the consequences of memory leak, by passing the error higher and higher. If it's permitted for your application to terminate its work without a warning, then let it be so. For this purpose even a single check right after malloc or using xmalloc is enough (see the next point).

I objected and warned about the lack of checks because of which the program continues working as if nothing had happened. It's a completely different case. It is dangerous, because it leads to undefined behavior, data corruption, and so on.

2. There is no description of a solution that lies in writing wrapper functions for allocating memory with a check following it or using already existing functions, such as xmalloc.

Agree, I missed this point. When writing the article I just wasn't thinking about the way to remedy the situation. It was more important for me to convey to the reader the danger of the check absence. How to fix an error is a question of taste and implementation details.

The xmalloc function is not a part of the standard C library (see "What is the difference between xmalloc and malloc?"). However, this function may be declared in other libraries, for example, in the GNU utils library (GNU libiberty).

The main point of the function is that the program crashes when it fails to allocate memory. Implementation of this function might look as follows:

void* xmalloc(size_t s)
{
  void* p = malloc(s);
  if (!p) {
    fprintf (stderr, "fatal: out of memory (xmalloc(%zu)).\n", s);
    exit(EXIT_FAILURE);
  }
  return p;
}

Accordingly, by calling a xmalloc function instead of malloc every time, you can be sure that undefined behavior will not occur in the program due to usage of a null pointer.

Unfortunately, xmalloc is not a cure-all either. One should remember that usage of xmalloc is unacceptable when it comes to writing code of libraries. I'll talk about it later.

3. Most comments were the following: "in practice, malloc never returns NULL".

Fortunately, I am not the only one, who understands that this is the wrong approach. I really liked this comment in my support:

According to my experience of discussing this topic, I've got a feeling that there are two sects in the Internet. Adherents of the first one strongly believe that malloc never returns NULL under Linux. Supporters of the second one wholeheartedly claim that if memory cannot be allocated in your program, nothing can be done, you can only crash. There is no way to over-persuade them. Especially when these two sects intersect. You can only take it as a given. And it's even not important on which specialized resource a discussion takes place.

I thought for a while and decided to follow the advice, so I'll not try to persuade anyone :). Hopefully, these groups of developers write only nonfatal programs. If, for example, some data in the game gets corrupted, there is nothing crucial in it.

The only thing that matters is that developers of libraries, databases mustn't do like this.

Appeal to the developers of highly dependable code and libraries

If you are developing a library or other highly dependable code, always check the value of the pointer returned by malloc/realloc function and return outwards an error code if memory couldn't be allocated.

In libraries, you cannot call the exit function, if memory allocation failed. For the same reason, you cannot use xmalloc. For many applications, it is unacceptable to simply abort them. Because of this, for example, a database can be corrupted. One can lose data that was evaluated for many hours. Because of this, the program may be subjected to "denial of service" vulnerabilities, when, instead of correct handling of the growing workload, a multithreaded application simply terminates.

It cannot be assumed, in what ways and in what projects the library will be used. Therefore, it should be assumed that the application may solve very critical tasks. That's why just killing it by calling exit is no good. Most likely, such a program is written taking into account the possibility of memory lack and it can do something in this case. For example, a CAD system cannot allocate an appropriate memory buffer that will be enough for regular operation due to the strong fragmentation of memory. In this case, it is not the reason for it to crush in the emergency mode with data loss. The program can provide an opportunity to save the project and restart itself normally.

In no event it is impossible to rely on malloc that it will always be able to allocate memory. It is not known on which platform and how the library will be used. If low memory situation on one platform is exotic, it can be quite a common situation on the another one.

We cannot expect that if malloc returns NULL, then the program will crash. Anything can happen. As I described in the article, the program may write data not by the null address. As a result, some data may be corrupted, which leads to unpredictable consequences. Even memset is dangerous. If padding with data goes in reverse order, first some data gets corrupted, and then the program will crash. But the crash may occur too late. If tainted data is used in parallel threads while the memset function is working, consequences can be fatal. You can get a corrupted transaction in a database or sending commands to the removal of "unnecessary" files. Anything has a chance to happen. I suggest a reader to dream up yourself, what might happen due to the usage of garbage in memory.

Thus, the library has only one correct way of working with the malloc functions. You need to IMMEDIATELY check that the function returned, and if it is NULL, then return an error status.

Additional links

Popular related articles
How PVS-Studio Proved to Be More Attentive Than Three and a Half Programmers

Date: 10.22.2018

Author: Andrey Karpov

Just like other static analyzers, PVS-Studio often produces false positives. What you are about to read is a short story where I'll tell you how PVS-Studio proved, just one more time, to be more atte…
Free PVS-Studio for those who develops open source projects

Date: 12.22.2018

Author: Andrey Karpov

On the New 2019 year's eve, a PVS-Studio team decided to make a nice gift for all contributors of open-source projects hosted on GitHub, GitLab or Bitbucket. They are given free usage of PVS-Studio s…
Characteristics of PVS-Studio Analyzer by the Example of EFL Core Libraries, 10-15% of False Positives

Date: 07.31.2017

Author: Andrey Karpov

After I wrote quite a big article about the analysis of the Tizen OS code, I received a large number of questions concerning the percentage of false positives and the density of errors (how many erro…
PVS-Studio for Java

Date: 01.17.2019

Author: Andrey Karpov

In the seventh version of the PVS-Studio static analyzer, we added support of the Java language. It's time for a brief story of how we've started making support of the Java language, how far we've co…
The Last Line Effect

Date: 05.31.2014

Author: Andrey Karpov

I have studied many errors caused by the use of the Copy-Paste method, and can assure you that programmers most often tend to make mistakes in the last fragment of a homogeneous code block. I have ne…
The way static analyzers fight against false positives, and why they do it

Date: 03.20.2017

Author: Andrey Karpov

In my previous article I wrote that I don't like the approach of evaluating the efficiency of static analyzers with the help of synthetic tests. In that article, I give the example of a code fragment…
Appreciate Static Code Analysis!

Date: 10.16.2017

Author: Andrey Karpov

I am really astonished by the capabilities of static code analysis even though I am one of the developers of PVS-Studio analyzer myself. The tool surprised me the other day as it turned out to be sma…
Static analysis as part of the development process in Unreal Engine

Date: 06.27.2017

Author: Andrey Karpov

Unreal Engine continues to develop as new code is added and previously written code is changed. What is the inevitable consequence of ongoing development in a project? The emergence of new bugs in th…
Technologies used in the PVS-Studio code analyzer for finding bugs and potential vulnerabilities

Date: 11.21.2018

Author: Andrey Karpov

A brief description of technologies used in the PVS-Studio tool, which let us effectively detect a large number of error patterns and potential vulnerabilities. The article describes the implementati…
The Ultimate Question of Programming, Refactoring, and Everything

Date: 04.14.2016

Author: Andrey Karpov

Yes, you've guessed correctly - the answer is "42". In this article you will find 42 recommendations about coding in C++ that can help a programmer avoid a lot of errors, save time and effort. The au…

Comments (0)

Next comments

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
This website uses cookies and other technology to provide you a more personalized experience. By continuing the view of our web-pages you accept the terms of using these files. If you don't want your personal data to be processed, please, leave this site.
Learn More →
Accept