To get a trial key
fill out the form below
Team License (standard version)
Enterprise License (extended version)
* By clicking this button you agree to our Privacy Policy statement

** This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Request our prices
New License
License Renewal
--Select currency--
USD
EUR
GBP
RUB
* By clicking this button you agree to our Privacy Policy statement

** This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
To get the licence for your open-source project, please fill out this form
* By clicking this button you agree to our Privacy Policy statement

** This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
To get the licence for your open-source project, please fill out this form
* By clicking this button you agree to our Privacy Policy statement

** This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
I am interested to try it on the platforms:
* By clicking this button you agree to our Privacy Policy statement

** This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Message submitted.

Your message has been sent. We will email you at


If you haven't received our response, please do the following:
check your Spam/Junk folder and click the "Not Spam" button for our message.
This way, you won't miss messages from our team in the future.

>
>
>
Null Pointer Dereferencing Causes Undef…

Null Pointer Dereferencing Causes Undefined Behavior

Feb 16 2015
Author:

I have unintentionally raised a large debate recently concerning the question of whether it is legal in C/C++ to use the &P->m_foo expression with P being a null pointer. The programmers' community divided into two camps. The first claimed with confidence that it isn't legal, while the others were as sure that it is. Both parties gave various arguments and links, and it occurred to me that at some point I had to make things clear. For that purpose, I contacted Microsoft MVP experts, and the Visual C++ Microsoft development team communicating through a closed mailing list. They helped me to prepare this article and now everyone interested is welcome to read it. For those who can't wait to learn the answer: That code is NOT correct.

0306_Reflections_Null_Pointer_Dereferencing2/image1.png

Debate history

It all started with an article about a Linux kernel check with the PVS-Studio analyzer. But the issue doesn't have anything to do with the check itself. The point is that in that article I cited the following fragment from Linux' code:

static int podhd_try_init(struct usb_interface *interface,
        struct usb_line6_podhd *podhd)
{
  int err;
  struct usb_line6 *line6 = &podhd->line6;

  if ((interface == NULL) || (podhd == NULL))
    return -ENODEV;
  ....
}

I called this code dangerous because I thought it to cause undefined behavior.

After that, I got a pile of emails and comments, readers objecting to that idea of mine, and was even close to giving in to their convincing arguments. For instance, as proof of that code being correct they pointed out the implementation of the offsetof macro, typically looking like this:

#define offsetof(st, m) ((size_t)(&((st *)0)->m))

We deal with null pointer dereferencing here, but the code still works well. There were also some other emails reasoning that since there had been no access by null pointer, there was no problem.

Although I tend to be gullible, I still try to double-check any information I may doubt. I started investigating the subject, and eventually wrote a small article: "Reflections on the Null Pointer Dereferencing Issue".

Everything suggested that I had been right: One cannot write code like that. But I didn't manage to provide convincing proof for my conclusions, and cite the relevant excerpts from the standard.

After publishing that article, I was again bombarded by protesting emails, so I thought I should figure it all out once and for all. I addressed language experts with a question, to find out their opinions. This article is a summary of their answers.

About C

The '&podhd->line6' expression is undefined behavior in the C language when 'podhd' is a null pointer.

The C99 standard says the following about the '&' address-of operator (6.5.3.2 "Address and indirection operators"):

The operand of the unary & operator shall be either a function designator, the result of a [] or unary * operator, or an lvalue that designates an object that is not a bit-field and is not declared with the register storage-class specifier.

The expression 'podhd->line6' is clearly not a function designator, the result of a [] or * operator. It is an lvalue expression. However, when the 'podhd' pointer is NULL, the expression does not designate an object since 6.3.2.3 "Pointers" says:

If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.

When "an lvalue does not designate an object when it is evaluated, the behavior is undefined" (C99 6.3.2.1 "Lvalues, arrays, and function designators"):

An lvalue is an expression with an object type or an incomplete type other than void; if an lvalue does not designate an object when it is evaluated, the behavior is undefined.

So, the same idea in brief:

When -> was executed on the pointer, it evaluated to an lvalue where no object exists, and as a result the behavior is undefined.

About C++

In the C++ language, things are absolutely the same. The '&podhd->line6' expression is undefined behavior here when 'podhd' is a null pointer.

The discussion at WG21 (232. Is indirection through a null pointer undefined behavior?), to which I referred to in the previous article, brings in some confusion. The programmers participating in it insist that this expression is not undefined behavior. However, no one has found any clause in the C++ standard permitting the use of "podhd->line6" with "podhd" being a null pointer.

The "podhd" pointer fails the basic constraint (5.2.5/4, second bullet) that it must designate an object. No C++ object has nullptr as address.

Summing it all up

struct usb_line6 *line6 = &podhd->line6;

This code is incorrect in both C and C++, when the podhd pointer equals 0. If the pointer equals 0, undefined behavior occurs.

The program running well is pure luck. Undefined behavior may take different forms, including program execution in just the way the programmer expected. It's just one of the special cases of undefined behavior, and that's all.

You cannot write code like that. The pointer must be checked before being dereferenced.

Additional ideas and links

  • When considering the idiomatic implementation of the 'offsetof()' operator, one must take into account that a compiler implementation is permitted to use what would be non-portable techniques to implement its functionality. The fact that a compiler's library implementation uses the null pointer constant in its implementation of 'offsetof()' doesn't make it OK for user code to use '&podhd->line6' when 'podhd' is a null pointer.
  • GCC can / does optimize, assuming no undefined behavior ever occurs, and would remove the null checks here -- the Kernel compiles with a bunch of switches to tell the compiler not to do this. As an example, the experts refer to the article "What Every C Programmer Should Know About Undefined Behavior #2/3".
  • You may also find it interesting that a similar use of a null pointer was involved in a kernel exploit with the TUN/TAP driver. See "Fun with NULL pointers". The major difference that might cause some people to think the similarity doesn't apply is that in the TUN/TAP driver bug, the structure field that the null pointer accessed was explicitly taken as a value to initialize a variable, instead of simply having the address of the field taken. However, as far as standard C goes, taking the address of the field through a null pointer is still undefined behavior.
  • Is there any case when writing &P->m_foo where P == nullptr is OK? Yes, for example when it is an argument of the sizeof operator: sizeof(&P->m_foo).

Acknowledgements

This article was made possible thanks to the experts whose competence I can see no reason to doubt. I want to thank the following people for helping me in writing it:

  • Michael Burr is a C/C++ enthusiast who specializes in systems level and embedded software including Windows services, networking, and device drivers. He can often be found on the StackOverflow community answering questions about C and C++ (and occasionally fielding the easier C# questions). He has 6 Microsoft MVP awards for Visual C++.
  • Billy O'Neal is a (mostly) C++ developer, and contributor to StackOverflow. He is a Microsoft Software Development Engineer on the Trustworthy Computing Team. He has worked at several security related places previously, including Malware Bytes and PreEmptive Solutions.
  • Giovanni Dicanio is a computer programmer, specializing in Windows operating system development. Giovanni wrote computer programming articles on C++, OpenGL, and other programming subjects on Italian computer magazines. He contributed code to some open-source projects as well. Giovanni likes helping people solving C and C++ programming problems on Microsoft MSDN forums, and recently on StackOverflow. He has 8 Microsoft MVP awards for Visual C++.
  • Gabriel Dos Reis is a Principal Software Development Engineer at Microsoft. He is also a researcher and a longtime member of the C++ community. His research interests include programming tools for dependable software. Prior to joining Microsoft, he was Assistant Professor at Texas A&M University. Dr. Dos Reis was a recipient of the 2012 National Science Foundation CAREER award for his research in compilers for dependable computational mathematics and educational activities. He is a member of the C++ standardization committee.

References

Popular related articles
PVS-Studio for Java

Date: Jan 17 2019

Author: Andrey Karpov

In the seventh version of the PVS-Studio static analyzer, we added support of the Java language. It's time for a brief story of how we've started making support of the Java language, how far we've co…
The way static analyzers fight against false positives, and why they do it

Date: Mar 20 2017

Author: Andrey Karpov

In my previous article I wrote that I don't like the approach of evaluating the efficiency of static analyzers with the help of synthetic tests. In that article, I give the example of a code fragment…
PVS-Studio ROI

Date: Jan 30 2019

Author: Andrey Karpov

Occasionally, we're asked a question, what monetary value the company will receive from using PVS-Studio. We decided to draw up a response in the form of an article and provide tables, which will sho…
Static analysis as part of the development process in Unreal Engine

Date: Jun 27 2017

Author: Andrey Karpov

Unreal Engine continues to develop as new code is added and previously written code is changed. What is the inevitable consequence of ongoing development in a project? The emergence of new bugs in th…
How PVS-Studio Proved to Be More Attentive Than Three and a Half Programmers

Date: Oct 22 2018

Author: Andrey Karpov

Just like other static analyzers, PVS-Studio often produces false positives. What you are about to read is a short story where I'll tell you how PVS-Studio proved, just one more time, to be more atte…
Characteristics of PVS-Studio Analyzer by the Example of EFL Core Libraries, 10-15% of False Positives

Date: Jul 31 2017

Author: Andrey Karpov

After I wrote quite a big article about the analysis of the Tizen OS code, I received a large number of questions concerning the percentage of false positives and the density of errors (how many erro…
Technologies used in the PVS-Studio code analyzer for finding bugs and potential vulnerabilities

Date: Nov 21 2018

Author: Andrey Karpov

A brief description of technologies used in the PVS-Studio tool, which let us effectively detect a large number of error patterns and potential vulnerabilities. The article describes the implementati…
The Last Line Effect

Date: May 31 2014

Author: Andrey Karpov

I have studied many errors caused by the use of the Copy-Paste method, and can assure you that programmers most often tend to make mistakes in the last fragment of a homogeneous code block. I have ne…
The Evil within the Comparison Functions

Date: May 19 2017

Author: Andrey Karpov

Perhaps, readers remember my article titled "Last line effect". It describes a pattern I've once noticed: in most cases programmers make an error in the last line of similar text blocks. Now I want t…
The Ultimate Question of Programming, Refactoring, and Everything

Date: Apr 14 2016

Author: Andrey Karpov

Yes, you've guessed correctly - the answer is "42". In this article you will find 42 recommendations about coding in C++ that can help a programmer avoid a lot of errors, save time and effort. The au…

Comments (0)

Next comments

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
This website uses cookies and other technology to provide you a more personalized experience. By continuing the view of our web-pages you accept the terms of using these files. If you don't want your personal data to be processed, please, leave this site.
Learn More →
Accept