Andrey Karpov

Feb 06 2024

Tags:

#Cpp #Knowledge

Shall we check pointer for NULL before calling free function?

Feb 06 2024

Author: Andrey Karpov

The free function
Where do questions about preliminary pointer checking come from?
Cringey macro
Zeroing pointer after free
Conclusion
Additional links

The short answer is no. However, since this question keeps popping up on Reddit, Stack Overflow, and other websites, it's time to address the topic. There are a lot of interesting things to ponder over.

The free function

The free function is declared in the <stdlib.h> header file as follows:

void free( void *ptr );

The function frees the memory buffer that was previously allocated using the malloc, calloc, realloc, and aligned_alloc functions.

If ptr is a null pointer, the function does nothing.

So, there's no need to check the pointer before calling free.

if (ptr)     // a redundant check
  free(ptr);

Such code is redundant because the check serves no useful purpose. If the pointer is null, you can safely pass it to the free function. The developers of the C standard deliberately chose this:

cppreference.com: free

The function accepts (and does nothing with) the null pointer to reduce the amount of special-casing

If the pointer is non-null but still invalid, then the check doesn't protect against anything. An invalid non-null pointer is still passed to the free function, resulting in undefined behavior.

cppreference.com: free

The behavior is undefined if the value of ptr does not equal a value returned earlier by malloc(), calloc(), realloc(), or aligned_alloc() (since C11).

The behavior is undefined if the memory area referred to by ptr has already been deallocated, that is, free(), free_sized(), free_aligned_sized(), or realloc() has already been called with ptr as the argument and no calls to malloc(), calloc(), realloc(), or aligned_alloc() resulted in a pointer equal to ptr afterwards.

So, it's possible and it's better to write simply:

free(ptr);

Where do questions about preliminary pointer checking come from?

Documentation for the free function explicitly states that you can pass a null pointer to it and it's safe. However, discussions on this topic continue to appear on various websites. Questions fall into two categories.

Beginners' questions. These types of questions are the most common. It's simple: people are just learning programming and haven't yet figured out when to check pointers and when not to. A simple explanation is enough for them. When you allocate memory using malloc, check the pointer. Otherwise, dereferencing a null pointer may result in undefined behavior. Before the memory is freed (using free), there's no need to check the pointer, because the function does that itself.

Well, that's it. Unless you can advise a beginner to use an online analyzer to find out what's wrong with their code faster.

Questions asked by experienced and overly meticulous programmers. This is where things get interesting. These people know what's in the documentation. However, they still ask because they aren't sure that calling free(NULL) is always safe. They worry about compiling their code on very old systems, where free doesn't guarantee safe null pointer handling. Or that a particular third-party library that implements free in a non-standard way (by not checking for NULL) may be used.

We can discuss the theory. In reality, however, it doesn't make sense.

Let's start with ancient systems. Firstly, it's not that easy even to find such a system. The first C89 standard states that the free function must safely handle NULL.

C89: 4.10.3.2 The free function.

The free function causes the space pointed to by ptr to be deallocated, that is, made available for further allocation. If ptr is a null pointer, no action occurs.

Secondly, if you encounter a "pre-standard" system, you probably won't be able to build your application for a variety of reasons. I also doubt you would ever need to do that. The issue seems far-fetched.

Now, imagine that the system isn't prehistoric but, let's say, special. An unusual third-party library of system functions, which implements the free function in its own way: you can't pass NULL to.

In that case, broken free isn't your biggest problem. If one of the basic language functions is broken in the library, then you have so many other broken things to worry about besides the safe call to free.

It's like getting into a DIY car with brakes that don't work, a jammed steering wheel, no rearview mirrors, and worrying about whether the terminals (contacts) are securely connected to the battery. Terminals are important, but they're not the problem, the situation as a whole is.

Sometimes the topic of preliminary pointer checking is discussed from the code micro-optimization perspective: "You can avoid calling the free function if you check the pointer yourself". This is a case where perfectionism is definitely working against you. We'll explore this idea in more detail below.

Cringey macro

The most useless and potentially dangerous thing you can do is to implement pointer check using such a macro:

#define SAFE_FREE(ptr) if (ptr) free(ptr)

We even have an interview question: "What's wrong with this macro?" Everything is wrong with it. It seems that such a macro shouldn't exist in reality, but we've encountered it in projects. Let's take it apart piece by piece.

Firstly, this macro is a redundant entity.

As I said earlier, the free function safely handles null pointers, and this check doesn't provide any additional safety when working with pointers.

Secondly, this macro is also redundant in terms of micro-optimizations.

I read in the comments that the additional check optimizes the code a bit, since the compiler doesn't have to make an expensive call to the free function. I think this is nonsense, not an optimization.

The cost of the function call is exaggerated. In any case, it's nothing compared to resource-intensive operations like allocating and deallocating memory buffers. When it comes to optimization, it's worth working on reducing the number of memory allocation operations instead of doing the check before calling the free function.

The standard scenario in programming is to successfully allocate memory and then free it again. Null pointers passed to free are most likely special, rare, and non-standard cases. There's no point in "optimizing" them. An additional check would most likely be a pessimization. This is because two checks are now performed instead of one before memory is released. Maybe the compiler will optimize it, but then it's even more unclear why all the fuss. By the way, since we're talking about optimizations, the manual optimization using macros looks naive and useless. It's better to write simple, understandable code instead of trying to do micro-optimizations that a compiler does better than a human.

I think this attempt at unnecessary optimization perfectly confirms Donald Knuth's famous statement:

There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

Thirdly, the macro causes errors.

When you use the macro, it's very easy to write incorrect code.

#define SAFE_FREE(ptr) if (ptr) free(ptr)
....
if (A)
  SAFE_FREE(P);
else
  foo();

The code doesn't look the way it works. Let's expand the macro.

if (A)
  if (P) free(P);
else
  foo();

The else statement refers to the second if statement and is executed when the pointer is null. Well, things don't work as intended. The SAFE_FREE macro turned out to be not so "SAFE".

There are other ways to accidentally create incorrect code. Let's look at the code for cleaning up a two-dimensional array.

int **A = ....;
....
int **P = A;

for (....)
  SAFE_FREE(*P++);
SAFE_FREE(A);

Yes, the code is a bit far-fetched, but it shows how dangerous the macro is when it deals with complex expressions. One pointer is checked and the next pointer after it is freed:

for (....)
  if (*P++) free(*P++);

There's also an array overrun.

All in all, it's bad.

Can we fix a macro?

We can, although we don't need to. Let's look at possible ways of fixing it for educational purposes only. That's also one of our interview questions.

First, we need to protect the macro from the else issue. The easiest but most ineffective way is to add braces:

#define SAFE_FREE(ptr) { if (ptr) free(ptr); }

The code we discussed above will no longer compile (error: 'else' without a previous 'if'):

if (A)
  SAFE_FREE(P);
else
  foo();

So, we can use the following trick:

#define SAFE_FREE(ptr) do { if (ptr) free(ptr); } while(0)

Now the code compiles again. The first issue is resolved, but what about the re-calculations? A recommended standard solution doesn't exist. However, there are workarounds if you really want one.

A similar issue arises when implementing macros such as max. Here's an example:

#define max(a, b) ((a) > (b) ? (a) : (b))
....
int X = 10;
int Y = 20;
int Z = max(X++, Y++);

21 instead of 20 will be written to the Z variable because the Y variable will have been incremented by the time it's selected:

int X = 10;
int Y = 20;
int Z = ((X++) > (Y++) ? (X++) : (Y++));

To avoid this, you can use magic — the GCC compiler extension: referring to a Type with typeof.

#define max(a,b) \
  ({ typeof (a) _a = (a); \
      typeof (b) _b = (b); \
    _a > _b ? _a : _b; })

The point is to copy values into temporary variables and thus eliminate the repeated calculation of expressions. The typeof operator is an analog of decltype from C++, but for C.

Again, please note that this is a non-standard extension. I don't recommend using it unless you really need to.

Let's apply this method to SAFE_FREE:

#define SAFE_FREE(ptr) do { \
  typeof(ptr) copy = (ptr); \
  if (copy) \
    free(copy); \
} while(0)

It works. Even though we created terrible, unbearable, and actually unnecessary code to do it.

A more elegant solution is to convert the macro to a function. This way we can eliminate the issues discussed above and simplify the code:

void SAFE_FREE(void *ptr)
{
  if (ptr)
    free(ptr);
}

Wait, wait, wait! We're back to the function call again! Except now, we have an extra function layer. The free function does the same job of checking the pointer.

So, the best way to fix the SAFE_FREE macro is to remove it!

Zeroing pointer after free

There is one topic that has almost nothing to do with pointer checking, but let's discuss it as well. Some programmers recommend zeroing the pointer after memory is freed. Just in case.

free(pfoo);
pfoo = NULL;

You could say that the code is written in the defensive programming paradigm. I'm talking about additional optional actions that sometimes insure against errors.

In our case, if the pfoo pointer isn't used, there's no point in zeroing it. However, we can do this for the following reasons.

Pointer access. If data is accidentally written to the pointer, it's not a memory corruption, but a null pointer dereference. Such an error is detected and corrected more quickly. The same thing happens when reading data from a pointer.

Double-free. Zeroing the pointer protects against errors when the buffer is freed again. However, the benefits aren't as clear as they first appear. Let's look at the code that contains the error:

float *ptr1;
char *ptr2;
....
free(ptr1);
ptr1 = NULL;
....
free(ptr1);  // the ptr2 variable should have been used here
ptr2 = NULL;

A programmer made a typo: instead of writing ptr2, they used the ptr1 pointer to free memory again. Due to zeroing the ptr1 pointer, nothing happens when you free it again. The code is protected from the double-free error. On the other hand, zeroing the pointer hides the error deeper. There's a memory leak that can be difficult to detect.

Defensive programming is criticized because of such cases (masking errors, replacing one error with another). It's a big topic, and I'm not ready to dive into it. However, I think it's a right thing to warn you about the downsides of defensive programming.

What's the best way to proceed if you decide to zero out pointers after freeing memory?

Let's start with the dangerous way:

#define FREE_AND_CLEAR(ptr) do { \
  free(ptr); \
  ptr = NULL; \
} while(0)

The macro isn't designed to be used this way:

int **P = ....;
for (....)
  FREE_AND_CLEAR(*P++);

One pointer is freed and the next pointer is zeroed out. Let's polish the macro:

#define FREE_AND_CLEAR(ptr)  do { \
  void **x = &(ptr); \
  free(*x); \
  *x = NULL; \
} while(0)

It does the trick, but frankly, this kind of macro isn't my jam. I'd rather explicitly zero out the pointer:

int **P = ....;
for (....)
{
  free(*P);
  *P = NULL;
  P++;
}

The code above is too long, I don't like it as well. There's no macro magic, though. I don't like macros. The fact that the code is so long and ugly is a good reason to consider rewriting it. Is it really necessary to iterate through and release pointers in such a clumsy way? Perhaps we could make the code more elegant. This is a good reason to do some refactoring.