
C++ programmer's guide to undefined behavior: part 8 of 11

Nov 01 2024

We invite you to read the eighth part of an e-book on undefined behavior. This is not a textbook: it's intended for those who are already familiar with C++ programming. It's a kind of C++ programmer's guide to undefined behavior and to its most secret and exotic corners. The book was written by Dmitry Sviridkin and edited by Andrey Karpov.


Program execution: infinite loops and halting problem

In general, it's algorithmically impossible to determine whether or not a program terminates on a given data set.

However, for some reason, the C and C++ standards state that a valid program must either terminate or keep producing observable effects: performing I/O, accessing volatile variables, and so on. Otherwise, the program behavior is undefined. So, "real" C++ compilers are so hardcore that they can solve algorithmically intractable problems.

If a program contains an infinite loop, and the compiler decides that this loop has no visible effect, then it's meaningless and can be discarded.

Here's an interesting example of how to "disprove" Fermat's Last Theorem:

#include <iostream>

int fermat () {
  const int MAX = 1000;
  int a=1,b=1,c=1;
  while (1) {
    if ( (a*a*a) == (b*b*b) + (c*c*c) ) return 1;
    a++;
    if (a>MAX) {
      a=1;
      b++;
    }
    if (b>MAX) {
      b=1;
      c++;
    }
    if (c>MAX) {
      c=1;
    }
  }
  return 0;
}

int main () {
  if (fermat()) {
    std::cout <<
      "Fermat's Last Theorem has been disproved.\n";
  } else {
     std::cout <<
       "Fermat's Last Theorem has not been disproved.\n";
  }
  return 0;
}

By building the code using GCC 14.1 and the -O3 flag, we get the following message: Fermat's Last Theorem has been disproved.

The compiler saw that the only way out of the loop is return 1. The loop has no visible effects, so the compiler just replaced it with return 1.

If we try to find out which triplet the program "has found", for example by printing a, b, and c, the loop comes back.

In a constexpr context, we get a compilation error instead. The compiler stops when a certain analysis depth is exceeded: 'constexpr' loop iteration count exceeds the limit of 262144 (use '-fconstexpr-loop-limit=' to increase the limit).
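
For illustration, here's a small self-contained sketch of my own (not the fermat function itself) that trips the same limit with a trivial non-terminating loop:

// In a constant evaluation, an endless loop is a hard error, not UB.
constexpr int spin(int i) {
  while (i >= 0) { ++i; }  // never exits for a non-negative argument
  return i;
}

constexpr int value = spin(0);
// GCC: error: 'constexpr' loop iteration count exceeds the limit of 262144
//      (use '-fconstexpr-loop-limit=' to increase the limit)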

It may seem as if the issue is that the condition for the loop to continue doesn't depend on its body. However, the loop disappears in the fixed version as well:

int fermat() {
  const int MAX = 1000;
  int a=1,b=1,c=1;
  while ((a*a*a) != ((b*b*b)+(c*c*c))) {
    a++;
    if (a>MAX) {
      a=1;
      b++;
    }
    if (b>MAX) {
      b=1;
      c++;
    }
    if (c>MAX) {
      c=1;
    }
  }
  return 1;
}

Even if there are I/O operations in the loop, it may still disappear if the compiler sees that these operations are idempotent in the loop.

#include <iostream>

int fermat () {
  const int MAX = 1000;
  int a=1,b=1,c=1;
  while (1) {
    if ( (a*a*a) == (b*b*b) + (c*c*c) ) {
      std::cout << "Found!\n";
      return 1;
    }
    a++;
    if (a>MAX) {
      a=1;
      b++;
    }
    if (b>MAX) {
      b=1;
      c++;
    }
    if (c>MAX) {
      c=1;
    }
  }
  return 0;
}

We build the code using GCC 14.1 -O3 -std=c++20 and get the following:

Found!
Fermat's Last Theorem has been disproved.

So, in C and C++ one shouldn't assume that a program is guaranteed to loop forever in certain cases and write tests for those cases. Nor can one rely on debug prints added on the fly. Tests that check that the program doesn't hang may turn out to be useless as well.


Program execution: recursion

Many algorithms are very nicely and efficiently written in recursive form: sorting, graph traversal, string algorithms.

However, recursion requires space to store intermediate state, either on the heap or on the stack. Of course, there are tail calls, which can naturally be optimized into loops, but the standard doesn't guarantee this. And recursion isn't always a tail call.

A stack overflow isn't exactly undefined behavior, but it's definitely not something you want to see in a release build. This is why in large applications, devs prefer iterative algorithms to recursive ones, unless, of course, there's a guarantee that the recursion depth is small.

When it comes to eliminating recursion from a program, one needs to be very careful. This applies not only to the correct algorithm implementation. Besides algorithms, data structures can also be recursive. That's where RAII, the rule of zero, the order in which destructors are called, and noexcept come into play.

#include <vector>

struct Node {
  int value = 0;
  std::vector<Node> children;
};

This structure is perfectly legitimate for defining a tree, it's compilable, and it works. It also may be more convenient than using smart pointers.

We don't need to manually manage resources in any way, the vector takes care of everything on its own. We use the "rule of zero" and write no destructor, no copy constructor, no move/copy operator, nothing. It's beautiful!

However, the destructor generated by the compiler will be recursive! And if the tree is too deep, we will get a stack overflow.
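
As an illustration, here's a minimal sketch of my own (it assumes the Node definition above): a degenerate "tree" that is really a deep chain.

int main() {
  Node root;
  Node* current = &root;
  for (int i = 0; i < 1'000'000; ++i) {
    current->children.resize(1);          // one child per level
    current = &current->children.back();  // descend
  }
  return 0;
} // here ~Node() recurses about a million levels deep -> stack overflow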

Okay, so we write our own destructor: we need a queue to traverse the tree vertices... A queue means memory allocation, and allocation is an operation that can throw. So now we have a destructor that throws exceptions, which isn't good at all.

We can write a destructor without allocations and recursion, but it will have quadratic complexity (a sketch follows the list):

  • We find a vertex whose last child in the children vector is a leaf;
  • We remove that leaf from its parent's vector;
  • We repeat until the tree is empty.
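
Here is a minimal sketch of that approach; it's my reading of the steps above, not the author's code:

struct Node {
  int value = 0;
  std::vector<Node> children;

  ~Node() {
    while (!children.empty()) {
      // Walk down to a vertex whose last child is a leaf...
      Node* parent = this;
      while (!parent->children.back().children.empty()) {
        parent = &parent->children.back();
      }
      // ...and remove that leaf. pop_back() neither allocates nor
      // recurses more than one level deep, but the repeated walk
      // from the root makes the whole destructor quadratic.
      parent->children.pop_back();
    }
  }
};

Note that declaring the destructor suppresses the implicitly generated move operations, so in real code you'd default them as well, just like the List example below does.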

The issue also remains for a regular linked list:

#include <memory>

struct List {
  int value = 0;
  std::unique_ptr<List> next;
};

However, in this case the recursive call is a tail call, so we can hope the optimizer will handle the recursion. But we also run tests on debug builds, don't we?

So, we write a destructor and all other special member functions (in this case, only move operations):

struct List {
  int value = 0;
  std::unique_ptr<List> next;

  ~List() {
    while (next) {
      // The destructor is still recursive,
      // but now the recursion depth is 1 call.
      next = std::move(next->next);
    }
  }

  List() noexcept = default;
  List(List&&) noexcept = default;
  List& operator=(List&&) noexcept = default;
};
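
A quick usage sketch of my own (assuming the List above and <memory>): a long chain no longer overflows the stack when it's destroyed.

int main() {
  List head;
  List* tail = &head;
  for (int i = 0; i < 1'000'000; ++i) {
    tail->next = std::make_unique<List>();
    tail = tail->next.get();
  }
  return 0;
} // head's destructor dismantles the chain in a loop, not via deep recursion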

One has to be very careful with recursive data structures in C++. It's hard to write them in the "obvious" way in Rust for a reason.

Program execution: false noexcept

Since C++11, we can mark functions and member functions with the noexcept specifier, thus telling the compiler that the function or member function doesn't throw exceptions.

Everything seems to be fine: with this information, the compiler can skip generating the additional instructions needed for stack unwinding. Binaries get smaller, and programs get faster.

But the issue is that this specifier doesn't force compilers to check that a function really doesn't throw exceptions.

If we mark a function as noexcept, and it throws an exception, something strange will happen, resulting in a sudden std::terminate.

For example, try-catch blocks will suddenly stop working:

#include <cstdlib>    // rand
#include <stdexcept>  // std::runtime_error

void may_throw(){
  throw std::runtime_error("wrong noexcept");
}

struct WrongNoexcept {
  WrongNoexcept() noexcept {
     may_throw();
  }
};

// Attempts to wrap this function or any code using it
// in try-catch are useless.
void throw_smth() {
  if (rand() % 2 == 0) {
    throw std::runtime_error("throw");
  } else {
    WrongNoexcept w;
  }
}

If we build this code using GCC or Clang and run it, the program crashes abruptly:

terminate called after throwing an instance of 'std::runtime_error'
  what():  wrong noexcept
Program terminated with signal: SIGSEGV

It can be very difficult to understand why this happened if the code is spread across different translation units.
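
To make the point concrete, here's a sketch of a hypothetical caller of my own, building on the snippet above:

#include <iostream>

// The catch below is reached only for the plain `throw` branch.
int main() {
  try {
    throw_smth();
  } catch (const std::exception& e) {
    // When WrongNoexcept is constructed instead, std::terminate fires
    // at the noexcept boundary and we never get here.
    std::cout << "caught: " << e.what() << "\n";
  }
}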

Conditional noexcept

In C++, the language likes to skimp on keywords, so many of them are overloaded:

  • = 0 is for declaring pure virtual functions;
  • the new requires has two meanings, creating the strange requires(requires(...)) constructions;
  • auto is for both automatic type inference and switching to trailing return type;
  • decltype, which has different meanings when applied to a variable and to an expression;
  • and, of course, noexcept, which is overloaded in exactly the same way as requires.

There's the noexcept(condition) specifier. A simple noexcept is syntactic sugar for the noexcept(true) construct.

Then there's the noexcept(expr) operator, a predicate that checks whether the expr expression can throw exceptions: either it can't by its very nature (adding integers, for example), or everything it calls is marked noexcept.
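
A few illustrative assertions of my own showing how the noexcept operator behaves; can_throw and cannot_throw are hypothetical declarations:

void can_throw();
void cannot_throw() noexcept;

static_assert(noexcept(1 + 2));           // arithmetic can't throw
static_assert(!noexcept(new int));        // allocation may throw std::bad_alloc
static_assert(!noexcept(can_throw()));    // not marked noexcept
static_assert(noexcept(cannot_throw()));  // marked noexcept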

Together, they form a construct for adding the conditional noexcept:

void fun() noexcept(noexcept(used_expr))

#include <cstdlib>    // rand
#include <stdexcept>  // std::runtime_error

void may_throw(){
  throw std::runtime_error("wrong noexcept");
}

struct ConditionalNoexcept {
  ConditionalNoexcept() noexcept(noexcept(may_throw())) {
     may_throw();
  }
};

// Now this function is fine.
void throw_smth() {
  if (rand() % 2 == 0) {
    throw std::runtime_error("throw");
  } else {
    ConditionalNoexcept w;
  }
}

To avoid issues, use conditional noexcept always and everywhere, and carefully check every function you use. Or don't use noexcept at all. In the second case, however, it's worth remembering that both move operations and swap should be marked as noexcept (and really be noexcept!) to work efficiently with standard containers.
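
Here's a small sketch of my own showing why this matters; Quiet and Noisy are illustrative names, not from the book:

#include <type_traits>

// Two otherwise identical types; only the noexcept on the move constructor differs.
struct Quiet {
  Quiet(const Quiet&);
  Quiet(Quiet&&) noexcept;
};

struct Noisy {
  Noisy(const Noisy&);
  Noisy(Noisy&&);  // may throw, as far as the compiler knows
};

// When reallocating, std::vector uses std::move_if_noexcept: Quiet elements
// are moved, Noisy elements are copied to preserve the strong exception guarantee.
static_assert(std::is_nothrow_move_constructible_v<Quiet>);
static_assert(!std::is_nothrow_move_constructible_v<Noisy>);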

Don't forget to write negative tests. You may miss a false noexcept and get std::terminate in the release build without them.

Also, note a subtle and unpleasant thing: if you really need to throw exceptions from the destructor, make sure you explicitly write noexcept(false) in its declaration. By default, all your functions and member functions are implicitly marked as noexcept(false), but in C++, an exception is made for destructors. They're implicitly marked as noexcept(true). So:

#include <stdexcept>

struct SoBad {
  // invokes std::terminate
  ~SoBad() {
     throw std::runtime_error("so bad dtor");
  }
};

struct NotSoBad {
  // OK
  ~NotSoBad() noexcept(false) {
    throw std::runtime_error("not so bad dtor");
  }
};


Program execution: buffer overflow

Buffer overflows and array overruns are nasty errors that not only cause simple program crashes, but also create security holes that enable attackers to go where they shouldn't, or even execute arbitrary code.

The standard C library, which C++ has inherited, has many "leaky" functions that can cause a buffer overflow if the programmer hasn't bothered to check all possible and impossible options:

  • scanf("%s", buf): no buffer size check at all;
  • strcpy(dst, src): no buffer size check;
  • strcat(dst, src): no buffer size check;
  • gets(str): no buffer size check;
  • memcpy(dst, src, n): the size of dst has to be checked manually;
  • strncat(dst, src, count): not only are manual checks needed, but one should also remember that the last argument isn't the buffer size; it's how many characters can still be appended to the buffer. This is a common source of confusion (see the sketch after this list).
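
As an illustration of the last item, here's a small sketch of my own (append_name is a hypothetical helper):

#include <cstring>

void append_name(const char* name) {
  char buf[16] = "Hello, ";
  // The count is how many characters may still be appended (strncat also
  // writes a terminating '\0' on top of that), not the total buffer size.
  // Hence the usual safe pattern:
  std::strncat(buf, name, sizeof(buf) - std::strlen(buf) - 1);
}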

There are many other functions that mainly work with strings.

These functions have caused and continue to cause problems. Some compilers, MSVC for example, refuse to build your code by default if they see any of them. Others are less strict and will at most issue a warning; for gets, at least, they certainly do. With the other functions, a programmer can protect themselves (check before calling, or specify a field width to limit the string for scanf), but with gets there's no way.

Most of the old unsafe C functions now have "safe" analogs with buffer sizes. Some of them aren't standardized, and some of them are. All of this has spawned a huge number of macro substitution kludges that deal with the mess. But that's not what we're talking about now.

Size checks are extra work. Generating instructions for them slows down the program. Besides, the programmer could check everything on their own. So, in C and C++, writing or reading outside the array boundaries causes undefined behavior. And boundary violations can come with all sorts of special effects.

In most cases, when a boundary violation does occur, reading beyond the array results either in garbage values or in the simple and much-loved segmentation fault (SIGSEGV).

Sometimes funny things happen:

#include <iostream>

const int N = 10;
int elements[N];

bool contains(int x) {
  for (int i = 0; i <= N; ++i) {
    if (x == elements[i]) {
      return true;
    }
  }
  return false;
}

int main() {
  for (int i = 0; i < N; ++i) {
    std::cin >> elements[i];
  }
  return contains(5);
}

This program, built using GCC with optimizations, always "finds" a five in the array. It doesn't matter what numbers you enter. And neither Clang nor GCC issue any warnings. Well, at least PVS-Studio does:

V557 Array overrun is possible. The value of 'i' index could reach 10.

This special effect occurs for the following reasons:

1. Compilers assume that correct C++ programs are free of UB.

2. In this loop, there will be an array index out of bounds, so this is UB.

for (int i = 0; i <= N; ++i) {
  if (x == elements[i]) {
    return true;
  }
}

3. However, since correct C++ programs are free of UB, execution must never reach the (N+1)-th iteration!

4. So, the loop can only exit via return true.

5. This means that the whole contains function boils down to a single return true. It's optimized!

Here's another case where a finite loop becomes an infinite one:

#include <cstdio>

const int N = 10;
int main() {
  int decade[N];
  for (int k = 0; k <= N; ++k) {
    printf("k is %d\n",k);
    decade[k] = -1;
  }
}

The trick here is just as clever as in the example above:

  • decade[k] = -1; accessing an array element must be UB-free, so k < N;
  • Since k < N, the k <= N loop condition is always true. There's no need to check it. It's optimized!

In these examples, the "<=" operator in the loop conditions should immediately catch your eye. But you can create issues for yourself with the more familiar "<" operator, too: for example, the N constant may simply have nothing to do with the array size. And that's it.

In friendlier, safer languages we'd get a runtime error, a panic, or an exception. In C++, we need to check, check, and check everything again (a short sketch follows the list):

  • Don't use detached, free-standing constants when checking sizes; use std::size() or the size() member function instead.
  • Write fewer raw loops with counters. It's better to use range-based for loops or standard algorithms from #include <algorithm>.
  • When performance isn't critical, don't use operator[]; the container's at() member function, which checks the bounds, is safer.
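
For example, here's a sketch of my own that rewrites the contains example following the list's advice:

#include <algorithm>
#include <array>

std::array<int, 10> elements{};

bool contains(int x) {
  // A range-based loop can't run past the container's real size,
  // unlike a counter loop paired with a detached constant.
  for (int value : elements) {
    if (value == x) {
      return true;
    }
  }
  return false;
}

bool contains_with_algorithm(int x) {
  // Or let a standard algorithm do the bounds bookkeeping entirely.
  return std::find(elements.begin(), elements.end(), x) != elements.end();
}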


Program execution: garbage collector support (not relevant for C++23 and newer)

Yes, you heard it right. Your eyes don't deceive you. I haven't lost my mind. And neither have you. Most likely.

C++ is a unique language. Its standard includes a description of something that will almost certainly not appear in the language. There's garbage collector support, but no garbage collector. And this support is done in the most natural way for C++: by implementing undefined behavior.

Undefined behavior occurs in the following case:

  • you have a pointer to the allocated memory in the heap;
  • this is the only pointer to this memory;
  • you hide it somehow: you destroy the pointer value without freeing the memory, but keep the ability to restore it later;
  • you restore the pointer;
  • dereferencing the restored pointer is undefined behavior.

Well, indeed, if we ever have a garbage collector, destroying the last pointer to an object will let the garbage collector delete that object. So, further access to this object will do no good. The garbage collector may have time to remove it. Or it may not. And here's UB.

But we don't have the garbage collector! None of the compilers support it, but the standard does!

So, for example, if you store some memory-saving meta-information in the low-order bits of a pointer (which you can sometimes do because of alignment), you probably have UB in your program related to garbage collector support. It will probably never go off, but it's there.

#include <cstdint>    // uintptr_t
#include <new>        // operator new, std::align_val_t
#include <stdexcept>  // std::runtime_error
#include <utility>    // std::move

template <class T>
struct MayBeUninitialized {
  static_assert(alignof(T) >= 2);
    
  MayBeUninitialized() {
    // We allocate raw memory by explicitly calling operator new.
    // All this nonsense with garbage collector support is described
    // only for the global operator new; std::malloc, placement new,
    // and the others aren't involved.
    ptr_repr_ = reinterpret_cast<uintptr_t>(
                  operator new (sizeof(T), 
                                std::align_val_t(alignof(T))));
    // The one and only pointer was just created
    // and immediately destroyed.
    ptr_repr_ |= 1; // set the uninitialized flag
  }

  ~MayBeUninitialized() {
    Deinit();
    operator delete(GetPointer(), sizeof(T),
                    std::align_val_t(alignof(T)));
  }

  void Deinit() {
    if (!IsInitialized()) {
      return;
    }
    GetPointer()->~T();
  }

  bool IsInitialized() const {
    return !(ptr_repr_ & 1);
  }

  void Set(T x) {
    Deinit();
    new (GetPointer()) T(std::move(x));
    // drop the uninitialized flag
    ptr_repr_ &= (~static_cast<uintptr_t>(1));
  }


  const T& Get() const {
    if (!IsInitialized()) {
      throw std::runtime_error("not init");
    }
    return *GetPointer(); // UB
  }

private:
  T* GetPointer() const {
    constexpr auto mask = ~static_cast<uintptr_t>(1);
    auto ptr = reinterpret_cast<T*>(ptr_repr_ & mask);
    // The pointer has been restored, but dereferencing it is UB.
    return ptr;
  }

  uintptr_t ptr_repr_;
};

We can eliminate this nonsensical undefined behavior, which is meaningless for the current state of C++, by using the std::declare_reachable and std::undeclare_reachable pair of functions:

MayBeUninitialized() {
  void* ptr = operator new (sizeof(T),
                            std::align_val_t(alignof(T)));
  std::declare_reachable(ptr);
  ptr_repr_ = reinterpret_cast<uintptr_t>(ptr);
  // The one and only pointer was just created
  // and immediately destroyed, but we marked the memory under it
  // as reachable to ward off the mythical garbage collector.
  ptr_repr_ |= 1; // set the uninitialized flag
}
    
~MayBeUninitialized() {
  Deinit();
  void* ptr = GetPointer();
  std::undeclare_reachable(ptr);
  operator delete (ptr, sizeof(T), std::align_val_t(alignof(T)));
}

These functions currently do nothing. We need them only to formally comply with the standard.

If you believe that C++ will have a garbage collector someday, please use these wonderful functions, so that your program will work correctly in the distant future.

If you don't, you can forget about them. This is probably the only UB that doesn't show up anywhere or in any way. And it won't. Most likely. There have even been proposals to remove this "feature", which is utterly out of place in C++, and C++23 finally did.

It's important to realize that a C++ garbage collector isn't something supernatural. For example, garbage collectors for the JVM are written in C and C++. Nothing stops us from using them in C++ programs: we can simply plug in alternative memory allocation functions, or even override the behavior of the operators new and delete (see the sketch below). However, very little C++ code is written with the assumption that a garbage collector works underneath these operators.
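
For example, here's a purely illustrative sketch of that hook point; gc_malloc is a hypothetical name, not a real API:

#include <cstdlib>
#include <new>

// The global operators are replaceable, so a conservative collector
// could plug its own allocator in here instead of malloc/free.
void* operator new(std::size_t size) {
  if (void* p = std::malloc(size)) {  // a GC build would call gc_malloc(size)
    return p;
  }
  throw std::bad_alloc{};
}

void operator delete(void* p) noexcept {
  std::free(p);  // under a tracing GC this could simply be a no-op
}

int main() {
  int* p = new int(42);  // goes through our replacement operator new
  delete p;              // and our replacement operator delete
}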

You can check if your program is running in a bright world with a garbage collector by calling the get_pointer_safety function. It returns one of three values:

  • pointer_safety::strict means that you can't just fiddle with pointer recovery from anywhere. The garbage collector may be working.
  • pointer_safety::relaxed means that there's no issue with pointers, the allocated memory won't go anywhere by itself.
  • pointer_safety::preferred means that there's no issue with pointers; the allocated memory won't go anywhere by itself, but the leak detector that needs the declare_reachable/undeclare_reachable markers may be working.

#include <iostream>
#include <memory>  // std::get_pointer_safety, std::pointer_safety

int main() {
  switch (std::get_pointer_safety())
  {
  case std::pointer_safety::strict:
    std::cout << "strict" << std::endl;
    break;
  case std::pointer_safety::relaxed:
    std::cout << "relaxed" << std::endl;
    break;
  default:
    std::cout << "preferred" << std::endl;
  }
}

I'd like to note that when running this code under valgrind-3.15.0 for Ubuntu 20.04 (x86_64), the displayed message (relaxed) doesn't change in any way.


Author: Dmitry Sviridkin

Dmitry has over eight years of experience in high-performance software development in C and C++. From 2019 to 2021, he taught Linux system programming at SPbU and hands-on C++ courses at HSE. He currently works as a Software Engineer at AWS (CloudFront), doing system and embedded development in Rust and C++ for edge servers. His main area of interest is software security.

Editor: Andrey Karpov

Andrey has over 15 years of experience with static code analysis and software quality. He is the author of numerous articles on writing high-quality code in C++. Andrey Karpov was honored with the Microsoft MVP award in the Developer Technologies category from 2011 to 2021. He is a co-founder of the PVS-Studio project, served for a long time as the company's CTO, and was involved in developing the C++ analyzer core. Andrey is currently responsible for team management, personnel training, and DevRel activities.


