We invite you to read the fourth part of an e-book on undefined behavior. This is not a textbook: it's intended for those who are already familiar with C++ programming. It's a kind of guide for C++ programmers to undefined behavior and its most secret and exotic corners. The book was written by Dmitry Sviridkin and edited by Andrey Karpov.
C++11 has given us lambda functions, as well as a new way to implicitly get dangling references.
A lambda function that captures something by reference is safe as long as it is not returned anywhere outside the creation scope. As soon as we return or save a lambda somewhere, the real fun begins:
auto make_add_n(int n) {
return [&](int x) {
return x + n; // n becomes a dangling reference!
};
}
...
auto add5 = make_add_n(5);
std::cout << add5(5); // UB!
This isn't groundbreaking: it's essentially the same issue as returning a reference to a local variable from a function. Clang can sometimes issue a warning.
The code above is compiled using GCC 14.1 (-O3 -std=c++20). It outputs a value of 5.
If we build the code using Clang 18.1 (-O3 -std=c++20), the result is 1711411576. Here's the warning:
<source>:5:13: warning:
address of stack memory associated with parameter 'n' returned
5 | return [&](int x) {
| ^
<source>:6:20: note: implicitly captured by reference due to use here
5 | return [&](int x) {
| ~
6 | return x + n;
| ^
However, once we take the make_add_n argument by reference:
auto make_add_n(const int &n) {
return [&](int x) {
return x + n; // n becomes a dangling reference!
};
}
Both compilers stay silent.
We can create a similar issue for member functions:
struct Task {
int id;
std::function<void()> GetNotifier() {
return [this]{
// this may become a dangling reference!
std::cout << "notify " << id << "\n";
};
}
};
int main() {
auto notify = Task { 5 }.GetNotifier();
notify(); // UB!
}
However, in this example, we can see this in the capture list and, naturally, get a bit concerned. Before C++20, you could shoot yourself in the foot a little less explicitly:
struct Task {
int id;
std::function<void()> GetNotifier() {
return [=]{
// 'this' may become a dangling reference!
std::cout << "notify " << id << "\n";
};
}
};
The = symbol means capturing everything the lambda uses by value. However, it's not the id data member that gets captured, but the this pointer.
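One way out (a sketch): capture only the value you actually need with an init-capture, or, since C++17, capture a copy of the whole object with [*this]:
struct Task {
    int id;
    std::function<void()> GetNotifier() {
        return [id = id] {   // copy only the data member; no 'this' captured
            std::cout << "notify " << id << "\n";
        };
    }
};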
If you see a lambda that has this, = (up to C++20), or & in its capture list, be sure to check how and where that lambda is used. You can also add ref-qualified overloads that make it harder to outlive the captured object:
struct Task {
int id;
std::function<void()> GetNotifier() && = delete;
std::function<void()> GetNotifier() & {
return [this]{
// it's harder for this to become a dangling reference now
std::cout << "notify " << id << "\n";
};
}
};
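With the rvalue overload deleted, the dangerous call on a temporary no longer compiles:
// auto notify = Task{ 5 }.GetNotifier(); // error: the && overload is deleted
Task t{ 5 };
auto notify = t.GetNotifier(); // fine as long as t outlives notify
notify();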
If possible, it's better to use capture by value or move initialization capture instead of capture by reference.
auto make_greeting(std::string msg) {
return [message = std::move(msg)] (const std::string& name) {
std::cout << message << name << "\n";
};
}
...
auto greeting = make_greeting("hello, ");
greeting("world");
Since C++11, there's a wonderful std::tuple class template in the standard library. This is a tuple, a heterogeneous list, both great and helpful. Except that creating a tuple without breaking anything while still getting exactly what you wanted is a really non-trivial task.
Explicitly specifying element types of a very long container isn't a fun thing to do.
C++11 has given us three ways to save on type specification, three different functions for creating tuples: std::make_tuple, std::tie, and std::forward_as_tuple.
C++17 also makes it possible to use class template argument deduction and just write this:
auto t = tuple { 1, "string", 1.f };
All this variety lets us fine-tune exactly which types end up in the tuple elements: references or not. It also makes it easy to make a mistake and run into a lifetime issue.
The std::make_tuple function template removes references, decays arrays to pointers, and removes const. Basically, it applies std::decay_t.
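A quick illustration of that decay:
const int n = 5;
char arr[] = "hi";
auto t = std::make_tuple(n, arr, 1.f);
static_assert(std::is_same_v<decltype(t),
                             std::tuple<int, char*, float>>);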
There's a special case, as usual, born out of good intentions.
If the type of the make_tuple argument is std::reference_wrapper<T>, it's converted to T& in the tuple:
int x = 5;
float y = 6;
auto t = std::make_tuple(std::ref(x),
std::cref(y),
"hello");
static_assert(std::is_same_v<decltype(t), // The code compiles
std::tuple<int&,
const float&,
const char*>>);
Class template argument deduction doesn't handle the std::reference_wrapper special case, although the decay still occurs. The following code also compiles:
int x = 5;
float y = 6;
auto t = std::tuple(std::ref(x), std::cref(y), "hello");
static_assert(std::is_same_v<decltype(t),
std::tuple<std::reference_wrapper<int>,
std::reference_wrapper<const float>,
const char*>>);
The std::forward_as_tuple function always constructs a tuple of references. So, you can get a reference to a dead temporary object:
int x = 5;
auto t = std::forward_as_tuple(x, 6.f, std::move("hello"));
static_assert(
std::is_same_v<
decltype(t),
std::tuple<int&,
float&&,
const char (&&) [6]>>); // This is the rvalue reference
// to an array
std::get<1>(t); // UB!
The std::tie function constructs a tuple only from lvalue references. It's harder to fail with it, but still possible if you want to return the resulting tuple from a function. However, this case is quite similar to the cases of returning any references from functions:
template <class... T>
auto tie_consts(const T&... args) {
return std::tie(args...);
}
int main(int argc, char **argv) {
auto t = tie_consts(1, 1.f, "hello");
static_assert(std::is_same_v<decltype(t),
std::tuple<const int&,
const float&,
const char (&)[6]>>);
std::cout << std::get<1>(t) << "\n"; // UB
}
Here are some general recommendations:
1. To create result tuples, use make_tuple with explicit cref/ref or use a constructor if references aren't needed.
2. Use std::tie only to temporarily represent a set of variables as a tuple:
std::tie(it, inserted) = map.insert({x, y}); // tuple unpacking
std::tie(x1, y1, z1) == std::tie(x2, y2, z2); // component-wise comparison
3. Use std::forward_as_tuple only when passing arguments. Don't save the result tuple anywhere.
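For example, forward_as_tuple is the right tool for piecewise construction, where the reference tuples only live for the duration of the call (a small sketch):
std::map<int, std::string> m;
m.emplace(std::piecewise_construct,
          std::forward_as_tuple(1),        // arguments for the key
          std::forward_as_tuple(5, 'x'));  // arguments for the value: string(5, 'x')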
Here's also a bonus.
Die-hard Python fans may want to try using std::tie to swap variable values:
// x, y = y, x
int x = 5;
int y = 3;
std::tie(x, y) = std::tie(y, x);
std::cout << x << " " << y;
This isn't Python, though. The behavior of this code is not defined, but cheer up: it's merely unspecified. As a result, you get either 5 5 or 3 3.
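If you really want a one-liner swap, there are well-defined options:
int x = 5;
int y = 3;
std::swap(x, y);                          // the obvious way
std::tie(x, y) = std::make_tuple(y, x);   // also fine: the right-hand side
                                          // copies the values first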
It was a warm spring day. Sipping my tea, I slowly and lazily flipped through student projects. I could've said that nothing seemed to be going wrong, but unfortunately, the assignments were written in C++.
Suddenly, I noticed an innocuous line used for diagnostic logging:
printf("from %s -- to %s",
storage[from].value.c_str(), storage[to].value.c_str());
There's nothing wrong with it, right? Yet at that moment I was overcome with terror. Let me share it with you now.
In this line lurks an incredible opportunity for bugs, unexpected crashes, and undefined behavior!
Every line of code in C++ is highly dependent on its context. What can we assume just by looking at this printf?
Good. Now meet the bad part: the last assumption could be accidentally violated anytime in the future life of the codebase. And violating this assumption leads to the most surprising consequences! They'll be all the more surprising if this printf is hidden under a macro and exists only with specific compilation options. For example, if the maximum logging level is set at compile time.
If from or to isn't among the storage keys, everything depends on how storage handles access to a missing key. To figure that out, we need to see what type storage has:
Is that it? Aren't we forgetting something?
We forgot that it's not limited to standard STL containers. Containers can be from other libraries, too. This is very common when it comes to associative containers. Due to the standard requirements for stability of element references and iteration guarantees, the std::unordered_map class can't be implemented efficiently. It's cache-unfriendly and almost always loses in benchmarks. So, real-world applications often use alternative implementations that neglect one or another of the guarantees.
A popular option is the family of "flat" hash tables with open addressing. All elements are stored in one continuous memory section. Obviously, if there's no room for a new element in this section, the memory should be reallocated to insert it. And the stability of references to elements is out of the question.
Now let's get back to our code snippet again:
printf("from %s -- to %s",
storage[from].value.c_str(), storage[to].value.c_str());
If storage is a hash table with behavior and interface similar to the standard one (e.g., absl::flat_hash_map), a call via operator[] modifies the container. Depending on how full the table is and on which keys are present, different scenarios await us, but they all boil down to one question: which key access will cause the table to be reallocated?
Don't rush to answer "from" or "to", though, because the order in which function arguments are evaluated isn't specified! The keys can be accessed in ANY order! That only adds spice to the bug investigation if you ever run into it at work.
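A tiny standalone illustration of unspecified evaluation order (counter, f, and g are made-up names for this sketch):
int counter = 0;
auto f = [&] { return ++counter; };
auto g = [&] { return ++counter; };
printf("%d %d\n", f(), g()); // may print "1 2" or "2 1"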
But let's assume that in our case from is accessed first, and then to.
The option of both keys missing is equivalent to the option of only the to key missing. So, let's keep it that way.
auto& from_value = storage[from].value; // (1)
auto& to_value = storage[to].value; // (2)
Is this a victory? Do we have the bug all sorted out?
Actually, no. The call to c_str(), present in the original line, was deliberately omitted above. It's exactly what let the bug go unnoticed without breaking our tests! And it's all thanks to SSO, small string optimization.
If storage[from].value is of the std::string type, then most modern implementations would only experience the dreaded crash when using short strings!
Simplified std::string looks like this:
class string {
char* data;
size_t data_size;
size_t capacity_size;
const char* c_str() const { return data; }
};
That's 3 * 8 bytes on a 64-bit platform, and the character data lives on the heap. This is an incredible waste if the string is very short (0-15 characters)! So, with enough effort and persistence, it's possible to use a union to make the structure look something like this for short strings:
class string
{
size_t capacity;
union
{
struct
{
char *ptr;
size_t size;
} heapbuf;
char stackbuf[sizeof(heapbuf)];
};
const char* c_str() const {
if (capacity > sizeof(heapbuf))
return heapbuf.ptr;
else
return stackbuf;
}
};
In the new implementation, short strings fit into stackbuf inside the object itself, without allocating a buffer on the heap. The capacity data member determines where the characters are stored.
Once again, we're back to extracting strings:
const char* from_value = storage[from].value.c_str(); // (1)
const char* to_value = storage[to].value.c_str();
Is the pointer at (1) pointing to data on the heap or inside the string object itself? Who knows, really!
If from_value points to the heap, and the container makes good use of move semantics, then the string is moved during rehashing: its data pointer is simply carried over, and from_value remains valid.
Otherwise, the string is copied, and the new storage[from].value.c_str() almost certainly no longer equals from_value, which is now dangling.
Although there's a slim chance that the reallocation was implemented via realloc and we got miraculously lucky: realloc merely extended the block in place.
What conclusions can we draw from all this?
I don't know of any static analyzer settings that would help here. Only a thorough testing process can reveal such bugs.
Andrey Karpov:
— I've written out some ideas on how to enhance PVS-Studio to detect errors of the described type. However, it's clear that the diagnostic rule would be inefficient, since it would work only for a limited set of simple cases. We need to know the exact value of the item we are looking for and what the container is filled with. This is a very challenging task for static analyzers, both in terms of analyzing the data flow and the computational cost of such an analysis. So, I agree with Dmitry that a programmer should rely only on their own vigilance when writing and testing code.
We can prevent such bugs by changing the way we write code. I strongly advise you to try programming in Rust (even if you won't use it in your work project) to develop the habit of writing code that meets the requirements of its borrow checker.
In C++ code, if we could guarantee that a container and the data in it are reachable either through const references only, or through no more than one mutable reference at a time, the error would become almost impossible. We can't guarantee that, though. However, we can at least restrict ourselves to const references:
const auto& const_storage = storage;
// operator[] unavailable due to const
const auto& from_value = const_storage.at(from).value;
// operator[] unavailable due to const
const auto& to_value = const_storage.at(to).value;
// If any of the keys are missing, an exception is thrown
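With both lookups done through the const reference, and no container modification in between, the logging line itself becomes boring again:
printf("from %s -- to %s", from_value.c_str(), to_value.c_str());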
We really like generic code here in C++. And not only in C++. It's convenient, reusable, and flexible. That's what templates are for!
Let's write some generic code:
template <class T>
auto pop_last(std::vector<T>& v) {
assert(!v.empty());
auto last = std::move(v.back());
v.pop_back();
return last;
}
It's quite wise to have such a function, because the existing pop_back returns void in many containers. In reality, this is very inconvenient because most of the time we want to take out the last container element and do something with it, not just throw it away.
Is this function okay? Of course, there will be undefined behavior on an empty vector, but we've written assert, so it's up to the user to handle that. Just make sure to write the correct code and don't write the incorrect one... Exception guarantees also raise questions, as the standard pop_back() doesn't return anything because of them. However, this is a topic for another chapter. And everything else seems fine, right?
Well, let's use this function!
std::vector<bool> v(65, true);
auto last = pop_last(v);
std::cout << last;
Is everything okay? Looks like it is. There's no crash. We can use different compilers to check it. Is there really no catch?
Actually, there is one. The number 65 was chosen for a reason, and most likely (it depends on the implementation) there's undefined behavior in the code that doesn't show up in any way, because that's how destructors of trivial types work. Well, one thing at a time.
We won't go into detail about different design patterns. There are good books for that. All in all, a Proxy is an object that intercepts calls to another object with the same (or similar) interface to do something. What exactly it'll do depends on the particular task and implementation.
The C++ standard library contains a variety of proxy objects (sometimes not pure proxies but with additional features):
The last one's the one we need.
Back in C++98, the committee made a terrible decision that seemed reasonable at the time: it specialized std::vector for bool. Normally, sizeof(bool) == sizeof(char), but a single bit is enough to store a bool, and 99.99% of all platforms can't address memory one bit at a time. So let's pack the bits: vector<bool> stores CHAR_BIT (usually 8) boolean values per byte for more efficient memory utilization.
As a result, one needs to work with std::vector<bool> in a very special way:
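For instance, indexing it doesn't give you a real bool reference (a quick check):
std::vector<bool> v{true, false};
static_assert(!std::is_same_v<decltype(v[0]), bool&>);  // not a real reference
static_assert(std::is_same_v<decltype(v[0]),
                             std::vector<bool>::reference>);  // a proxy object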
The reference type for vector<bool> looks like this:
class reference {
public:
operator bool() const { return (*concrete_byte_ptr) & (1 << bitno); }
reference& operator=(bool) {...}
....
private:
uint8_t* concrete_byte_ptr;
uint8_t bitno;
};
In the following line:
auto last = std::move(v.back());
auto type deduction drops references, but only built-in C++ references: T& and T&& turn into plain T. The vector<bool>::reference proxy doesn't turn into bool by itself, even though it has an implicit operator bool!
So, this is what we have:
auto pop_last(std::vector<bool>& v) {
// v.size() == 65
auto last = std::move(v.back());
// last is vector<bool>::reference; != bool
v.pop_back();
// v.size() == 64
// We threw away the last uint8/uint32/uint64 word
// (implementation-dependent) of the vector's storage.
// last still refers to the discarded element.
// If vector<bool> called the (pseudo)destructor for that word
// when discarding it, then accessing the element via last
// violates the C++ object model (access to a destroyed object) -> UB.
return last;
}
However, we didn't notice it when running the code because of the following things:
If we get an element from pop_last(), save it, and do something else to the vector that causes the buffer to be reallocated, then UB will emerge.
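Here's a sketch of that scenario, reusing the pop_last above; an address sanitizer has a good chance of flagging the final read:
std::vector<bool> v(65, true);
auto last = pop_last(v);   // a proxy pointing into v's current buffer
v.resize(10'000);          // almost certainly reallocates the buffer
std::cout << last;         // reads freed memory -> UB
And even without any reallocation, the proxy's reference semantics alone can surprise you: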
int main() {
std::vector<bool> v;
v.push_back(false);
std::cout << v[0] << " ";
const auto b = v[0];
auto c = b;
c = true;
std::cout << c << " " << b;
}
The code outputs 0 1 1. Despite const, the b value has changed. Well, this is obvious, isn't it? Since b isn't a reference, but an object that behaves like a reference!
This code becomes even more surprising and interesting in C++23: if the cppreference authors didn't make a mistake when copying the updates from the standard, a const-qualified assignment operator awaits us there. We can even do this:
int main() {
std::vector<bool> v;
v.push_back(false);
std::cout << v[0] << "\t"; // 0
const auto b = v[0];
b = true;
std::cout << v[0]; // 1
}
This behavior is perfectly defined, but it may be unexpected if you're writing general-purpose template code. Experienced C++ programmers are cautious about the explicit use of vector<bool>... But do they always check, in a template function that accepts vector<T>, that T != bool? They probably almost never do (unless they're writing a public library).
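If you do want such a check, a minimal sketch might look like this:
template <class T>
auto pop_last(std::vector<T>& v) {
    static_assert(!std::is_same_v<T, bool>,
                  "vector<bool> hands out proxies, not references; "
                  "pop_last would return a proxy into the vector");
    assert(!v.empty());
    auto last = std::move(v.back());
    v.pop_back();
    return last;
}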
All right, we're done with the vector. Everything else is fine, right?
Sure!
Let's take a completely innocent function (thanks @sgshulman for the example):
template <class T>
T sum(T a, T b)
{
T res;
res = a + b;
return res;
}
And we accidentally put... that's right, some proxy type in there (what could it be?):
std::vector<bool> v{true, false};
std::cout << sum(v[0], v[1]) << std::endl;
If we're lucky, we get a compilation error. For example, in the MSVC implementation, vector<bool>::reference has no default constructor. GCC and Clang may compile something that crashes with a memory access error: the default-constructed proxy T res refers to no vector element at all.
We should also note how unexpectedly the implicit conversion operators work here! After all, there's no operator+ defined for vector<bool>::reference, and return a + b; wouldn't even compile, since there's no way to construct the proxy from the result. In res = a + b;, a and b are converted to bool, promoted to int, summed, and the result is then squeezed back through the proxy's assignment operator.
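By the way, forcing plain values early sidesteps the broken proxy here (a sketch):
std::vector<bool> v{true, false};
std::cout << sum<bool>(v[0], v[1]) << std::endl; // T = bool: the proxies
                                                 // convert to bool on the way in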
The std::vector<bool> is the best-known example of an object that generates a proxy. You can always write your own class and, if it emulates the trivial type behavior, it may amuse someone (your colleagues, for example).
The standard may allow returning a proxy for other types and operations. Standard library developers may leverage this, or they may not. Either way, we can accidentally or intentionally write code that behaves differently depending on the library version. For example, according to the documentation, operator* for std::valarray in libstdc++ 12.1 and in Visual Studio 2022 returns different types.
Proxy objects can also be used in third-party libraries, where, of course, their use may vary from version to version.
For example, proxy objects are used for matrix operations in the Eigen library. The product of two matrices isn't a matrix but a special proxy object called Eigen::Product. The matrix transpose returns Eigen::Transpose. Many other operations also create proxy objects. So, if the following was working fine in one version:
const auto c = op(a, b);
b = d;
do_something(c);
Then it could easily break when you get an update. What if op now returns a lazy proxy, and the next line modifies one of its arguments?
There's no way to fully protect yourself from this in C++. The only thing you can do is be careful. Also, be careful when describing the constraints imposed on types in templates (preferably as C++20 concepts).
If you're designing a library, think twice about adding implicit proxies to the public API. If you really want to add them, you may want to consider whether you can do it without the implicit conversions. A lot of the issues we've covered here arise from implicit conversions. Maybe it'd be better to make the API a little wordier and less user-friendly but still secure?
If you're using a library, it may be better to explicitly specify the variable type. If you want to use bool, then specify it. Do you want a vector element type? Specify vector<T>::value_type. The auto keyword is very handy, but only if you know what you're doing.
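For instance:
std::vector<bool> v{true, false};
auto a = v.back();                            // vector<bool>::reference proxy
std::vector<bool>::value_type b = v.back();   // plain bool, detached from v
bool c = v.back();                            // also plain bool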
Move semantics, introduced in C++11, is an important and necessary feature: it lets you write higher-performance code without unnecessary copies, allocations, and deallocations, and it explicitly declares the intention to transfer ownership of a resource from one place to another. Just like in Rust, which has been a Stack Overflow favorite for many years. However, there are still some differences.
An applicant will almost certainly be asked about move semantics in any serious job interview. A good candidate will explain, using a vector as an example, how one object can take the guts of another. These &&'s are just a syntactic workaround: const& can bind to a temporary object, but then we can't modify anything through it; plain & can't bind to a temporary at all; and passing by value drags in the copy constructor... Anyway, things happen. Eventually, you and the applicant may end up writing a simple unique_ptr to demonstrate in code how exactly to steal pointers from one object to another. In theory, this should be enough 99% of the time.
Meanwhile, in the real world, you come across that intriguing 1%. We'll discuss those next.
Even though move semantics is quite efficient in C++, it's still not perfect. The developers tacked it on as a nice workaround but left a significant issue unresolved.
Let's take a look at a simple unique_ptr:
template<class T>
class UniquePtr {
public:
explicit UniquePtr(T* raw) : _ptr {raw} {}
UniquePtr() = default;
~UniquePtr() {
delete _ptr;
}
UniquePtr(const UniquePtr&) = delete;
UniquePtr(UniquePtr&& other) noexcept :
_ptr { std::exchange(other._ptr, nullptr) } {}
UniquePtr& operator=(const UniquePtr&) = delete;
UniquePtr& operator=(UniquePtr&& other) noexcept {
UniquePtr tmp(std::move(other));
std::swap(this->_ptr, tmp._ptr);
return *this;
}
private:
T* _ptr = nullptr;
};
....
UniquePtr<MyType> uptr = ...;
....
// something important is going on with uptr
....
UniquePtr<MyType> b = std::move(uptr);
// nothing stops us from doing
// uptr = fun(); here
As we know, std::move doesn't move anything. It simply performs a reference conversion so that the correct rvalue-reference overload of a constructor or assignment operator is selected. The original object that was moved from doesn't go anywhere (unlike in Rust, where the object becomes unavailable after the move). Its destructor will still be called at some point, so we need to leave it in a state that the destructor can handle. Let's leave nullptr in UniquePtr, just like the move constructor does.
However, what happens in the move assignment operator?
UniquePtr& operator=(UniquePtr&& other) noexcept {
UniquePtr tmp(std::move(other));
std::swap(this->_ptr, tmp._ptr);
return *this;
}
It uses move(copy)-and-swap for some reason... Well, there's a reason: we probably want to destroy the old owned object (the T, not the pointer) and take ownership of the new one. Or do we? If not, why don't we implement the move assignment operator like this?
UniquePtr& operator=(UniquePtr&& other) noexcept {
std::swap(this->_ptr, other._ptr);
return *this;
}
From the point of view of C++ move semantics, this is perfectly fine!
However, such behavior would be unexpected for a UniquePtr, to say the least. So, the standard std::unique_ptr does zero out the source pointer, and the same is true for std::shared_ptr and std::weak_ptr. The standard guarantees that...
So, here lies the main trap: while the empty moved-out state for smart pointers is guaranteed, this actually isn't true for other classes from the standard library (and not only the standard library)! Not true at all!
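For the standard smart pointers themselves, you really can rely on the empty state (a quick sanity check):
auto p = std::make_unique<int>(42);
std::unique_ptr<int> q;
q = std::move(p);
assert(p == nullptr); // guaranteed by the standard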
The move assignment behavior for a vector is described in a rather intricate way, and it depends on a detail that only those who have actually had to configure it will remember: the allocator.
There's an allocator object hidden in each instance of std::vector. This can be either the default (std::allocator) empty object that uses global malloc/operator new, or something more specific. For example, you may want each of your vectors to use its own unique, pre-allocated portion of a large buffer that's completely under your control.
The standard library asks the allocator type to define the propagate_on_container_move_assignment property, which affects how move assignment behaves. If we write A = std::move(B), there are three options:
1. The property is std::true_type: A releases its own memory and steals both B's buffer and B's allocator.
2. The property is std::false_type, but the allocators of A and B compare equal: A releases its own memory and steals B's buffer anyway.
3. The property is std::false_type and the allocators aren't equal: A can't take ownership of B's memory, so it moves the elements one by one into its own storage; what's left in B is up to the implementation.
In the third case, the libc++ implementation doesn't leave the source vector empty, while libstdc++ calls clear() on it.
Here's an example that shows this:
template <class T>
struct MyAlloc {
using value_type = T;
using size_type = size_t;
using difference_type = ptrdiff_t;
using propagate_on_container_move_assignment = std::false_type;
T* allocate(size_t n) {
return static_cast<T*>(malloc(n * sizeof(T)));
}
void deallocate(T* ptr, size_t n) {
free(static_cast<void*>(ptr));
}
using is_always_equal = std::false_type;
bool operator == (const MyAlloc&) const {
return false;
}
};
int main() {
using VectorString = std::vector<std::string, MyAlloc<std::string>>;
{
VectorString v = {
"hello", "world", "my"
};
VectorString vv = std::move(v);
std::cout << v.size() << "\n";
// outputs 0. It was a move construction
}
{
VectorString v = {
"hello", "world", "my"
};
VectorString vv;
vv = std::move(v);
std::cout << v.size() << "\n";
// outputs 3. It was a move assignment
for (auto& x : v) {
// every element has been moved, there's nothing here
std::cout << x;
}
}
}
Let's compile and run it.
Note that only move assignment has the issue! This is also a great example of how splitting a variable's declaration from its initialization can change a C++ program's behavior!
By the way, the vector's elements were strings, and the last loop walks over the moved-out strings!
The state of a moved-out string is also unspecified.
On various resources dedicated to C++, you may find an example that shows unexpected results when the code is compiled using the old Clang 3.7 with libc++:
void g(std::string v) {
std::cout << v << std::endl;
}
void f() {
std::string s;
for (unsigned i = 0; i < 10; ++i) {
s.append(1, static_cast<char>('0' + i));
g(std::move(s));
}
}
Since C++11, the string implementations shipped with the three major compilers use SSO (small string optimization). With SSO, short strings aren't stored on the heap but inside the string object itself, in a union overlapping the heap pointer. Copying such strings is trivial, and trivial data (primitives, structs of primitives) is "moved" by simply copying it. In modern versions of GCC and Clang, with both libc++ and libstdc++, the string is left empty after a move, but we shouldn't rely on it.
There are four levels associated with the moved-out object state guarantees:
Read the documentation before reusing an unfamiliar moved-out object! Better yet, avoid reusing it at all. Many static analyzers can issue a warning if you attempt to access a moved-out object in a function after calling std::move on it.
Also, when implementing the move operator, use the move_and_swap pattern (as demonstrated with UniquePtr at the beginning), so you have a better chance of leaving your objects in a truly empty state without much effort.
Extending the lifetime of temporary objects is a broad topic. It's come up more than once in this series of notes. After all, the feature works in a fairly limited number of cases, and more often than not you can get a dangling reference. In this section, however, I want to focus on a less obvious case with not-so-expected consequences.
In C++, when a temporary object is bound directly to a const lvalue reference or to an rvalue reference (and only at that first, direct binding), its lifetime is extended to match the lifetime of the reference:
std::string get_string();
void run(const std::string&);
int main() {
const std::string& s1 = get_string();
run(s1); // ok, the reference is valid
std::string&& s2 = get_string();
run(s2); // ok, the reference is valid
// but
std::string&& s3 = std::move(get_string()); // the reference is
// no longer valid!
// the temporary is first bound to the reference parameter of std::move,
// so no lifetime extension happens: it dies at the end of the full
// expression, as with any other function that takes and returns
// a reference (std::move is just an example of such a function)
}
Here's a slightly less obvious feature: this effect applies not only to a reference to the temporary object itself, but also to a reference to any of its subobjects!
#include <iostream>
#include <string>
#include <vector>
struct User {
std::string name;
std::vector<int> tokens;
};
User get_user() {
return {
"Dmitry",
{1,2,3,4,5}
};
}
int main() {
std::string&& name = get_user().name;
// some hacky address arithmetics:
// User is alive, we can access data in it!
// Build with -fsanitize=address to ensure!
auto& v = *(std::vector<int>*)((char*)(&name) + sizeof(std::string));
for (int x : v) {
std::cout << x;
}
}
The code above outputs the contents of the tokens vector from the User object. And there's nothing wrong with it: no dangling references or use-after-free. The reference to a data member extends the lifetime of the whole object. It can be a reference to any nested data member:
struct Name {
std::string name;
};
struct User {
Name name;
std::vector<int> tokens;
};
....
int main() {
std::string&& name = get_user().name.name;
....
}
Nested data members can even sit inside arrays! However, the array has to be a good old C-style one (T array[N]).
struct Name {
std::string name;
};
struct User {
Name name[2];
std::vector<int> tokens;
};
User get_user() {
return {
{ "Dmitry", "Dmitry" },
{1,2,3,4,5}
};
}
int main() {
std::string&& name = get_user().name[1].name;
...
}
This trick won't work with std::array because of the overloaded operator []:
error: rvalue reference to type 'basic_string<...>' cannot bind to lvalue of type 'basic_string<...>'
23 | std::string&& name = get_user().name[1].name;
Replacing the std::string&& name rvalue reference with const std::string& name makes the code compile, and then crash with the expected stack-use-after-scope:
....
struct User {
std::array<Name, 2> name;
std::vector<int> tokens;
};
....
int main() {
const std::string& name = get_user().name[1].name;
std::cout << name << "\n";
}
Here's the run result:
Program returned: 1
==1==ERROR: AddressSanitizer:
stack-use-after-scope on address 0x7e6806200040 at
pc 0x5b1ce93dcf19 bp 0x7ffdc59e7770 sp 0x7ffdc59e7768
READ of size 8 at 0x7e6806200040 thread T0
Great! However, an inquisitive reader has probably already guessed what the issue is. We take a reference to only one data member, and we're likely to work only with that data member. The whole object, however, stays alive... What if its other data members hold allocated memory or other resources? What if we're counting on their destructors running right away?
To illustrate the issue, I'll give an example not in C++ but in Rust, where both a type that provokes the problem and a beautifully broken syntactic construct come practically out of the box.
use parking_lot::Mutex;
#[derive(Default, Debug)]
struct State {
value: u64,
}
impl State {
fn is_even(&self) -> bool {
self.value % 2 == 0
}
fn increment(&mut self) {
self.value += 1
}
}
fn main() {
let s: Mutex<State> = Default::default();
match s.lock().is_even() {
true => {
s.lock().increment(); // oops, double lock!
}
false => {
println!("wasn't even");
}
}
dbg!(&s.lock());
}
This example leads to a deadlock: the temporary lock guard created in the match scrutinee absurdly stays alive for the entire match statement! You can learn more about it here. Now let's get back to C++.
If, for some reason, we decide to follow Rust's example and explicitly associate the mutex with data (as it should be 95% of the time), we get the same issue with careless reference usage:
template <class T>
struct Mutex {
T data;
std::mutex _mutex;
explicit Mutex(T data) : data {data} {}
auto lock() {
struct LockGuard {
public:
LockGuard(T& data,
std::unique_lock<std::mutex>&& guard) :
data(data), guard(std::move(guard)) {}
std::reference_wrapper<T> data;
private:
std::unique_lock<std::mutex> guard;
};
return LockGuard(this->data, std::unique_lock{_mutex});
}
};
int main() {
Mutex<int> m {15};
// double lock (deadlock, ub) due to LockGuard
// lifetime extension, remove && and it will be fine
auto&& data = m.lock().data;
std::cout << data.get() << "\n";
auto&& data2 = m.lock().data;
std::cout << data2.get() << "\n";
}
Seasoned C++ advocates may say, "You're your own worst enemy. Why bind a reference when there's a reference_wrapper right there?" And they'd be right, of course. Don't worry, though: C++23 now has its own beautifully broken construct, just like match in Rust. It's... range-based for!
Most surprisingly, the standard has introduced changes to fix the dangling reference in the construct:
for (auto item : get_object().get_container()) { ... }
These very changes now make it possible to get into the exact same deadlock as in Rust:
template <class T>
struct Mutex {
T data;
std::mutex _mutex;
explicit Mutex(T data) : data {data} {}
auto lock() {
struct LockGuard {
public:
LockGuard(T& data,
std::unique_lock<std::mutex>&& guard) :
data(data), guard(std::move(guard)) {}
std::reference_wrapper<T> data;
T& get() const {
return data.get();
}
private:
std::unique_lock<std::mutex> guard;
};
return LockGuard(this->data, std::unique_lock{_mutex});
}
};
struct User {
std::vector<int> _tokens;
std::vector<int> tokens() const {
return this->_tokens;
}
};
int main() {
Mutex<User> m { { {1,2,3, 4,5} } };
for (auto token: m.lock().get().tokens()) {
std::cout << token << "\n";
m.lock(); // deadlock C++23
}
}
The best part about all this is that, currently, this "fixed" behavior hasn't yet been implemented in mainstream compilers. Soon, however, in about five years, when you update them, many amazing discoveries may await you!
Author: Dmitry Sviridkin
Dmitry has over eight years of experience in high-performance software development in C and C++. From 2019 to 2021, he taught Linux system programming at SPbU and hands-on C++ courses at HSE. He currently works on systems and embedded development in Rust and C++ for edge servers as a Software Engineer at AWS (CloudFront). His main area of interest is software security.
Editor: Andrey Karpov
Andrey has over 15 years of experience with static code analysis and software quality. The author of numerous articles on writing high-quality code in C++. Andrey Karpov has been honored with the Microsoft MVP award in the Developer Technologies category from 2011 to 2021. Andrey is a co-founder of the PVS-Studio project. He has long been the company's CTO and was involved in the development of the C++ analyzer core. Andrey is currently responsible for team management, personnel training, and DevRel activities.