This article contains a very interesting example. The absence of a return statement in a value-returning function leads to undefined behavior. It's a perfect example of how incorrect code can work for many years and then suddenly crash one day.
We'll look at an error pattern that the SEI CERT C++ Coding Standard describes in rule MSC52-CPP, "Value-returning functions must return a value from all exit paths."
The C++ Standard, [stmt.return], paragraph 2 [ISO/IEC 14882-2014], states the following:
Flowing off the end of a function is equivalent to a return with no value; this results in undefined behavior in a value-returning function.
A simple example of code with an error:
int foo(T a, T b)
{
  if (a < b)
    return -1;
  else if (a > b)
    return 1;
}
The developer forgot to write return 0 for the case when the two values are equal. Not all execution paths return a value, and this leads to undefined behavior.
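For comparison, here is a minimal sketch of the same function with a return on every path. The template form is an assumption made only so that the fragment compiles on its own; T is taken to be any type that supports operator< and operator>.

```cpp
// Sketch of the corrected function: every execution path returns a value.
// The template wrapper is an assumption, added so the fragment is self-contained.
template <typename T>
int foo(T a, T b)
{
  if (a < b)
    return -1;
  else if (a > b)
    return 1;
  return 0;  // the equal case that the original version forgot
}
```

With this version, foo(3, 3) returns 0 instead of flowing off the end of the function.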
I think everything's clear here. It's a well-known error pattern. We often find this error with the V591 diagnostic in various open-source projects. You can see the examples here.
Well, if everything's clear and errors are found, why did I write this article? Here comes the fun part!
The problem is that developers often understand undefined behavior a bit differently from what it really is. Undefined behavior caused by a forgotten return statement is often interpreted like this: the function returns a random value. Moreover, the developer's previous experience may confirm this.
Wrong. Undefined behavior means we cannot predict what's going to happen. Code that worked correctly may suddenly start working in a different way.
To demonstrate this, I'll show you a slightly edited discussion (RU) from the RSDN website.
A funny crash
On Linux with libc-2.33 and GCC 11.1.0, built with -O2 optimization, the following code fragment crashes with SIGSEGV:
#include <string>
#include <iostream>

bool foobar(const std::string &s)
{
  std::string sx = s;
  std::cout << sx << std::endl;
}

int main(int argc, char **argv)
{
  foobar(argv[0]);
  return 0;
}
/home/user/test$ g++ -O2 -std=c++11 ./test.cpp -o ./test && ./test
./test.cpp: In function 'bool foobar(const string&)':
./test.cpp:8:1: warning: no return statement in function returning non-void [-Wreturn-type]
8 | }
| ^
./test
Segmentation fault (core dumped)
If we change bool foobar to void foobar or add return false, the code doesn't crash.
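A minimal sketch of the second fix mentioned above, adding return false. Only the function is shown; the rest of the program is assumed to stay as in the crashing fragment.

```cpp
#include <iostream>
#include <string>

// Same function as in the crashing fragment, but now it returns a value.
bool foobar(const std::string &s)
{
  std::string sx = s;
  std::cout << sx << std::endl;
  return false;  // the statement whose absence caused the undefined behavior
}
```

GCC already reports the missing return through -Wreturn-type, as the build log above shows. Building with -Werror=return-type should turn that warning into a compilation error, so code like this cannot slip through.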
It also doesn't crash if we use GCC 7.5.0.
By the way, std::string, as it turned out, doesn't affect the situation. The equivalent C-style code, compiled with g++, also crashes.
#include <stdio.h>

int foobar(const char *s)
{
  printf("foobar(%s)\n", s);
}

int main(int argc, char **argv)
{
  foobar(argv[0]);
  return 0;
}
/home/user/test$ g++ -O2 ./test.c -o ./test && ./test
./test.c: In function 'int foobar(const char*)':
./test.c:6:1: warning: no return statement in function returning non-void [-Wreturn-type]
6 | }
| ^
foobar(./test)
Segmentation fault (core dumped)
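If we compile the same file as C instead (gcc -O2 ./test.c -o ./test && ./test), everything's fine. In C, unlike C++, reaching the end of a value-returning function is undefined only when the caller uses the result, and main ignores it here.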
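The reason for the crash with g++: the compiler simply doesn't generate the instruction for returning from the function (ret)! After the call to printf there is only padding, so execution runs on into whatever code comes next, in this case __libc_csu_init: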
0000000000001150 <_Z6foobarPKc>:
1150: 48 89 fe mov rsi,rdi
1153: 48 83 ec 08 sub rsp,0x8
1157: 48 8d 3d a6 0e 00 00 lea rdi,[rip+0xea6] # 2004 <_IO_stdin_used+0x4>
115e: 31 c0 xor eax,eax
1160: e8 cb fe ff ff call 1030 <printf@plt>
1165: 66 2e 0f 1f 84 00 00 00 00 00 cs nop WORD PTR [rax+rax*1+0x0]
116f: 90 nop
0000000000001170 <__libc_csu_init>:
1170: f3 0f 1e fa endbr64
1174: 41 57 push r15
Thanks to the ononim user from the RSDN website for a very entertaining example.
A very unusual example of undefined behavior.
What conclusions can be drawn from this? In my opinion there are two of them: