V1103. The values of padding bytes are unspecified. Comparing objects with padding using 'memcmp' may lead to unexpected result.
The analyzer has detected a code fragment where objects of a structure type containing padding bytes are compared byte by byte using 'memcmp'.
Take a look at a synthetic example:
#include <cstring>
struct Foo
{
  unsigned char a;
  int i;
};
void bar()
{
  Foo obj1 { 2, 1 };
  Foo obj2 { 2, 1 };
  // Compares all sizeof(Foo) bytes, including the padding bytes
  auto result = std::memcmp(&obj1, &obj2, sizeof(Foo)); // <=
}
Let's consider the memory layout of 'Foo' class objects to understand the core of the issue:
[offset 0] unsigned char
[offset 1] padding byte
[offset 2] padding byte
[offset 3] padding byte
[offset 4] int, first byte
[offset 5] int, second byte
[offset 6] int, third byte
[offset 7] int, fourth byte
To handle objects in memory correctly and efficiently, the compiler applies data alignment. For typical data models, the 'unsigned char' type alignment is 1 and the 'int' type alignment is 4. So, the address of the 'Foo::i' data member should be a multiple of 4. To do this, the compiler adds 3 padding bytes after the 'Foo::a' data member.
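The layout described above can be checked on a given platform instead of being assumed. The following is a minimal sketch: the helper name 'foo_padding_bytes' is hypothetical, and the actual numbers depend on the target's data model.

```cpp
#include <cstddef>   // offsetof, std::size_t

struct Foo
{
  unsigned char a;
  int i;
};

// Hypothetical helper: number of bytes in Foo that belong to no data member.
constexpr std::size_t foo_padding_bytes()
{
  return sizeof(Foo) - sizeof(unsigned char) - sizeof(int);
}

// 'Foo::i' must start at an address that is a multiple of its alignment.
static_assert(offsetof(Foo, i) % alignof(int) == 0,
              "Foo::i is expected to be suitably aligned");
```

On a typical platform where 'int' is 4 bytes with alignment 4, 'offsetof(Foo, i)' is 4, 'sizeof(Foo)' is 8, and 'foo_padding_bytes()' returns 3.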
The C and C++ standards do not specify whether the padding bytes are zeroed out when the object is initialized. Therefore, if you try to compare two objects with the same data member values byte by byte using the 'memcmp' function, the result may not always be 0.
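To make the effect reproducible, the sketch below simulates two object representations in raw byte buffers: the data members hold the same values, but the padding bytes are deliberately filled with different garbage. The names 'fill_representation', 'Comparison', and 'compare_simulated_objects' are hypothetical and exist only for this illustration. A real compiler is not required to produce this garbage, but it is allowed to.

```cpp
#include <cstddef>   // offsetof
#include <cstring>   // std::memset, std::memcpy, std::memcmp

struct Foo
{
  unsigned char a;
  int i;
};

// Hypothetical helper: fills the whole buffer (padding included) with 'fill',
// then writes the member values at their real offsets.
void fill_representation(unsigned char *raw, unsigned char fill,
                         unsigned char a, int i)
{
  std::memset(raw, fill, sizeof(Foo));
  std::memcpy(raw + offsetof(Foo, a), &a, sizeof a);
  std::memcpy(raw + offsetof(Foo, i), &i, sizeof i);
}

struct Comparison
{
  bool members_equal; // result of the member-wise comparison
  bool bytes_equal;   // result of the 'memcmp' comparison
};

Comparison compare_simulated_objects()
{
  unsigned char raw1[sizeof(Foo)];
  unsigned char raw2[sizeof(Foo)];
  fill_representation(raw1, 0x00, 2, 1); // padding bytes are 0x00
  fill_representation(raw2, 0xFF, 2, 1); // padding bytes are 0xFF

  // Foo is trivially copyable, so it can be restored from its bytes.
  Foo obj1, obj2;
  std::memcpy(&obj1, raw1, sizeof(Foo));
  std::memcpy(&obj2, raw2, sizeof(Foo));

  return { obj1.a == obj2.a && obj1.i == obj2.i,
           std::memcmp(raw1, raw2, sizeof(Foo)) == 0 };
}
```

On a platform where 'Foo' contains padding, 'members_equal' is true while 'bytes_equal' is false, which is exactly the mismatch the diagnostic warns about.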
There are several ways to fix the issue.
Method N1 (preferred). Write a comparator and use it to compare objects.
For C:
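#include <stdbool.h>
struct Foo
{
  unsigned char a;
  int i;
};
/* Compares the data members only, so padding bytes are never inspected */
bool Foo_eq(const struct Foo *lhs, const struct Foo *rhs)
{
  return lhs->a == rhs->a && lhs->i == rhs->i;
}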
For C++:
struct Foo
{
  unsigned char a;
  int i;
};
bool operator==(const Foo &lhs, const Foo &rhs) noexcept
{
  return lhs.a == rhs.a && lhs.i == rhs.i;
}
bool operator!=(const Foo &lhs, const Foo &rhs) noexcept
{
  return !(lhs == rhs);
}
Starting with C++20, we can simplify the code by requesting the compiler to generate the comparator itself:
struct Foo
{
  unsigned char a;
  int i;
  // A defaulted operator== must have the declared return type 'bool'
  bool operator==(const Foo &) const noexcept = default;
};
Method N2. Zero out objects beforehand.
#include <cstring>
struct Foo
{
  unsigned char a;
  int i;
};
void bar()
{
  Foo obj1;
  std::memset(&obj1, 0, sizeof(Foo)); // zero out all bytes, padding included
  Foo obj2;
  std::memset(&obj2, 0, sizeof(Foo));
  // initialization part
  auto result = std::memcmp(&obj1, &obj2, sizeof(Foo));
}
However, this method has disadvantages.
- Calling 'memset' adds the overhead of zeroing out the entire memory area.
- Before calling 'memcmp', we should make sure that the memory for the object is zeroed out. This is easy to forget in a project with a complex control flow.
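If a byte-wise comparison has to stay in the code, the risk of applying it to an unsuitable type can be reduced at compile time. The sketch below is one possible guard, not part of the diagnostic itself. It assumes C++17 and relies on the standard 'std::has_unique_object_representations_v' trait. The 'bytewise_equal' wrapper and the 'Point' type are hypothetical names introduced for the illustration.

```cpp
#include <cstring>       // std::memcmp
#include <type_traits>   // std::has_unique_object_representations_v

struct Foo
{
  unsigned char a;
  int i;
};

// Hypothetical type with no padding on typical platforms.
struct Point
{
  int x;
  int y;
};

// Hypothetical wrapper: refuses to compile for types whose equal values
// may have different object representations (for example, due to padding).
template <typename T>
bool bytewise_equal(const T &lhs, const T &rhs) noexcept
{
  static_assert(std::has_unique_object_representations_v<T>,
                "T may contain padding; compare its members instead");
  return std::memcmp(&lhs, &rhs, sizeof(T)) == 0;
}
```

With this wrapper, 'bytewise_equal(Point{1, 2}, Point{1, 2})' compiles on typical platforms, whereas an attempt to call it with two 'Foo' objects is rejected by the 'static_assert' when 'Foo' contains padding.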
This diagnostic is classified as: