Webinar: Parsing C++ - 10.10
This article is the last in our series of articles where we share tips on how to write high-quality code, using the bugs found in the Chromium project as examples. Now, with 6 articles behind, there still remain a lot of errors we haven't discussed yet. They are too diverse to be put into specific categories, so in this 7th article, I'll just pick and talk about the most interesting defects.
As I mentioned in the introductory article, I've read the PVS-Studio report and have found about 250 bugs in the Chromium project and the third-party libraries it uses. Since I only glanced through the report, there are actually much more bugs to be found there.
After the introductory article, I wrote 6 more on various bug patterns. Those articles were rich in examples, but there are still about 70 bugs left, which I can't put into particular groups. Maybe I'm just tired. Well, there's another reason: a report for XNU is waiting for me, which I can't wait to get to.
That's why I decided to finish the series with this final post, where I show you the most interesting of the remaining bugs. As a reminder, here's the complete list of the bugs: chromium.txt.
Chromium project.
void DeviceMediaAsyncFileUtil::CreateOrOpen(
std::unique_ptr<FileSystemOperationContext> context, ....) {
....
CreateSnapshotFile(
std::move(context), url,
base::Bind(
&NativeMediaFileUtil::CreatedSnapshotFileForCreateOrOpen,
base::RetainedRef(context->task_runner()),
file_flags, callback));
}
PVS-Studio diagnostic message: V522 CWE-476 Dereferencing of the null pointer 'context' might take place. device_media_async_file_util.cc 322
Depending on the user's luck, or rather the compiler used, this code may both work well and end up with null-pointer dereference.
Whether the null pointer will be dereferenced depends on the order of evaluation of the arguments when calling to the CreateSnapshotFile function. In C++, the evaluation order of function arguments isn't specified (unspecified behavior). If std::move(context) argument happens to be evaluated first, then null-pointer dereference will occur in context->task_runner().
Tip. Don't try to squeeze as many operations as possible into one line, as this often leads to errors. With a more straightforward style, you have more chances to write code without mistakes.
Chromium project.
std::unordered_map<std::string, int> thread_colors_;
std::string TraceLog::EventToConsoleMessage(....) {
....
thread_colors_[thread_name] = (thread_colors_.size() % 6) + 1;
....
}
PVS-Studio diagnostic message: V708 CWE-758 Dangerous construction is used: 'm[x] = m.size()', where 'm' is of 'unordered_map' class. This may lead to undefined behavior. trace_log.cc 1343
This code is so complicated that you can't even figure out if its behavior is undefined or not. The reason is that the C++ standard is changing and some constructs, once defined as causing undefined behavior, become correct. Here are a few simple examples to explain this:
i = ++i + 2; // undefined behavior until C++11
i = i++ + 2; // undefined behavior until C++17
f(i = -2, i = -2); // undefined behavior until C++17
f(++i, ++i); // undefined behavior until C++17,
// unspecified after C++17
i = ++i + i++; // undefined behavior
cout << i << i++; // undefined behavior until C++17
a[i] = i++; // undefined behavior until C++17
n = ++i + i; // undefined behavior
Getting back to the Chromium code, the thread_colors_[thread_name] expression may create a new element in the container or fail to do so, returning a reference to an already existing element instead. The main point here is that thread_colors_[thread_name] could change the number of elements in the associative container.
The result of the (thread_colors_.size() % 6) + 1 expression depends on the number of elements in the associative container thread_colors_.
You'll get different values depending on which operand of the assignment operator, =, is evaluated first.
What does the evaluation order depend on? It depends on the language version used. But whatever the version, it's a bad idea to write code like that because it's very hard to read.
Tip. It's the same: don't try to squeeze as many operations as possible into one line.
ICU library.
U_DRAFT uint32_t U_EXPORT2 ubiditransform_transform(....)
{
....
const UBiDiAction *action = NULL;
....
if (action + 1) {
updateSrc(....);
}
....
}
PVS-Studio diagnostic message: V694 CWE-571 The condition (action + 1) is only false if there is pointer overflow which is undefined behavior anyway. ubiditransform.cpp 502
The condition is always true. In theory, an overflow could make it false, but this would result in undefined behavior.
WebRTC library.
std::vector<SdpVideoFormat>
StereoDecoderFactory::GetSupportedFormats() const
{
std::vector<SdpVideoFormat> formats = ....;
for (const auto& format : formats) { // <=
if (cricket::CodecNamesEq(....)) {
....
formats.push_back(stereo_format); // <=
}
}
return formats;
}
PVS-Studio diagnostic message: V789 CWE-672 Iterators for the 'formats' container, used in the range-based for loop, become invalid upon the call of the 'push_back' function. stereocodecfactory.cc 89
The analyzer detected invalidation of the iterator in a range-based for loop. The code above is equivalent to this:
for (auto format = begin(formats), __end = end(formats);
format != __end; ++format) {
if (cricket::CodecNamesEq(....)) {
....
formats.push_back(stereo_format);
}
}
Now you can see that, when calling to the push_back function, the iterators format and __end may get invalidated if the storage is reallocated inside the vector.
Tip. Remember that you must not change the number of container elements in range-based, as well as iterator-based, for loops.
Chromium project.
STDMETHOD(GetInputScopes)(InputScope** input_scopes,
UINT* count) override
{
if (!count || !input_scopes)
return E_INVALIDARG;
*input_scopes = static_cast<InputScope*>(CoTaskMemAlloc(
sizeof(InputScope) * input_scopes_.size()));
if (!input_scopes) {
*count = 0;
return E_OUTOFMEMORY;
}
....
}
PVS-Studio diagnostic message: V649 CWE-561 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains function return. This means that the second 'if' statement is senseless. Check lines: 67, 71. tsf_input_scope.cc 71
The second check of the input_scopes pointer doesn't make sense because if it happens to be null, the check in the beginning of the function will notice this and the function will return E_INVALIDARG.
The error here has to do with the missing * operator in "if (!input_scopes)". Because of that, the pointer returned by the CoTaskMemAlloc function is not checked. This is what the code should look like:
*input_scopes = static_cast<InputScope*>(CoTaskMemAlloc(
sizeof(InputScope) * input_scopes_.size()));
if (!*input_scopes) {
*count = 0;
return E_OUTOFMEMORY;
}
Skia library.
SkOpSpan* SkOpContour::undoneSpan() {
SkOpSegment* testSegment = &fHead;
bool allDone = true;
do {
if (testSegment->done()) {
continue;
}
allDone = false;
return testSegment->undoneSpan();
} while ((testSegment = testSegment->next()));
if (allDone) {
fDone = true;
}
return nullptr;
}
PVS-Studio finds this code suspicious for two reasons at once:
It's very suspicious, but I can't figure out how exactly it should work and where the error is. If you wish, you can try to find out for yourself what the undoneSpan function should actually look like.
WebKit library.
WebString StringConstraint::ToString() const {
....
bool first = true;
for (const auto& iter : exact_) {
if (!first)
builder.Append(", ");
builder.Append('"');
builder.Append(iter);
builder.Append('"');
}
....
}
PVS-Studio diagnostic message: V547 CWE-570 Expression '!first' is always false. webmediaconstraints.cpp 302
Since the first variable is always true, no commas will be added between the elements. Correct version:
bool first = true;
for (const auto& iter : exact_) {
if (first)
first = false;
else
builder.Append(", ");
builder.Append('"');
builder.Append(iter);
builder.Append('"');
}
ICU library.
uint32_t CollationDataBuilder::setPrimaryRangeAndReturnNext(....)
{
....
} else {
// Short range: Set individual CE32s.
for(;;) {
utrie2_set32(....);
++start;
primary = Collation::incThreeBytePrimaryByOffset(....);
if(start > end) { return primary; }
}
modified = TRUE; // <=
}
}
PVS-Studio diagnostic message: V779 CWE-561 Unreachable code detected. It is possible that an error is present. collationdatabuilder.cpp 392
The loop can be interrupted only by calling the return statement. It means the assignment operation after the loop will never be executed.
Ced library.
void HzBoostWhack(DetectEncodingState* destatep,
uint8 byte1, uint8 byte2)
{
if ((byte2 == '{') || (byte2 == '}')) {
// Found ~{ or ~}
Boost(destatep, F_HZ_GB_2312, kBoostOnePair);
} else if ((byte2 == '~') || (byte2 == '\n')) {
// neutral
destatep->enc_prob[F_HZ_GB_2312] += 0;
} else {
// Illegal pair
Whack(destatep, F_HZ_GB_2312, kBadPairWhack);
}
}
PVS-Studio diagnostic message: V751 Parameter 'byte1' is not used inside function body. compact_enc_det.cc 2559
It's very suspicious that the byte1 argument is not used in the function. I don't know if this is an error, but even if not, one shouldn't write code like that, as it may confuse both maintainers and compilers.
Sometimes programmers have wrong assumptions about how certain functions or language constructs work. Let's take a look at some of the errors resulting from such assumptions.
Chromium project.
void OnConvertedClientDisconnected() {
// We have no direct way of tracking which
// PdfToEmfConverterClientPtr got disconnected as it is a
// movable type, short of using a wrapper.
// Just traverse the list of clients and remove the ones
// that are not bound.
std::remove_if(
g_converter_clients.Get().begin(),
g_converter_clients.Get().end(),
[](const mojom::PdfToEmfConverterClientPtr& client) {
return !client.is_bound();
});
}
PVS-Studio diagnostic message: V530 CWE-252 The return value of function 'remove_if' is required to be utilized. pdf_to_emf_converter.cc 44
The remove_if function doesn't remove anything but only moves the elements inside the container. The code should probably look like this:
auto trash = std::remove_if(........);
g_converter_clients.Get().erase(trash,
g_converter_clients.Get().end());
V8 engine.
void StringStream::Add(....) {
....
case 'f': case 'g': case 'G': case 'e': case 'E': {
double value = current.data_.u_double_;
int inf = std::isinf(value);
if (inf == -1) {
Add("-inf");
} else if (inf == 1) {
Add("inf");
} else if (std::isnan(value)) {
Add("nan");
} else {
EmbeddedVector<char, 28> formatted;
SNPrintF(formatted, temp.start(), value);
Add(formatted.start());
}
break;
} ....
}
PVS-Studio diagnostic message: V547 CWE-570 Expression 'inf == - 1' is always false. string-stream.cc 149
Here's the description of the std::isinf function: isinf.
As you can see, std::isinf returns a value of type bool, so the way it is used here is obviously incorrect.
Skia library.
GrGLProgram* GrGLProgramBuilder::finalize() {
....
std::unique_ptr<char> binary(new char[length]);
....
}
PVS-Studio diagnostic message: V554 CWE-762 Incorrect use of unique_ptr. The memory allocated with 'new []' will be cleaned using 'delete'. grglprogrambuilder.cpp 272
The storage is allocated by the operator new[] but freed by the operator delete. The unique_ptr class needs a hint as to how to manage memory. Correct version:
std::unique_ptr<char[]> binary(new char[length]);
Another slip-up found in the same library:
GrGLProgram* GrGLProgramBuilder::finalize() {
....
std::unique_ptr<uint8_t> data((uint8_t*) malloc(dataLength));
....
}
PVS-Studio diagnostic message: V554 CWE-762 Incorrect use of unique_ptr. The memory allocated with 'malloc' will be cleaned using 'delete'. grglprogrambuilder.cpp 275
It seems one of the developers discovered the std::unique_ptr class but didn't have enough time to learn how to use it properly :). The storage is allocated by the malloc function but freed by the operator delete.
Correct code:
std::unique_ptr<uint8_t, void (*)(void*)>
data((uint8_t*) malloc(dataLength), std::free);
WebKit library.
struct ScrollAnchorData {
WebString selector_;
WebFloatPoint offset_;
uint64_t simhash_;
ScrollAnchorData(const WebString& selector,
const WebFloatPoint& offset,
uint64_t simhash)
: selector_(selector), offset_(offset), simhash_(simhash) {}
ScrollAnchorData() {
ScrollAnchorData(WebString(), WebFloatPoint(0, 0), 0); }
};
PVS-Studio diagnostic message: V603 CWE-665 The object was created but it is not being used. If you wish to call constructor, 'this->ScrollAnchorData::ScrollAnchorData(....)' should be used. webscrollanchordata.h 49
It's not calling one constructor from another. It's just that one unnamed object is created and immediately destroyed.
Incorrect checks for null pointers are very common in applications. There are two types of this bug:
The first type is when the pointer is first dereferenced and only then checked:
p[n] = 1;
if (!p) return false;
The second type is when the programmer checks the pointer before the first use but forgets to do so before a second use:
if (p) p[0] = x;
p[1] = y;
The former are detected by the V595 diagnostic, the latter by the V1004 diagnostic.
Such defects aren't always that bad. Firstly, some pointers can never become null. In that case, there's no error at all - just an extra check confusing both programmers and code analyzers. Secondly, some pointers may become null only in very rare cases, so the error does not affect the program under normal conditions.
That said, the developers should still pay attention to V595 and V1004 warnings and fix them where necessary. PVS-Studio issued a lot of such messages on the code of Chromium and the libraries. Unfortunately, as I mentioned in the introductory article, they are mostly false positives because of the use of the DCHECK macro. So, I got bored of working through them pretty soon. V595 and V1004 warnings need to be examined more carefully after some tweaking of the analyzer's settings.
In any case, I assure you that there are a lot of bugs there that have to do with incorrect pointer checks. Some of them are cited in chromium.txt. Finding the rest would take a few heroes who could configure the analyzer and study the new report.
I won't cite all the bugs I found, as they all look pretty much the same. Instead, I'll show you just two examples for each diagnostic so you can grasp the idea of what type of errors I'm talking about.
V595, example one, Chromium project.
template <typename T>
void PaintOpReader::ReadFlattenable(sk_sp<T>* val) {
// ....
// Argument val is not used nor checked here.
// ....
val->reset(static_cast<T*>(SkValidatingDeserializeFlattenable(
const_cast<const char*>(memory_), bytes,
T::GetFlattenableType())));
if (!val)
SetInvalid();
....
}
PVS-Studio diagnostic message: V595 CWE-476 The 'val' pointer was utilized before it was verified against nullptr. Check lines: 124, 126. paint_op_reader.cc 124
The val pointer is dereferenced without being checked for nullptr.
V595, example two, Chromium project.
void HttpAuthHandlerRegistryFactory::RegisterSchemeFactory(
const std::string& scheme,
HttpAuthHandlerFactory* factory)
{
factory->set_http_auth_preferences(http_auth_preferences());
std::string lower_scheme = base::ToLowerASCII(scheme);
if (factory)
factory_map_[lower_scheme] = base::WrapUnique(factory);
else
factory_map_.erase(lower_scheme);
}
PVS-Studio diagnostic message: V595 CWE-476 The 'factory' pointer was utilized before it was verified against nullptr. Check lines: 122, 124. http_auth_handler_factory.cc 122
The factory pointer is dereferenced without being checked for nullptr.
V1004, example one, PDFium library.
void CFX_PSRenderer::SetClip_PathStroke(....,
const CFX_Matrix* pObject2Device, ....)
{
....
if (pObject2Device) {
....
}
....
m_ClipBox.Intersect(
pObject2Device->TransformRect(rect).GetOuterRect());
....
}
PVS-Studio diagnostic message: V1004 CWE-476 The 'pObject2Device' pointer was used unsafely after it was verified against nullptr. Check lines: 237, 248. cfx_psrenderer.cpp 248
The pObject2Device pointer could be null, which is indicated by the nullptr check. However, this pointer is dereferenced a few lines later without such a check.
V1004, example two, SwiftShader library.
VertexProgram::VertexProgram(...., const VertexShader *shader)
: VertexRoutine(state, shader),
shader(shader),
r(shader->dynamicallyIndexedTemporaries)
{
....
if(shader && shader->containsBreakInstruction())
{
enableBreak = ....;
}
if(shader && shader->containsContinueInstruction())
{
enableContinue = ....;
}
if(shader->isInstanceIdDeclared())
{
instanceID = ....;
}
}
PVS-Studio diagnostic message: V1004 CWE-476 The 'shader' pointer was used unsafely after it was verified against nullptr. Check lines: 43, 53. vertexprogram.cpp 53
The shader pointer could be null, which is indicated by the nullptr checks. However, the pointer is dereferenced a few lines later without such a check.
We - the PVS-Studio team - are greeting Google developers and would like to say that we are open to cooperation. There are at least two possibilities:
Welcome to try PVS-Studio. Feel free to email us. We will help you in checking your projects and give you a temporary license so that you can test it in full.
Thanks to everyone who has made it through the whole series. I hope you enjoyed it.
As you can see, even such a high-quality project as Chromium has many errors that PVS-Studio can detect. Why don't you start using it with your projects too?
0