Our website uses cookies to enhance your browsing experience.
Accept
to the top
close form

Fill out the form in 2 simple steps below:

Your contact information:

Step 1
Congratulations! This is your promo code!

Desired license type:

Step 2
Team license
Enterprise license
** By clicking this button you agree to our Privacy Policy statement
close form
Request our prices
New License
License Renewal
--Select currency--
USD
EUR
* By clicking this button you agree to our Privacy Policy statement

close form
Free PVS‑Studio license for Microsoft MVP specialists
* By clicking this button you agree to our Privacy Policy statement

close form
To get the licence for your open-source project, please fill out this form
* By clicking this button you agree to our Privacy Policy statement

close form
I am interested to try it on the platforms:
* By clicking this button you agree to our Privacy Policy statement

close form
check circle
Message submitted.

Your message has been sent. We will email you at


If you do not see the email in your inbox, please check if it is filtered to one of the following folders:

  • Promotion
  • Updates
  • Spam

Webinar: Evaluation - 05.12

>
>
>
Examples of errors that PVS-Studio foun…

Examples of errors that PVS-Studio found in LLVM 15.0

Oct 25 2022
Author:

Compilers are evolving: they issue more and more warnings. Do developers still need to use static code analyzers like PVS-Studio? Yes, because analyzers are evolving too. In this article you'll see how PVS-Studio can find bugs even in a compiler.

1003_LLVM_15/image1.png

Purpose of this article

Sometimes developers wonder: is it worth trying some additional code quality control tools or will a compiler/IDE be enough? Do such tools have the upper hand against, let's say, "-Wall"? Should they integrate tools like PVS-Studio in their development process?

These are all fair questions. Compilers evolve - they learn to find new bugs and gradually learn to do more things that only static analyzers did before.

The answer is yes. It is worth using additional static analysis tools. These tools evolve as well and learn to perform deeper code analysis. Third-party tools solve a specific task: they search for errors and potential vulnerabilities. Therefore, they solve this issue better than compilers, for which this is only one of the functions.

However, this is only theoretical. How are the things in reality?

To answer this question confidently, our team analyzes compilers' code from time to time. If we can find errors in their code, it means that PVS-Studio is still a necessary and in-demand tool for those who care about the quality of their code. Some of these articles are:

We skipped LLVM 14 but decided to write an article about LLVM 15.

Wikipedia says: LLVM is a set of compiler and toolchain technologies used to develop a front end for any programming language and a back end for any instruction set architecture. It's designed around a language-independent intermediate representation (IR) that serves as a portable, high-level assembly language that can be optimized with a variety of transformations over multiple passes. The project's website: llvm.org.

During the development of LLVM, Clang's diagnostic abilities are used to prevent errors. LLVM is checked with Clang Static Analyzer and clang-tidy. Moreover, LLVM is checked with Coverity, a proprietary analyzer.

If a static analyzer finds even a few errors in LLVM, this is a great achievement. And PVS-Studio did find them!

Project analysis, warnings, false positives

LLVM is a big project. Therefore, if we do not fine-tune PVS-Studio first, it will issue lots of false positives or not-so-useful warnings. Since I don't need to implement PVS-Studio into a development process, I didn't configure the analyzer and just reviewed the "raw" report.

This is a bad way to work with the analyzer. However, my main goal was to write this article. And I nailed it:). The correct way to start using the analyzer is described in this article: "How to introduce a static code analyzer in a legacy project and not to discourage the team".

So, the report had some false positives. I expected that — every analyzer issues them. I was curious about something else. I saw a lot of warnings that cannot be called false. The analyzer's right. But it is difficult to call these messages useful either. Let's take a closer look at this.

A lot of ambiguous warnings are associated with conditions that are always true or false. Because of this, I almost abandoned reviewing V547 — Expression is always true/false. It's tedious and boring. Let's look at a couple of examples to see what I mean.

int L = -1, V = -1, T = -1;
if (L != -1 && V != -1 && T != -1 && Name.empty()) {  // <=
  if (!Strict) {
    Buffer.append("HANGUL SYLLABLE ");
    if (L != -1)                                      // <=
      Buffer.append(HangulSyllables[L][0]);
    if (V != -1)                                      // <=
      Buffer.append(HangulSyllables[V][1]);
    if (T != -1)                                      // <=
      Buffer.append(HangulSyllables[T][2]);
  }

Here the analyzer issues 3 warnings at once:

  • V547 [CWE-571] Expression 'L != - 1' is always true. UnicodeNameToCodepoint.cpp 306
  • V547 [CWE-571] Expression 'V != - 1' is always true. UnicodeNameToCodepoint.cpp 308
  • V547 [CWE-571] Expression 'T != - 1' is always true. UnicodeNameToCodepoint.cpp 310

Indeed, the L, V, and T variables are checked twice. Double checks are redundant, but it's not an error.

Another example:

while (top_reader_sp) {
  if (!top_reader_sp)
    break;
  top_reader_sp->Run();

PVS-Studio warning: V547 [CWE-570] Expression '!top_reader_sp' is always false. Debugger.cpp 1042

The conditional operator doesn't make sense. This looks like the consequences of many edits and refactoring. The code was repeatedly rewritten and here's the result. Yes, we can call it atavism but not an error.

There were lots of such code fragments in the project. Maybe this happens because the project is big and many things change a lot. As a result, we see such artifacts. Perhaps we can call it redundant code. From this point of view, all analyzer warnings are still useful since they simplify the code.

42 errors in LLVM

Why 42? Because "the answer to the ultimate question of life, the universe, and everything is 42". Writing more error examples was boring. Nobody's going to read a long report anyway. Except for, perhaps, the LLVM developers... But it's more useful for them to set up the analyzer, check the project themselves and inspect the warnings.

Fragment N1

A classic error similar to those described in the article "The evil within the comparison functions". Of course, this is not exactly a comparison function. However, the point is the same. There are two objects on which checks are performed. This function is quite boring for a thorough code review — that's why an error hides within it. Fortunately, a static analyzer can notice it. The analyzer never gets tired or lazy :).

static bool isCopyCompatibleType(LLT SrcTy, LLT DstTy) {
  if (SrcTy == DstTy)
    return true;

  if (SrcTy.getSizeInBits() != DstTy.getSizeInBits())
    return false;

  SrcTy = SrcTy.getScalarType();
  DstTy = DstTy.getScalarType();

  return (SrcTy.isPointer() && DstTy.isScalar()) ||
         (DstTy.isScalar() && SrcTy.isPointer());
}

PVS-Studio warning: V501 [CWE-570] There are identical sub-expressions '(SrcTy.isPointer() && DstTy.isScalar())' to the left and to the right of the '||' operator. CallLowering.cpp 1198

The error is here:

(SrcTy.isPointer() && DstTy.isScalar()) ||
(DstTy.isScalar()  && SrcTy.isPointer())

The developer simultaneously changed Src to Dst and Pointer to Scalar in the second line. As a result, we get the same check twice! The code above is equivalent to:

(SrcTy.isPointer() && DstTy.isScalar()) ||
(SrcTy.isPointer() && DstTy.isScalar())

The correct check:

(SrcTy.isPointer() && DstTy.isScalar()) ||
(SrcTy.isScalar()  && DstTy.isPointer())

Fragment N2

Another related error.

static ValueKnowledge meet(const ValueKnowledge &lhs,
                           const ValueKnowledge &rhs) {
  ValueKnowledge result = getPessimisticValueState();
  result.hasError = true;

  if (!rhs || !rhs || lhs.dtype != rhs.dtype)   // <=
    return result;

  result.hasError = false;
  result.dtype = lhs.dtype;
  ....
}

PVS-Studio warning: V501 [CWE-570] There are identical sub-expressions to the left and to the right of the '||' operator: !rhs ||!rhs ShapeUtils.h 141

The rhs object is checked twice, and the lhs object is not checked at all. The usual rush when writing code and the lack of thorough code review after that.

Fragment N3

std::optional<parser::MessageFixedText> Scope::SetImportKind(ImportKind kind) {
  ....
  } else if (kind != *importKind_ &&
      (kind != ImportKind::Only || kind != ImportKind::Only)) {
    return 
      "Every IMPORT must have ONLY specifier if one of them does"_err_en_US;
  } else {
  ....
}

PVS-Studio warning: V501 [CWE-570] There are identical sub-expressions 'kind != ImportKind::Only' to the left and to the right of the '||' operator. scope.cpp 275

It's pointless to perform the same check twice. This is obviously a typo. Some other named constant should have been used instead of ImportKind::Only.

Note. If the expression had been framed in the form of a table, and not in one line, then the error would have been easier to notice. See chapter 13. Table-style formatting" :) I'd write the code like this:

} else if (kind != *importKind_ &&
    (   kind != ImportKind::Only
     || kind != ImportKind::Only)) {
  return ....;

Fragment N4

Since I have nothing to do with the development of LLVM, the next piece of code is a mystery to me. What I mean is that although I can see the code is erroneous, I have a hard time thinking of a correct version for it.

bool Merger::maybeZero(unsigned e) const {
  if (tensorExps[e].kind == kInvariant) {
    if (auto c = tensorExps[e].val.getDefiningOp<complex::ConstantOp>()) {
      ArrayAttr arrayAttr = c.getValue();
      return arrayAttr[0].cast<FloatAttr>().getValue().isZero() &&
             arrayAttr[0].cast<FloatAttr>().getValue().isZero();
    }
    if (auto c = tensorExps[e].val.getDefiningOp<arith::ConstantIntOp>())
      return c.value() == 0;
    if (auto c = tensorExps[e].val.getDefiningOp<arith::ConstantFloatOp>())
      return c.value().isZero();
  }
  return true;
}

PVS-Studio warning: V501 [CWE-571] There are identical sub-expressions 'arrayAttr[0].cast < FloatAttr > ().getValue().isZero()' to the left and to the right of the '&&' operator. Merger.cpp 812

The suspicious fragment:

return arrayAttr[0].cast<FloatAttr>().getValue().isZero() &&
       arrayAttr[0].cast<FloatAttr>().getValue().isZero();

Fragment N5

The last error detected by the V501 diagnostic rule.

template <int KIND>
Expr<Type<TypeCategory::Logical, KIND>> FoldIntrinsicFunction(....)
{
  ....
  } else if (name == "__builtin_ieee_support_datatype" ||
      name == "__builtin_ieee_support_denormal" ||
      name == "__builtin_ieee_support_divide" ||     // <=
      name == "__builtin_ieee_support_divide" ||     // <=
      name == "__builtin_ieee_support_inf" ||
      name == "__builtin_ieee_support_io" ||
      name == "__builtin_ieee_support_nan" ||
      name == "__builtin_ieee_support_sqrt" ||
      name == "__builtin_ieee_support_standard" ||
      name == "__builtin_ieee_support_subnormal" ||
      name == "__builtin_ieee_support_underflow_control") {
    return Expr<T>{true};
  }
  ....
}

PVS-Studio warning: V501 [CWE-570] There are identical sub-expressions 'name == "__builtin_ieee_support_divide"' to the left and to the right of the '||' operator. fold-logical.cpp 218

I marked two identical checks with comments. Most likely one of them is just redundant. I don't think this is a real error. However, there is a small chance that they forgot to compare name with some other string literal here.

Fragment N6

First, note that bmap is a pointer.

struct isl_coalesce_info {
  isl_basic_map *bmap;
  ....
};

Now look at the redundant code that pointlessly checks this class member:

static isl_stat tab_insert_divs(struct isl_coalesce_info *info, ....)
{
  ....
  if (any) {
    if (isl_tab_rollback(info->tab, snap) < 0)
      return isl_stat_error;
    info->bmap = isl_basic_map_cow(info->bmap);
    info->bmap = isl_basic_map_free_inequality(info->bmap, 2 * n);

    if (info->bmap < 0)                 // <=
      return isl_stat_error;

    return fix_constant_divs(info, n, expanded);
  }
  ....
}

PVS-Studio warning: V503 [CWE-697, CERT-EXP08-C] This is a nonsensical comparison: pointer < 0. isl_coalesce.c 3181

It's pointless to perform the "pointer < 0" check. Most likely, it's a typo or the code is not finished yet. For example, some member of the isl_basic_map struct should be compared with 0.

Fragment N7

There's a rather rare error. As of now, we've met only two such bugs in open-source projects.

void FunctionLoweringInfo::ComputePHILiveOutRegInfo(const PHINode *PN) {
  ....
  Register DestReg = It->second;
  if (DestReg == 0)
    return
  assert(Register::isVirtualRegister(DestReg) &&
         "Expected a virtual reg");
  LiveOutRegInfo.grow(DestReg);
  ....
}

PVS-Studio warning: V504 [CWE-841] It is highly probable that the semicolon ';' is missing after 'return' keyword. FunctionLoweringInfo.cpp 454

It's a cool bug, although it's not scary. There is no semicolon after the return statement. As a result, the code does not work as it looks.

Seems like assert is executed anyways, even if the DestReg == 0 condition is false. In fact, assert is executed only when DestReg == 0.

The error is not terrible only because of luck. Actually, assert does not have a noticeable effect on the execution of the program. And in the release version, it turns into nothing at all.

Fragment N8, N9

I haven't studied the history behind this code fragment. But I suppose that once the malloc function was used here, and then it was roughly replaced with new and shared_ptr.

std::shared_ptr<uint8_t>
RenderScriptRuntime::GetAllocationData(....) {
  ....
  const uint32_t size = *alloc->size.get();
  std::shared_ptr<uint8_t> buffer(new uint8_t[size]);
  if (!buffer) {
    LLDB_LOGF(log, "%s - couldn't allocate a %" PRIu32 " byte buffer",
              __FUNCTION__, size);
    return nullptr;
  }
  ....
  return buffer;
}

PVS-Studio warning: V554 [CWE-762, CERT-MEM51-CPP] Incorrect use of shared_ptr. The memory allocated with 'new []' will be cleaned using 'delete'. RenderScriptRuntime.cpp 2371

The std::shared_ptr smart pointer is used incorrectly:

std::shared_ptr<uint8_t> buffer(new uint8_t[size]);

The memory is freed via the delete operator instead of delete[] which leads to UB. For more details, see "Why do arrays have to be deleted via delete[] in C++". The correct code:

std::shared_ptr<uint8_t []> buffer(new uint8_t[size]);

Let me explain why I assumed that malloc used to be here. Look at the subsequent check of the pointer for nullptr. It's meaningless — in case of the memory allocation failure the std::bad_alloc exception will be thrown. The check and the debugging message output is atavism.

There's another warning: V554 [CWE-762, CERT-MEM51-CPP] Incorrect use of shared_ptr. The memory allocated with 'new []' will be cleaned using 'delete'. RenderScriptRuntime.cpp 2698

Fragment N10

Oh, I haven't seen such a remarkable error in a long time! Definitely it's a gem of a typo!

1003_LLVM_15/image2.png
OwningMemRef &operator=(const OwningMemRef &&other) {
  freeFunc = other.freeFunc;
  descriptor = other.descriptor;
  other.freeFunc = nullptr;
  memset(0, &other.descriptor, sizeof(other.descriptor));
}

PVS-Studio warning: V575 [CWE-628, CERT-EXP37-C] The null pointer is passed into 'memset' function. Inspect the first argument. MemRefUtils.h 194

I described various ways how you can make an error using the memset function in the article "The most dangerous function in the C/C++ world". But this is something new!

Here, the first and the second argument of the function are mixed up. Here, the data is written at the null pointer.

Actually, it won't happen at all. The code won't compile — the function expects to take int as the second actual argument, but receives a pointer. The code is in a class template. And while the function is not used, it is not instantiated. The compiler doesn't notice that the code is incorrect. In the case of templates, the compiler may not issue warnings for the incorrect code as long as it's not used.

In general, it's okay but it's still an error that is better be fixed in advance.

Fragment N11-N17

The most common issue among all that we've found in open-source projects is dereferencing a pointer before checking. At least, our error database has the most entries of issues of this kind. Source code of LLVM is no exception.

void LibCallSimplifier::classifyArgUse(....) {
  CallInst *CI = dyn_cast<CallInst>(Val);
  Module *M = CI->getModule();

  if (!CI || CI->use_empty())
    return;
  ....
}

PVS-Studio warning: PVS-Studio: V595 [CWE-476, CERT-EXP12-C] The 'CI' pointer was utilized before it was verified against nullptr. Check lines: 2515, 2517. SimplifyLibCalls.cpp 2515

The CI pointer is first dereferenced during the CI->getModule() function call. And only after that it's checked for nullptr.

Another example:

void Module::FindFunctions(....) {
  ....
  for (size_t i = 0; i < num_matches; ++i) {
    sc.symbol = symtab->SymbolAtIndex(symbol_indexes[i]);
    SymbolType sym_type = sc.symbol->GetType();
    if (sc.symbol && (sym_type == eSymbolTypeCode ||
                      sym_type == eSymbolTypeResolver))
      sc_list.Append(sc);
  }
  ....
}

PVS-Studio warning: V595 [CWE-476, CERT-EXP12-C] The 'sc.symbol' pointer was utilized before it was verified against nullptr. Check lines: 877, 878. Module.cpp 877

To shorten the expression, the developer decided to pre-save a character type in the sym_type variable:

SymbolType sym_type = sc.symbol->GetType();

The thing is, they did this without considering that the sc.symbol pointer may be null. We can see this from the next check:

if (sc.symbol && (sym_type == eSymbolTypeCode ||
                  sym_type == eSymbolTypeResolver))

I'm not interested in inspecting such errors further. I didn't even inspect all the V595 warnings.

  • V595 [CWE-476, CERT-EXP12-C] The 'sc.symbol' pointer was utilized before it was verified against nullptr. Check lines: 899, 900. Module.cpp 899
  • V595 [CWE-476, CERT-EXP12-C] The 'process' pointer was utilized before it was verified against nullptr. Check lines: 159, 184. IRExecutionUnit.cpp 159
  • V595 [CWE-476, CERT-EXP12-C] The 'localVarCst' pointer was utilized before it was verified against nullptr. Check lines: 77, 96. AffineStructures.cpp 77
  • V595 [CWE-476, CERT-EXP12-C] The 'Class' pointer was utilized before it was verified against nullptr. Check lines: 58, 60. Transforms.cpp 58
  • V595 [CWE-476, CERT-EXP12-C] The 'CurPPLexer' pointer was utilized before it was verified against nullptr. Check lines: 739, 743. PPDirectives.cpp 739
  • Perhaps, there are more of them. I didn't look closely. It is not interesting to study them, and even more so to write about them, since they are of the same type.

Fragment N18

Optional<SmallVector<ReassociationIndices>>
mlir::getReassociationIndicesForCollapse(ArrayRef<int64_t> sourceShape,
                                         ArrayRef<int64_t> targetShape) {
  unsigned sourceDim = 0;
  ....
  int64_t currTargetShape = targetShape[targetDim];
  while (sourceShape[sourceDim] != ShapedType::kDynamicSize &&
         prodOfCollapsedDims * sourceShape[sourceDim] < currTargetShape &&
         sourceDim < sourceShape.size()) {
    prodOfCollapsedDims *= sourceShape[sourceDim];
    currIndices.push_back(sourceDim++);
  }
  ....
}

PVS-Studio warning: V781 [CWE-20, CERT-API00-C] The value of the 'sourceDim' index is checked after it was used. Perhaps there is a mistake in program logic. ReshapeOpsUtils.cpp 49

Potential array index out of bounds. Look closely at this loop condition:

sourceShape[sourceDim] != ShapedType::kDynamicSize && .... &&
sourceDim < sourceShape.size()

The index is checked before the array element is accessed. The code below looks safer and more logical:

sourceDim < sourceShape.size() &&
sourceShape[sourceDim] != ShapedType::kDynamicSize && ....

Fragment N19-N36

Unexpectedly, there are many dangerous dereferencing of pointers in class constructors. The pointer is dereferenced when the class members are initialized. Then this pointer is checked for nullptr in the constructor body. Example:

typedef std::shared_ptr<lldb_private::ValueObject> ValueObjectSP;

lldb_private::formatters::LibCxxMapIteratorSyntheticFrontEnd::
    LibCxxMapIteratorSyntheticFrontEnd(lldb::ValueObjectSP valobj_sp)
    : SyntheticChildrenFrontEnd(*valobj_sp), m_pair_ptr(), m_pair_sp() {
  if (valobj_sp)
    Update();
}

PVS-Studio warning: V664 [CWE-476, CERT-EXP34-C] The 'valobj_sp' pointer is being dereferenced on the initialization list before it is verified against null inside the body of the constructor function. Check lines: 211, 212. LibCxx.cpp 211

  • V664 [CWE-476, CERT-EXP34-C] The 'valobj_sp' pointer is being dereferenced on the initialization list before it is verified against null inside the body of the constructor function. Check lines: 381, 382. LibCxx.cpp 381
  • V664 [CWE-476, CERT-EXP34-C] The 'valobj_sp' pointer is being dereferenced on the initialization list before it is verified against null inside the body of the constructor function. Check lines: 551, 552. LibCxx.cpp 551
  • V664 [CWE-476, CERT-EXP34-C] The 'valobj_sp' pointer is being dereferenced on the initialization list before it is verified against null inside the body of the constructor function. Check lines: 635, 636. LibCxx.cpp 635
  • V664 [CWE-476, CERT-EXP34-C] The 'valobj_sp' pointer is being dereferenced on the initialization list before it is verified against null inside the body of the constructor function. Check lines: 49, 50. LibCxxInitializerList.cpp 49
  • V664 [CWE-476, CERT-EXP34-C] The 'valobj_sp' pointer is being dereferenced on the initialization list before it is verified against null inside the body of the constructor function. Check lines: 70, 71. LibCxxSpan.cpp 70
  • V664 [CWE-476, CERT-EXP34-C] The 'valobj_sp' pointer is being dereferenced on the initialization list before it is verified against null inside the body of the constructor function. Check lines: 311, 312. LibCxxList.cpp 311
  • V664 [CWE-476, CERT-EXP34-C] The 'valobj_sp' pointer is being dereferenced on the initialization list before it is verified against null inside the body of the constructor function. Check lines: 207, 208. LibCxxMap.cpp 207
  • V664 [CWE-476, CERT-EXP34-C] The 'valobj_sp' pointer is being dereferenced on the initialization list before it is verified against null inside the body of the constructor function. Check lines: 71, 72. LibCxxVector.cpp 71
  • V664 [CWE-476, CERT-EXP34-C] The 'valobj_sp' pointer is being dereferenced on the initialization list before it is verified against null inside the body of the constructor function. Check lines: 174, 176. LibCxxVector.cpp 174
  • V664 [CWE-476, CERT-EXP34-C] The 'valobj_sp' pointer is being dereferenced on the initialization list before it is verified against null inside the body of the constructor function. Check lines: 57, 59. LibCxxUnorderedMap.cpp 57
  • V664 [CWE-476, CERT-EXP34-C] The 'valobj_sp' pointer is being dereferenced on the initialization list before it is verified against null inside the body of the constructor function. Check lines: 80, 82. LibStdcpp.cpp 80
  • V664 [CWE-476, CERT-EXP34-C] The 'valobj_sp' pointer is being dereferenced on the initialization list before it is verified against null inside the body of the constructor function. Check lines: 183, 185. LibStdcpp.cpp 183
  • V664 [CWE-476, CERT-EXP34-C] The 'valobj_sp' pointer is being dereferenced on the initialization list before it is verified against null inside the body of the constructor function. Check lines: 354, 355. LibStdcpp.cpp 354
  • V664 [CWE-476, CERT-EXP34-C] The 'valobj_sp' pointer is being dereferenced on the initialization list before it is verified against null inside the body of the constructor function. Check lines: 464, 465. NSArray.cpp 464
  • V664 [CWE-476, CERT-EXP34-C] The 'valobj_sp' pointer is being dereferenced on the initialization list before it is verified against null inside the body of the constructor function. Check lines: 608, 610. NSArray.cpp 608
  • V664 [CWE-476, CERT-EXP34-C] The 'valobj_sp' pointer is being dereferenced on the initialization list before it is verified against null inside the body of the constructor function. Check lines: 404, 405. NSSet.cpp 404
  • V664 [CWE-476, CERT-EXP34-C] The 'valobj_sp' pointer is being dereferenced on the initialization list before it is verified against null inside the body of the constructor function. Check lines: 671, 673. NSSet.cpp 671

Such code doesn't always cause troubles. Perhaps null pointers are simply never passed to constructors. However, it's still impossible to call the code correct and good.

An attentive reader may object that passing a null pointer is not always a problem. The SyntheticChildrenFrontEnd constructor stores a reference to the object. If we pass nullptr, there will be a reference to a non-existent object. However, if we don't interact with this reference in any way, there will be no troubles. Not nice, not bad either.

No, we can't rely on such an assumption. Null pointer dereferencing is UB. And we cannot predict how it may manifest. For example, the compiler sees that the pointer is dereferenced. The compiler assumes that the pointer is not null. Thus, it can delete the if (valobj_sp) check for the sake of optimization. Read more about it here: "Null pointer dereferencing causes undefined behavior". And here: "What every C programmer should know about undefined behavior #2/3".

Fragment N37

uint64_t NullabilityPayload = 0;

static constexpr const unsigned NullabilityKindMask = 0x3;

void addTypeInfo(unsigned index, NullabilityKind kind) {
  ....
  NullabilityPayload &=
    ~(NullabilityKindMask << (index * NullabilityKindSize));
  ....
}

PVS-Studio warning: V784 [CWE-197, CERT-INT31-C] The size of the bit mask is less than the size of the first operand. This will cause the loss of higher bits. Types.h 529

The highest bits in the NullabilityPayload variable will be lost. Let me explain how it happens.

The NullabilityKindMask variable is represented by the unsigned 32-bit type. No matter how many bits we shift in this variable, the result will still be of the unsigned type. The bitwise NOT operator (~) will also not change the type of the expression.

And only when the &= operator is executed, the unsigned type will be extended to the uint64_t type. Since the type was unsigned, then the highest bits will always be zero. As a result, the following operation is performed:

NullabilityPayload &= 0x00000000xxxxxxxx;

This will reset the highest bits in NullabilityPayload. The correct code:

NullabilityPayload &=
  ~(static_cast<uint64_t>(NullabilityKindMask)
     << (index * NullabilityKindSize));

Another option to fix the code is to make the NullabilityKindMask 64-bit:

static constexpr const uint64_t NullabilityKindMask = 0x3;
....
NullabilityPayload &=
  ~(NullabilityKindMask << (index * NullabilityKindSize));

Fragment N38

It's entertaining to notice an error in a test and realize that it isn't testing what was intended. This is a good example when static analysis complements unit tests and the TDD methodology. Nobody writes tests for tests :). Fortunately, a static analyzer helps find errors not only in main code but in tests as well.

TEST(RegisterContextMinidump, ConvertMinidumpContext_x86_64) {
  MinidumpContext_x86_64 Context;
  ....
  Context.rax = 0x0001020304050607;
  Context.rbx = 0x08090a0b0c0d0e0f;
  ....
  Context.eflags = 0x88898a8b;
  Context.cs = 0x8c8d;
  Context.fs = 0x8e8f;
  Context.gs = 0x9091;
  Context.ss = 0x9293;
  Context.ds = 0x9495;
  Context.ss = 0x9697;
  llvm::ArrayRef<uint8_t> ContextRef(reinterpret_cast<uint8_t *>(&Context),
                                     sizeof(Context));
  ....
}

PVS-Studio warning: V519 [CWE-563, CERT-MSC13-C] The 'Context.ss' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 110, 112. RegisterContextMinidumpTest.cpp 112

Look at these code lines:

Context.ss = 0x9293;
....
Context.ss = 0x9697;

A typo. Apparently, Context.es should have been written in one place.

Fragment N39, 40

At the beginning of this article I mentioned that I wouldn't deal with some warnings like V547. However, I reviewed them a little and noticed this error. This error was detected by the data flow analysis technology that takes into account the possible values of variables at different points in program. If you want to know more about the analyzer's work, you can read "PVS-Studio: static code analysis technology".

static std::unordered_multimap<char32_t, std::string>
loadDataFiles(const std::string &NamesFile, const std::string &AliasesFile) {
  ....
  auto FirstSemiPos = Line.find(';');
  if (FirstSemiPos == std::string::npos)                   // <=
    continue;
  auto SecondSemiPos = Line.find(';', FirstSemiPos + 1);
  if (FirstSemiPos == std::string::npos)                   // <=
    continue;
  ....
}

PVS-Studio warning: V547 [CWE-570] Expression 'FirstSemiPos == std::string::npos' is always false. UnicodeNameMappingGenerator.cpp 46

Most likely the code was written via copy-paste. The following fragment was copied:

auto FirstSemiPos = Line.find(';');
if (FirstSemiPos == std::string::npos)
  continue;

Then the First was replaced with the Second, but not everywhere. As a result, the following line

if (FirstSemiPos == std::string::npos)

remains unchanged. Since this condition was checked above, the analyzer reports that the condition is always false. Such errors can be overlooked during code review, but PVS-Studio can easily detect them.

Here's the same error again:

Status Host::ShellExpandArguments(ProcessLaunchInfo &launch_info) {
  ....
  auto data_sp = StructuredData::ParseJSON(output);
  if (!data_sp) {                                        // <=
    error.SetErrorString("invalid JSON");
    return error;
  }

  auto dict_sp = data_sp->GetAsDictionary();
  if (!data_sp) {                                        // <=
    error.SetErrorString("invalid JSON");
    return error;
  }
  ....
}

PVS-Studio warning: V547 [CWE-570] Expression '!data_sp' is always false. Host.cpp 248

Fragment N41

Data flow analysis was used to detect two previous errors. For the next error, the analyzer used a more interesting technology: symbolic execution. The point is, we don't necessarily need to know the possible values of variables. To determine that the condition is true/false, we can solve the equation.

void Sema::adjustMemberFunctionCC(QualType &T, bool IsStatic, bool IsCtorOrDtor,
                                  SourceLocation Loc) {
  ....
  CallingConv CurCC = FT->getCallConv();
  CallingConv ToCC = Context.getDefaultCallingConvention(IsVariadic, !IsStatic);

  if (CurCC == ToCC)
    return;
  ....
  CallingConv DefaultCC = 
    Context.getDefaultCallingConvention(IsVariadic, IsStatic);

  if (CurCC != DefaultCC || DefaultCC == ToCC)
    return;
  ....
}

PVS-Studio warning: V560 [CWE-570] A part of conditional expression is always false: DefaultCC == ToCC. SemaType.cpp 7856

Let's figure out why the analyzer decided that the right part of the condition is always false. For convenience, let's remove all unnecessary things and replace the names:

A = ....;
B = ....;
if (A == B) return;
C = ....;
if (A != C || C == B)

Let's inspect how the code works:

  • there are 3 variables (A, B, C), and we don't know their values;
  • if we know that A == B, we exit the function;
  • if the function continues to execute, then A != B;
  • if A != C, then according to short-circuit evaluation, the right subexpression is not evaluated.
  • if the right subexpression C == B is evaluated, then A == C;
  • if A != B and A == C, then C cannot be equal to B in any way.

Perhaps, there's a typo in the condition, and the following expression should be written:

if (CurCC != DefaultCC || DefaultCC != ToCC)

Fragment N42

I think the previous fragment required a lot of attention, so the last one is a simple bug.

ObjectFileMachO::MachOCorefileAllImageInfos
ObjectFileMachO::GetCorefileAllImageInfos() {
  ....
  lldb::offset_t offset = MachHeaderSizeFromMagic(m_header.magic);
  ....
  uint32_t imgcount = m_data.GetU32(&offset);
  uint64_t entries_fileoff = m_data.GetU64(&offset);
  offset += 4; // uint32_t entries_size;
  offset += 4; // uint32_t unused;
  offset = entries_fileoff;                  // <=
  ....
}

PVS-Studio warning: V519 [CWE-563, CERT-MSC13-C] The 'offset' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 6888, 6890. ObjectFileMachO.cpp 6890

It's strange to calculate something in the offset variable and then write a completely different value. Perhaps it should be written like this:

offset += 4; // uint32_t entries_size;
offset += 4; // uint32_t unused;
offset += entries_fileoff;

Download and try PVS-Studio

If you actively use compiler warnings, this is very good. The next step is to implement a static analyzer — PVS-Studio, for example. It will help you detect even more errors at the code writing stage. Try PVS-Studio.

You can also subscribe to our newsletter so as not to miss new interesting articles and keep up with new PVS-Studio features.



Comments (0)

Next comments next comments
close comment form