Fellow developers, we invite you to continue our exciting journey through the depths of Intel OpenVINO code! Equipped with a static analyzer, just like detectives, we'll uncover the most insidious and interesting bugs, and reveal the secrets of hidden errors and tricky typos left outside the first part of the article.
So, in the first part we've looked at many enlightening typos in the code of the OpenVINO project. They are enlightening because typos will always happen in a developer's life—you can't escape them—but the issue can be resolved. Using the static analysis technology that alerts us to such errors, we can create certain statistics about where a programmer makes these mistakes more often. Once these patterns are identified, fighting typos becomes much more effective. And if you're wondering where the most common typos are, we have an interesting article worth reading.
However, imagine if, in the blink of an eye we could fully focus on code without getting distracted. Then our code would be perfect, and typos would disappear completely. Let's keep on dreaming... If we only could foresee all possible scenarios of code usage, consider all interconnections between its components, and anticipate every error... The software we developed would've worked perfectly. However, I know only one "programming language" where all this is possible, and that's HTML.
We're still talking about C++, though. When dealing with it, even the most experienced programmer can make a mistake, hmm... well, really, anywhere and everywhere. So, now that we've divided all the warnings into "before" and "after" and dealt with typos, let's look at other, no less interesting and diverse errors.
The article purpose isn't to devalue the work of programmers involved in the development of this product. Its purpose is to popularize static code analyzers that are useful even for high-quality and established projects. Moreover, we offer free licenses for open-source projects (and not just for those). You can learn more here.
In fact, the analyzer didn't find that many errors. All the interesting and important errors fit into just a couple of articles. The code is quite beautiful and secure. The commit I was using to build and test the project is still the same: 2d8ac08.
Well, now we're done with the formalities, so let's get back to looking at the errors. As expected, the bug complexity increases, and the intricacy of the code we have to figure out increases as well. Some parts of the code are quite strange. It's not clear how or why it's written that way, but I suggest you start with some simple things, and then... Well, you'll see soon enough.
Fragment N1
template <typename T>
void ROIAlignForward(....)
{
int64_t roi_cols = 4; // <=
int64_t n_rois = nthreads / channels / pooled_width / pooled_height;
// (n, c, ph, pw) is an element in the pooled output
for (int64_t n = 0; n < n_rois; ++n)
{
....
if (roi_cols == 5) // <=
{
roi_batch_ind = static_cast<int64_t>(offset_bottom_rois[0]);
offset_bottom_rois++;
}
....
}
....
}
The analyzer warning: V547 Expression 'roi_cols == 5' is always false. experimental_detectron_roi_feature_extractor.cpp 211
Checking if (roi_cols == 5) really always returns false, and the code in the body is unreachable. This is because the value of the roi_cols variable doesn't change in any way between when it's declared and when it's checked in the condition.
Fragment N2
bool is_valid_model(std::istream& model)
{
// the model usually starts with a 0x08 byte
// indicating the ir_version value
// so this checker expects at least 3 valid ONNX keys
// to be found in the validated model
const size_t EXPECTED_FIELDS_FOUND = 3u;
std::unordered_set<::onnx::Field, std::hash<int>> onnx_fields_found = {};
try
{
while (!model.eof() && onnx_fields_found.size() < // <=
EXPECTED_FIELDS_FOUND )
{
const auto field = ::onnx::decode_next_field(model);
if (onnx_fields_found.count(field.first) > 0)
{
// if the same field is found twice, this is not a valid ONNX model
return false;
}
else
{
onnx_fields_found.insert(field.first);
::onnx::skip_payload(model, field.second);
}
}
return onnx_fields_found.size() == EXPECTED_FIELDS_FOUND;
}
catch (...)
{
return false;
}
}
The analyzer warning: V663 Infinite loop is possible. The 'cin.eof()' condition is insufficient to break from the loop. Consider adding the 'cin.fail()' function call to the conditional expression. onnx_model_validator.cpp 168
The analyzer detected a rather interesting and rare bug that can cause a loop to become infinite.
When working with objects of the std::istream class, the !model.eof() check may not be enough to exit a while loop. If a failure occurs while reading data, all subsequent calls to the eof function return only false, which may result in an infinite loop.
To avoid this issue, we can use the overloaded operator bool in the loop condition as follows:
while (model && onnx_fields_found.size() < EXPECTED_FIELDS_FOUND)
{
....
}
Fragment N3
NamedOutputs pad3d(const NodeContext& node)
{
....
// padding of type int feature only supported by paddle 'develop'
// version(>=2.1.0)
if (node.has_attribute("paddings")) // <=
{
auto paddings_vector = node.get_attribute<
std::vector<int32_t>
>("paddings");
PADDLE_OP_CHECK(node, paddings_vector.size() == 6,
"paddings Params size should be 6 in pad3d!");
paddings = paddings_vector;
}
else if (node.has_attribute("paddings")) // <=
{
auto padding_int = node.get_attribute<int32_t>("paddings");
for (int i = 0; i < 6; i++)
paddings[i] = padding_int;
}
else
{
PADDLE_OP_CHECK(node, false, "Unsupported paddings attribute!");
}
....
}
The analyzer warning: V517 [CERT-MSC01-C] The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 22, 26. pad3d.cpp 22
The first and second conditions of the if constructs are the same, so the code in the then-branch of the second if is always unreachable. You may notice that the branches have different logic: the first one tries to read the paddings attribute as a vector of the int32_t (paddings_vector) number type, while the second one tries to read the same attribute as a number of int32_t type (padding_int).
It's hard to define what exactly the correct code should look like in this case. However, let's take a guess. The code is in the OpenVINO Paddle Frontend module, which parses the model generated by the PaddlePaddle framework. If we search for the 'pad3d' name in the project, we can find the following description:
Parameters:
padding (Tensor|list[int]|tuple[int]|int): The padding size with data type
``'int'``. If is ``'int'``, use the same padding in all dimensions.
Else [len(padding)/2] dimensions of input will be padded.
The pad has the form (pad_left, pad_right, pad_top,
pad_bottom, pad_front, pad_back).
This suggests that padding is an option, and we need to visit two interesting alternatives: std::vector<int32_t> and int32_t. We can do this in the following way:
auto paddings = std::vector<int32_t>(6, 0);
if (node.has_attribute("paddings"))
{
auto paddings_attr = node.get_attribute_as_any("paddings");
if (paddings_attr.is<std::vector<int32_t>>())
{
auto paddings_vector = paddings_attr.as<std::vector<int32_t>>();
PADDLE_OP_CHECK(node, paddings_vector.size() == 6,
"paddings Params size should be 6 in pad3d!");
paddings = std::move(paddings_vector);
}
else if (paddings_attr.is<int32_t>())
{
auto padding_int = paddings_attr.as<int32_t>();
if (padding_int != 0)
{
std::fill(paddings.begin(), paddings.end(), padding_int);
}
}
}
Also, while looking at the PaddlePaddle sources, I got an idea that there's a typo in the attribute. So, it should be called padding, not paddings. But I'm not completely sure :) In any case, I recommend developers to pay attention to this code.
Fragment N4
template <typename T>
std::basic_string<T> get_model_extension() {}
The analyzer warning: V591 [CERT-MSC52-CPP] Non-void function should return a value. graph_iterator_flatbuffer.hpp 29
As we can see, the function has an empty body and returns nothing, even though the return type is specified as std::basic_string<T>. Well, hello, undefined behavior.
This code fragment looks like it came straight from The X-Files series, but it's actually quite simple. If we jump a few lines down, we can see the specializations of this function template:
template <>
std::basic_string<char> get_model_extension<char>();
#if defined(OPENVINO_ENABLE_UNICODE_PATH_SUPPORT) \
&& defined(_WIN32)
template <>
std::basic_string<wchar_t> get_model_extension<wchar_t>();
#endif
And these specializations have specific bodies that return specific objects:
template <>
std::basic_string<char>
ov::frontend::tensorflow_lite::get_model_extension<char>()
{
return ::tflite::ModelExtension();
}
#if defined(OPENVINO_ENABLE_UNICODE_PATH_SUPPORT) \
&& defined(_WIN32)
template <>
std::basic_string<wchar_t> ov::frontend::
tensorflow_lite::get_model_extension<wchar_t>()
{
return util::string_to_wstring(::tflite::ModelExtension());
}
#endif
So, the program works properly with specializations of char and wchar_t but does whatever it wants with others. And there are three more: char8_t, char16_t, char32_t.
Yes, the code may not use these template specializations, but we're all for safe and secure code. And we'd like the compiler to stop us at the compilation stage when we're dealing with such code. It's very easy to do, we just need to turn the function template definition into a declaration:
template <typename T>
std::basic_string<T> get_model_extension();
Now, when we try to call a specialization of char8_t, char16_t, or char32_t, the compiler throws an error because it can't do the required instantiation without a body.
Here are a few more such warnings:
Fragment N5
template <typename TReg>
int getFree(int requestedIdx)
{
if (std::is_base_of<Xbyak::Mmx, TReg>::value)
{
auto idx = simdSet.getUnused(requestedIdx);
simdSet.setAsUsed(idx);
return idx;
}
else if ( std::is_same<TReg, Xbyak::Reg8>::value
|| std::is_same<TReg, Xbyak::Reg16>::value
|| std::is_same<TReg, Xbyak::Reg32>::value
|| std::is_same<TReg, Xbyak::Reg64>::value)
{
auto idx = generalSet.getUnused(requestedIdx);
generalSet.setAsUsed(idx);
return idx;
}
else if (std::is_same<TReg, Xbyak::Opmask>::value)
{
return getFreeOpmask(requestedIdx);
}
}
The analyzer warning: V591 [CERT-MSC52-CPP] Non-void function should return a value. registers_pool.hpp 229
Here's another snippet with a function that doesn't return a value in all of the execution paths. If you think that nothing will happen to your program if you forget return in a function, you're wrong. Even if the return type is built-in, the code contains undefined behavior. I also recommend reading this article about the cruel joke the compiler can play on you in this situation.
Let's go back to our example. Starting with C++17, we can enhance this code. Firstly, as you can see, conditions are compile-time expressions. So, let's use the if constexpr construct. It discards compiling the code in the branches where the condition is false. Secondly, we can use static_assert to protect the code from specializations for which no value is returned in the current code:
template <typename TReg>
int getFree(int requestedIdx)
{
if constexpr (std::is_base_of<Xbyak::Mmx, TReg>::value) { .... }
....
else
{
// until C++23
static_assert(sizeof(TReg *) == 0, "Your message.");
// since C++23
static_assert(false, "Your message.");
}
}
The fixed example provides two ways to write static_assert, depending on the version of the standard you are using. This is because prior to C++23, the static_assert(false, "...") option in a function template always resulted in a compile-time error.
If you work with versions prior to C++17, you can fix the code by adding overloads to the getFree function template and using std::enable_if:
template <typename TReg,
std::enable_if_t< std::is_base_of<Xbyak::Mmx, TReg>::value,
int > = 0>
int getFree(int requestedIdx)
{
auto idx = simdSet.getUnused(requestedIdx);
simdSet.setAsUsed(idx);
return idx;
}
template <typename TReg,
std::enable_if_t< std::is_same<TReg, Xbyak::Reg8>::value
|| std::is_same<TReg, Xbyak::Reg16>::value
|| std::is_same<TReg, Xbyak::Reg32>::value
|| std::is_same<TReg, Xbyak::Reg64>::value,
int > = 0>
int getFree(int requestedIdx)
{
auto idx = generalSet.getUnused(requestedIdx);
generalSet.setAsUsed(idx);
return idx;
}
template <typename TReg,
std::enable_if_t< std::is_same<TReg, Xbyak::Opmask>::value,
int > = 0>
int getFree(int requestedIdx)
{
return getFreeOpmask(requestedIdx);
}
Fragment N6
template <>
void RandomUniform<x64::avx512_core>::initVectors()
{
....
if (m_jcp.out_data_type.size() <= 4)
{
static const uint64_t n_inc_arr[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
mov(r64_aux, reinterpret_cast<uintptr_t>(n_inc_arr));
}
else
{
static const uint64_t n_inc_arr[8] =
{ 0, 1, 2, 3, 4, 5, 6, 7 }; // TODO: i64
mov(r64_aux, reinterpret_cast<uintptr_t>(n_inc_arr));
}
....
}
The analyzer warning: V523 The 'then' statement is equivalent to the 'else' statement. random_uniform.cpp 120
This snippet is simple but unclear. The same code is contained in the body of both the if and else branches. Perhaps this is a copy-paste error, and developers need to pay attention to this code fragment.
Fragment N7
inline ParseResult parse_xml(const char* file_path)
{
....
try
{
auto xml = std::unique_ptr<pugi::
xml_document>{new pugi::xml_document{}};
const auto error_msg = [&]() -> std::string {....}();
....
return {std::move(xml), error_msg};
}
catch (std::exception& e)
{
return {std::move(nullptr), // <=
std::string("Error loading XML file: ") + e.what()};
}
}
The analyzer warning: V575 [CERT-EXP37-C] The null pointer is passed into 'move' function. Inspect the first argument. xml_parse_utils.hpp 249
To be honest, we've never seen such an interesting code fragment before: nullptr passed to the std::move function. It's not prohibited to pass nullptr to std::move, the function just converts it to std::nullptr_t &&. However, it's unclear why this was done.
To better understand what's going on, let's take a look at ParseResult:
struct ParseResult
{
ParseResult(std::unique_ptr<pugi::xml_document>&& xml, std::string error_msg)
: xml(std::move(xml)),
error_msg(std::move(error_msg)) {}
std::unique_ptr<pugi::xml_document> xml;
std::string error_msg{};
};
Let's play detectives and solve the mystery of where such code could've come from. Most likely, a programmer wrote return in the try-block first: the ParseResult object is constructed there by moving the xml smart pointer and copying error_msg. Then they handled the situation when an exception would be thrown. In this case, an object of the ParseResult type should also be returned. To make their life easier, they copied the previous return and slightly modified the constructor arguments. When they saw the xml smart pointer moving, they decided that they needed to move something here as well. The nullptr pointer, for example.
However, there's no need for that. According to C++ rules, when you choose an overload of a constructor, the compiler should perform some implicit conversions on the passed arguments. For example, the rvalue reference to std::unique_ptr<pugi::xml_document> can't be bound to the rvalue reference to std::nullptr_t. So, the compiler creates a temporary object of the std::unique_ptr<pugi::xml_document> type by calling the appropriate converting constructor. Only then a reference (constructor parameter) is bound to this temporary object.
If we omit std::move and pass nullptr to the constructor, the code also compiles and becomes more readable:
return { nullptr, std::string("Error loading XML file: ") +
e.what() };
And here we seamlessly move on to working with pointers. And the analyzer helps understand the cases where things went wrong, and how one shouldn't code in general.
Fragment N8
void GraphOptimizer::FuseFCAndWeightsDecompression(Graph &graph)
{
....
// Fusion processing
auto *multiplyInputNode = dynamic_cast<node::
Input *>(multiplyConstNode.get());
if (!multiplyInputNode)
{
OPENVINO_THROW("Cannot cast ", multiplyInputNode->getName(), // <=
" to Input node.");
}
fcNode->fuseDecompressionMultiply(multiplyInputNode->getMemoryPtr());
if (withSubtract)
{
auto *subtractInputNode = dynamic_cast<node::
Input *>(subtractConstNode.get());
if (!subtractInputNode)
{
OPENVINO_THROW("Cannot cast ", subtractInputNode->getName(), // <=
" to Input node.");
}
fcNode->fuseDecompressionSubtract(subtractInputNode->getMemoryPtr());
}
if (withPowerStatic)
{
auto *eltwiseNode = dynamic_cast<node::
Eltwise *>(powerStaticNode.get());
if (!eltwiseNode)
{
OPENVINO_THROW("Cannot cast ", eltwiseNode->getName(), // <=
" to Eltwise node.");
}
}
....
}
The analyzer warnings:
I often see such errors in various projects, so I decided to pay attention to them. The thing is that programmers rarely test error handlers :)
Let's imagine for a moment that the dynamic_cast result returns a null pointer. Then, when an exception is thrown, the same null pointer is used in an attempt to call the getName function. We get undefined behavior that can cause the exception handling to turn into a critical error for the program.
Fragment N9
void Defer(Task task)
{
auto &stream = *(_streams.local()); // <=
stream._taskQueue.push(std::move(task));
if (!stream._execute)
{
stream._execute = true;
try
{
while (!stream._taskQueue.empty())
{
Execute(stream._taskQueue.front(), stream);
stream._taskQueue.pop();
}
}
catch (...)
{
}
stream._execute = false;
}
}
The analyzer warning: V758 The 'stream' reference becomes invalid when smart pointer returned by a function is destroyed. cpu_streams_executor.cpp 410
In fact, this example is fine and the reference isn't "dangling". This happens because the shared_ptr that the local function returns is pre-stored in the _stream_map container, which has a longer lifetime than the reference:
class CustomThreadLocal : public ThreadLocal<std::shared_ptr<Stream>>
{
....
public:
std::shared_ptr<Stream> local()
{
....
if (stream == nullptr)
{
stream = std::make_shared<Impl::Stream>(_impl);
}
auto tracker_ptr = std::make_shared<
CustomThreadLocal::ThreadTracker
>(id);
t_stream_count_map[(void*)this] = tracker_ptr;
auto new_tracker_ptr = tracker_ptr->fetch();
_stream_map[new_tracker_ptr] = stream; // <=
return stream;
}
private:
....
std::map<std::shared_ptr<CustomThreadLocal::ThreadTracker>,
std::shared_ptr<Impl::Stream>> _stream_map;
....
};
However, in the analyzer's defense, it's better to store the result of the local function in a variable in the Defer function. Let's imagine that the code has changed a bit. The following things begin to happen:
By storing the result in the variable, we extend the lifetime of the object before it leaves scope. So, we eliminate a potential issue:
void Defer(Task task)
{
auto local = _streams.local();
auto &stream = *local;
....
}
Fragment N10
~DIR()
{
if (!next)
delete next;
next = nullptr;
FindClose(hFind);
}
The analyzer warning: V575 [CERT-EXP37-C] The null pointer is passed into 'operator delete'. Inspect the argument. w_dirent.h 94
When you look at this code, you may not be able to see at first what the error is. The thing is, the memory at the next pointer isn't freed because the check was written incorrectly. The analyzer indirectly informs about it when it sees a null pointer being passed to the operator delete. Also, there's no point in checking the pointer at all because the operator delete correctly handles null pointers.
Here's a code example that is almost exactly the same as the one my colleague gave in the article: "Simple, yet easy-to-miss errors in code". I recommend you to take a look at it. If you've read the article before and thought that the code given there was synthetic and couldn't happen in real life, here's a direct proof for you :)
The fixed code:
~DIR()
{
delete next;
next = nullptr;
FindClose(hFind);
}
Fragment N11
void ov_available_devices_free(ov_available_devices_t* devices)
{
if (!devices)
{
return;
}
for (size_t i = 0; i < devices->size; i++)
{
if (devices->devices[i])
{
delete[] devices->devices[i];
}
}
if (devices->devices)
delete[] devices->devices;
devices->devices = nullptr;
devices->size = 0;
}
The analyzer warning: V595 The 'devices->devices' pointer was utilized before it was verified against nullptr. Check lines: 271, 274. ov_core.cpp 271
First of all, let's see what the ov_available_devices_t type is:
typedef struct {
char** devices; //!< devices' name
size_t size; //!< devices' number
} ov_available_devices_t;
As you can see from the name, the function frees an object of the ov_available_devices_t type passed to it. Except the developer made one mistake in doing so.
First, each pointer in the devices->devices array is freed in the loop. It's implied that the pointer to the array is always non-null. The developer then doubted this, and after the loop, decided to test it before passing to the operator delete[]. Except that he forgot about this assumption in the loop. As a result, we have dereferencing of a potentially null pointer.
Here's the fixed code:
void ov_available_devices_free(ov_available_devices_t* devices)
{
if (!devices || !devices->devices)
{
return;
}
for (size_t i = 0; i < devices->size; i++)
{
delete[] devices->devices[i];
}
delete[] devices->devices;
devices->devices = nullptr;
devices->size = 0;
}
As you may notice, I've deleted all pointer checks before passing them to the operator delete[], since it knows how to handle them.
The analyzer also detected several other similar fragments:
Fragment N12
char* str_to_char_array(const std::string& str)
{
std::unique_ptr<char> _char_array(new char[str.length() + 1]); // <=
char* char_array = _char_array.release(); // <=
std::copy_n(str.c_str(), str.length() + 1, char_array);
return char_array;
}
As promised in the beginning—as the cherry on top—here's the code that may cause a reviewer to ask, "What are you?".
We have a converter function that copies std::string to the dynamically allocated char* buffer. The buffer is created using the operator new[], whose ownership is passed to the smart pointer. Overall, it's a pretty good strategy to wrap a raw pointer, because if something goes wrong, a smart pointer takes care of everything.
Note, however, that the std::unique_ptr<char> specialization of the smart pointer is used. Its destructor frees the passed resource using the operator delete. In fact, that's what the analyzer warns us about:
V554 Incorrect use of unique_ptr. The memory allocated with 'new []' will be cleaned using 'delete'. ov_core.cpp 14
The right thing to do in such cases is to use the std::unique_ptr<char[]> specialization:
char* str_to_char_array(const std::string& str)
{
std::unique_ptr<char[]> _char_array(new char[str.length() + 1]);
....
}
P.S. The reader may object that there's nothing wrong here, since the next line gives the resource ownership to the returned object. Indeed, it does. However, it's still a "code smell", and the issue is that other developers may want to use this smart pointer declaration elsewhere.
Unfortunately, the multiplication process had already begun:
If an exception-throwing operation occurs between the resource acquisition and its passing to the raw pointer ownership, we get undefined behavior.
P.P.S. If the project uses the C++14 standard and later, you can replace the code with the following:
char* str_to_char_array(const std::string& str)
{
auto _char_array = std::make_unique<char[]>(str.length() + 1);
char* char_array = _char_array.release();
....
}
Firstly, we eliminate the explicit use of the operator new[] in the code, shifting everything to std::make_unique. Secondly, auto infers the correct specialization of the smart pointer based on the initializer.
However, such code, in addition to dynamic allocation, also fills the array with zeros. Since the array is completely overwritten, we can save resources by not initializing it. Since C++20, the std::make_unique_for_overwrite function is available for this purpose:
char* str_to_char_array(const std::string& str)
{
auto _char_array = std::make_unique_for_overwrite<char[]>(
str.length() + 1
);
char* char_array = _char_array.release();
....
}
Some of the examples from the OpenVINO project really surprised me. I think they surprised you just as much. However, as I've written before, all the important errors that the project devs should've been paying attention to are packed into a couple of articles. Most of them are typos, which is a common issue among programmers that can occur in almost any project. OpenVINO is an impressive project, and having so few (in my opinion) serious errors only means that the code is quite well written.
I hope you were as interested as I was to look at these code fragments and review the errors they contain.
Of course, we've sent all the information to the developers and hopefully they'll fix the bugs in the near future.
And as a tradition, I recommend you to try our PVS-Studio analyzer. We offer a free license for open-source projects.
Take care and have a good day!
0