Webinar: Evaluation - 05.12
The PVS-Studio headquarters: "Time flies by so fast... This year, on January 2, Blender turned 30 years old! It feels like just yesterday we published an article on checking the project... What do you mean "it's been 8 years"? Let's fix this now!".
Blender is undoubtedly an iconic project that has gained widespread interest and recognition over the years. The tool has been available as an open-source solution since 2000, allowing any artist who is at least a little familiar with it to transform their wildest fantasies into the X-Y-Z-axis reality.
It has a very nice and user-friendly interface that you'd like to work with. The Eevee and Cycles render engines can make anything look pretty. Plus, you can google and install lots of different things if you want even more variety. In fact, you can find many different add-ons for Blender that can satisfy all your needs.
However! There's also a fly in the ointment. Anyone, including myself, who uses Blender to take super expensive courses knows that the tool can be capricious, especially if you don't know what you're doing.
Yeah, a long time ago I had Blender crash a couple times, thank goodness there is an autosave. That's when I realized I had to learn how to use the tool to avoid surprises. The good thing is that YouTube has lots of free courses and videos on how to make this or that, er... thing. As I've said, the project is iconic.
Well, what was I talking about? That's right, bugs! It's not like things crash all the time, no! The program is quite stable: you can work 12–14 hours a day and everything is fine! But every now and then your jagged animation hints that something is wrong. I think anyone who has ever used or uses Blender agrees with me. Especially if you didn't keep track of the Transform parameters (Scale in particular) — based!
Here's the commit I cloned the project on: 76092ac.
This article isn't intended to devalue the work done by the programmers involved in developing the product. I really like it! The goal is to promote the static analyzer, find errors, and send the bug report to the developers. This may help improve the quality of the project code.
Before we look at the code and warnings, I'd like to note that when analyzing one or another warning issued for this project, I wasn't always sure whether it was a bug or a feature. The code is often very confusing, and it takes time to understand it. However, I've compiled a list of errors and analyzed the warnings. It's safe to say that you won't regret the spent time, because it'll be interesting.
So, I present you the most interesting code fragments of Blender, where the PVS-Studio static analyzer detects suspicious snippets.
Buckle up and let's get started!
Inspired by the idea of transparent code, I looked into some erroneous code fragments in Blender and discovered the concept of transparent bugs: "The idea behind transparent bugs is simple: a system is transparent when you can look at its code and understand where the bugs are, what they do, and how they do it".
Fragment N1
void Cryptomatte::begin_sync()
{
const eViewLayerEEVEEPassType enabled_passes =
static_cast<eViewLayerEEVEEPassType>(
inst.film.enabled_passes_get() &
( EEVEE_RENDER_PASS_CRYPTOMATTE_OBJECT |
EEVEE_RENDER_PASS_CRYPTOMATTE_ASSET |
EEVEE_RENDER_PASS_CRYPTOMATTE_ASSET ) );
....
}
The analyzer warning: V501 There are identical sub-expressions 'EEVEE_RENDER_PASS_CRYPTOMATTE_ASSET' to the left and to the right of the '|' operator. eevee_cryptomatte.cc 18
This is an obvious bug: the same EEVEE_RENDER_PASS_CRYPTOMATTE_ASSET constant is used twice when getting a bitmask. Most likely, another constant should be used — EEVEE_RENDER_PASS_CRYPTOMATTE_MATERIAL = (1 << 18).
Fragment N2
void MeshFromGeometry::create_edges(Mesh *mesh)
{
....
for (int i = 0; i < tot_edges; ++i)
{
....
dst_edge[0] = mesh_geometry_.global_to_local_vertices_
.lookup_default(src_edge[0], 0);
dst_edge[1] = mesh_geometry_.global_to_local_vertices_
.lookup_default(src_edge[1], 0);
BLI_assert( dst_edge[0] < total_verts
&& dst_edge[0] < total_verts);
}
....
}
The analyzer warning: V501 There are identical sub-expressions 'dst_edge[0] < total_verts' to the left and to the right of the '&&' operator. obj_import_mesh.cc 266
The left and right subexpressions of the logical "AND" in BLI_assert are exactly the same. Most likely, the second subexpression should check dst_edge[1] instead of dst_edge[0].
Fragment N3
static void get_nearest_fcurve_verts_list (bAnimContext *ac,
const int mval[2],
ListBase *matches)
{
....
filter = (ANIMFILTER_DATA_VISIBLE |
ANIMFILTER_CURVE_VISIBLE |
ANIMFILTER_FCURVESONLY |
ANIMFILTER_NODUPLIS |
ANIMFILTER_FCURVESONLY);
....
}
The analyzer warning: V501 There are identical sub-expressions 'ANIMFILTER_FCURVESONLY' to the left and to the right of the '|' operator. graph_select.cc 178
Using a table for code formatting is a form of art. Everything is obvious in this example!
The concept of bug transparency works.
No need to explain anything.
It's wonderful!
By the way, there aren't many such fragments in the project code. I can only point out that code formatting often solves the problem, and developers can immediately fix transparent bugs.
"Here's an interesting example of null pointer dereferencing when the V522 diagnostic rule is triggered: the dereferencing is only a consequence of an error and not its cause".
Null pointer dereferencing is a fairly common issue. Especially due to logical errors in the program code conditions. How do these bugs work? How can we find them? The analyzer and I have found answers to these questions, as well as some interesting examples for this article.
Fragment N4
static bool gpencil_stroke_eraser_is_occluded (tGPsdata *p, bGPDlayer *gpl,
bGPDspoint *pt, const int x,
const int y)
{
Object *obact = (Object *)p->ownerPtr.data;
Brush *brush = p->brush;
Brush *eraser = p->eraser;
BrushGpencilSettings *gp_settings = nullptr;
if (brush->gpencil_tool == GPAINT_TOOL_ERASE)
{
gp_settings = brush->gpencil_settings;
}
else
if ((eraser != nullptr) & // <=
(eraser->gpencil_tool == GPAINT_TOOL_ERASE))
{
gp_settings = eraser->gpencil_settings;
}
if ((gp_settings != nullptr) &&
(gp_settings->flag & GP_BRUSH_OCCLUDE_ERASER) )
{
RegionView3D *rv3d = static_cast<RegionView3D *>(p->region->regiondata);
....
}
....
return false;
}
The analyzer warning: V522 Dereferencing of the null pointer 'eraser' might take place. Check the bitwise operation. gpencil_paint.cc 1429
As we can see, everything is fine with pointer checking and dereferencing. Unless, of course, we ignore the & symbol that proudly separates the two subexpressions in parentheses. Of course, the developers should have used && (the double ampersand) here, as in the condition below. However, the code is exactly what it is: the bitwise "AND" operator also executes the right subexpression that dereferences a potential null pointer.
By the way, there's another similar fragment in the same file:
Fragment N5
void ui_draw_popover_back (ARegion *region, uiStyle * /*style*/,
uiBlock *block, rcti *rect )
{
....
if (block)
{
float mval_origin[2] = {float(block->bounds_offset[0]),
float(block->bounds_offset[1])};
ui_window_to_block_fl (region, block, &mval_origin[0], &mval_origin[1]);
ui_draw_popover_back_impl (wt->wcol_theme, rect, block->direction,
U.widget_unit / block->aspect, mval_origin);
}
else
{
const float zoom = 1.0f / block->aspect; // <=
wt->state (wt, &STATE_INFO_NULL, UI_EMBOSS_UNDEFINED);
wt->draw_block (&wt->wcol, rect, 0, 0, zoom);
}
....
}
The analyzer warning: V522 Dereferencing of the null pointer 'block' might take place. interface_widgets.cc 5294
This piece of code has a logic issue. If block == nullptr, the else branch is executed, where dereferencing of the block null pointer is guaranteed.
The analyzer has issued warnings for similar code snippets:
"Insanity is doing the exact same thing over and over again expecting things to change... That. Is. Crazy," — Vaas, Far Cry 3.
It's quite symbolic: we have a bug, we try to call a function, but it doesn't get called, and the code is correct. So, what's wrong? We've been looking at the same code for half an hour and can't figure out what's wrong. Miraculous, indeed! Then, we suddenly find an issue where we never thought we'd find it. Errors in condition logic. Because the logic is often overcomplicated, and we don't realize we've done something wrong.
Fragment N6
void BLI_threadpool_init(ListBase *threadbase,
void *(*do_thread)(void *),
int tot)
{
....
if (threadbase != nullptr && tot > 0)
{
....
if (tot > RE_MAX_THREAD)
{
tot = RE_MAX_THREAD;
}
else if (tot < 1) // <=
{
tot = 1;
}
....
}
....
}
The analyzer warning: V547 Expression 'tot < 1' is always false. threads.cc 131
It's simple: if we look at the first condition, we find a check — tot > 0. In this case, tot is of the int type, and if it's greater than 0, it can't be less than 1. As a result, the tot < 1 condition is always false. So, we can simplify the code by removing this check.
Fragment N7
enum
{
ALIGN_WORLD = 0,
ALIGN_VIEW,
ALIGN_CURSOR,
};
bool ED_object_add_generic_get_opts(bool *r_is_view_aligned, ....)
{
....
if (RNA_struct_property_is_set(op->ptr, "rotation"))
{
....
}
else
{
int alignment = ALIGN_WORLD;
PropertyRNA *prop = RNA_struct_find_property(op->ptr, "align");
if (RNA_property_is_set(op->ptr, prop))
{
*r_is_view_aligned = alignment == ALIGN_VIEW;
alignment = RNA_property_enum_get(op->ptr, prop);
}
}
....
}
I'll leave it up to you to figure out what's wrong here.
The analyzer warning: V547 Expression 'alignment == ALIGN_VIEW' is always false. object_add.cc 544
So, what was I talking about? The error may have been caused by a typo, and the murderer is a butler. The r_is_view_aligned variable is of the bool * type, and the check is indeed always false. Also, the expression to the right of the comparison operator == should most likely be different. Or is it intended that way? This is a strange situation, but we've found the issue. So, it shouldn't take long to fix it. Little fish are sweet!
Additional warnings:
Fragment N8
void BKE_gpencil_stroke_copy_settings(const bGPDstroke *gps_src,
bGPDstroke *gps_dst)
{
gps_dst->thickness = gps_src->thickness;
gps_dst->flag = gps_src->flag;
gps_dst->inittime = gps_src->inittime;
gps_dst->mat_nr = gps_src->mat_nr;
copy_v2_v2_short(gps_dst->caps, gps_src->caps);
gps_dst->hardness = gps_src->hardness;
copy_v2_v2(gps_dst->aspect_ratio, gps_src->aspect_ratio);
gps_dst->fill_opacity_fac = gps_dst->fill_opacity_fac; // <=
copy_v3_v3(gps_dst->boundbox_min, gps_src->boundbox_min);
copy_v3_v3(gps_dst->boundbox_max, gps_src->boundbox_max);
gps_dst->uv_rotation = gps_src->uv_rotation;
copy_v2_v2(gps_dst->uv_translation, gps_src->uv_translation);
gps_dst->uv_scale = gps_src->uv_scale;
gps_dst->select_index = gps_src->select_index;
copy_v4_v4(gps_dst->vert_color_fill, gps_src->vert_color_fill);
}
The analyzer warning: V570 The 'gps_dst->fill_opacity_fac' variable is assigned to itself. gpencil_legacy.cc 1029
The variable gps_dst->fill_opacity_fac is assigned to itself. Maybe this is how it should be? Nope. Most likely, there is a typo and another variable should be at the right of the assign expression. Something like this: gps_src->fill_opacity_fac. Is it a copy-paste error?
Fragment N9
static int gizmo_cage2d_modal(....)
{
....
if ((transform_flag & ED_GIZMO_CAGE_XFORM_FLAG_SCALE_UNIFORM) == 0)
{
const bool use_temp_uniform = (event->modifier & KM_SHIFT) != 0;
const bool changed = data->use_temp_uniform != use_temp_uniform;
data->use_temp_uniform = data->use_temp_uniform;
....
}
....
}
The analyzer warning: V570 The 'data->use_temp_uniform' variable is assigned to itself. cage2d_gizmo.cc 1094
Here we have a copy-paste error yet again. This is probably what it should look like:
data->use_temp_uniform = use_temp_uniform;
Fragment N10
static void space_text_update_drawcache(SpaceText *st,
const ARegion *region)
{
....
if (st->wordwrap)
{
....
if (drawcache->update)
{
drawcache->valid_tail = drawcache->valid_head = 0;
....
memmove(new_tail, old_tail, drawcache->valid_tail);
....
}
....
}
....
}
V575 The 'memmove' function processes '0' elements. Inspect the third argument. text_draw.cc 673
It's strange that we try to copy 0 bytes of information from one memory space to another.
Fragment N11
void ui_but_value_set(uiBut *but, double value)
{
....
if (but->editval) {
value = *but->editval = value;
}
else
....
ui_but_update_select_flag(but, &value);
}
The analyzer warning: V570 The same value is assigned twice to the 'value' variable. interface.cc 2676
Do you think it was designed that way, or is it also a typo? I'll leave the matter without any closure. Like in bad horror movies, you know.
"Cthulhu fhtagn, my friends! Indeed, who else but the Sleeping One could be the God of undefined behavior? Be careful though, as you don't want to wake him up too early, do you? The code contains enough chaos as it is!"
Fragment N12
static void rigidbody_update_ob_array(RigidBodyWorld *rbw)
{
if (rbw->group == nullptr)
{
rbw->numbodies = 0;
rbw->objects = static_cast<Object **>(realloc(rbw->objects, 0)); // <=
return;
}
....
}
V575 The 'realloc' function processes '0' elements. Inspect the second argument. rigidbody.cc 1696
Since the C23 standard, if the memory size for a reallocated memory space is zero, the behavior is considered undefined. By the way, such a pattern is considered obsolete starting with C17.
Fragment N13
void BKE_vfont_build_char(....)
{
....
BezTriple *bezt2 = (BezTriple *)MEM_malloc_arrayN(u,
sizeof(BezTriple),
"duplichar_bezt2" );
....
for (int i = nu2->pntsu; i > 0; i--)
{
float *fp = bezt2->vec[0];
fp[0] = (fp[0] + ofsx) * fsize;
fp[1] = (fp[1] + ofsy) * fsize;
fp[3] = (fp[3] + ofsx) * fsize;
fp[4] = (fp[4] + ofsy) * fsize;
fp[6] = (fp[6] + ofsx) * fsize;
fp[7] = (fp[7] + ofsy) * fsize;
bezt2++;
}
....
}
The analyzer warning: V557 Array overrun is possible. The '7' index is pointing beyond array bound. vfont.cc 612
I also present to you the following array:
typedef struct BezTriple
{
float vec[3][3];
....
}
In this example, the analyzer has issued quite a few warnings for each assignment starting with fp[3]. The bezt2->vec[3][3] two-dimensional array is treated as if it were one-dimensional. Some programmers still believe that they're doing everything right, and that there are no issues with accessing sequential elements of a two-dimensional array.
However, the latest and most powerful compiler technologies (we call them optimizations) for C and C++ give the "Programmer of the Year" award in the "Undefined Behavior Ignorance" category to those who think that using two-dimensional arrays as one-dimensional is a good idea.
Some time ago, my colleague, Anton Tretyakov, published an article called "Oh my C! How they wrote code back in the Quake days". It had a similar example that explained exactly why using multidimensional arrays as one-dimensional is wrong. Take your time to read the article — it's worth it. My colleague Phillip Khandeliants — aka one of my mentors and the guy who knows almost everything — reproduced the undefined behavior issue. It's also worth to take a look at.
Well, we've tested and proved everything. Undefined behavior occurs due to various optimizations. My advice to everyone — don't do that. Let's move on.
"The main problem when searching for uninitialized variables is that there's so much code that the train of thought about where the variable and its data are used gets lost in parallel calculations and changes in the code, and if the data is also constantly passed from function to function unchanged, and then somewhere out there maybe something happens to it, and the variable is still initialized... or not".
Anyway, if you lost the train of thought a few times while reading the top paragraph, that's okay. In fact, it's the same as searching for code fragments that use uninitialized variables. Especially if there's a lot of code. It's not a pleasant experience. So, the handiness of the analyzer has been proven in the real life. This is a fairly common case: there's a lot of code, it takes you half a day to find a bug, but you should've found it yesterday.
Fragment N14
static void gpencil_convert_spline(....)
{
....
float init_co[3];
switch (nu->type) {
case CU_POLY:
{
....
}
case CU_BEZIER:
{
....
}
case CU_NURBS:
{
if (nu->pntsv == 1)
{
....
gpencil_add_new_points (gps, coord_array, 1.0f, 1.0f, 0,
gps->totpoints, init_co, false); // <=
....
}
default:
{
break;
}
}
The analyzer warning: V614 Uninitialized buffer 'init_co' used. Consider checking the seventh actual argument of the 'gpencil_add_new_points' function. gpencil_curve_legacy.cc 439
So, here's a quick one: we found init_co, also known as float init_co[3].
If we check the whole function, in which this array variable is declared, up to the point where it's passed to gpencil_add_new_points(), we don't see that it's initialized with any value. Random data. Ok, is it perhaps initialized within this function? Let's take a look:
static void gpencil_add_new_points(....,
const float init_co[3],
const bool last)
{
BLI_assert(totpoints > 0);
....
for (int i = 0; i < totpoints; i++)
{
....
if ((last) && (i > 0) && (i == totpoints - 1))
{
float dist = len_v3v3(init_co, &pt->x);
....
}
....
}
}
Our mini-array is now passed to the len_v3v3 function, and it hasn't been filled with any values yet. It's random again.
We go into len_v3v3 and search for init_co, which is a[3] now:
static __forceinline float len_v3v3(const float a[3], const float b[3])
{
float d[3];
sub_v3_v3v3(d, b, a);
return len_v3(d);
}
We go even deeper into sub_v3_v3v3 for a[3], which is now b[3]:
Static __forceinline void sub_v3_v3v3(float r[3], const float a[3],
const float b[3] )
{
r[0] = a[0] - b[0];
r[1] = a[1] - b[1];
r[2] = a[2] - b[2];
}
The search of initialization with data was unsuccessful and the error is there. Most importantly, we found an uninitialized variable, and now there's a chance that the devs will fix the issue.
Additional warnings:
Fragment N15
typedef struct Point2Struct
{
double coordinates[2];
Point2Struct() { .... }
....
} Point2;
typedef Point2 Vector2;
using BezierCurve = Vector2 *;
static BezierCurve GenerateBezier(Vector2 *d,
int first, int last,
double *uPrime,
Vector2 tHat1, Vector2 tHat2)
{
....
BezierCurve bezCurve; /* RETURN bezier curve control points. */
....
bezCurve = (Vector2 *)malloc(4 * sizeof(Vector2)); // <=
....
if (alpha_l < 1.0e-6 || alpha_r < 1.0e-6)
{
....
bezCurve[0] = d[first]; // <=
bezCurve[3] = d[last]; // <=
....
}
....
}
The analyzer warning: V630 The 'malloc' function is used to allocate memory for an array of objects which are classes containing constructors. FitCurve.cpp 129
The malloc function is used to create an array with four Point2 objects (Vector2 is an alias of Point2). However, the Point2 class contains a non-trivial default constructor that zeroes the coordinates array. As a result, the code contains undefined behavior: at the time of the assignment, the lifetime of these objects hasn't yet begun.
Currently, compilers don't use this feature for clever optimizations. However, things may change in the future :)
How can we fix this? The best option is to use the operator new[]. However, the code is a mix of C and C++ languages. Its logic will surely be broken if we start using the operator new[].
Moreover, using the operators new and delete has long been unfashionable and even dangerous due to such a human trait as inattention. Smart pointers is the obvious answer. It looks like the issue is resolved and we can call it a day. However, the use of smart pointers also requires some code rewriting.
What other options do we have? There's a way to fix everything without rewriting anything:
static BezierCurve GenerateBezier(Vector2 *d,
int first, int last,
double *uPrime,
Vector2 tHat1, Vector2 tHat2)
{
....
BezierCurve bezCurve; /* RETURN bezier curve control points. */
....
bezCurve = (Vector2 *)malloc(4 * sizeof(Vector2));
std::uninitialized_default_construct_n(bezCurve, 4); // <=
....
if (alpha_l < 1.0e-6 || alpha_r < 1.0e-6)
{
....
bezCurve[0] = d[first];
bezCurve[3] = d[last];
....
}
....
}
We've resolved the undefined behavior issue regarding the object lifetime. You can read more on this here.
Regarding C++20. A careful reader may be familiar with the P0593R6 proposal, which strictly defines the program behavior starting with C++20.
Fragment N16
But first, since the case isn't simple and is very confusing, take a look at the analyzer warning:
V762 It is possible a virtual function was overridden incorrectly. See first argument of function 'preferred_domain' in derived class 'HandlePositionFieldInput' and base class 'CurvesFieldInput'. node_geo_input_curve_handles.cc 95
So, we can see from the warning message that the issue lies in incorrect overriding of a virtual function in arbitrary classes. What exactly is wrong? Let's find out.
We'll start with the base class:
typedef struct CurvesGeometry { .... };
namespace bke
{
....
class CurvesGeometry : public ::CurvesGeometry { .... };
class CurvesFieldInput : public fn::FieldInput
{
....
virtual std::optional<AttrDomain> preferred_domain(
const CurvesGeometry &curves) const;
};
....
}
The preferred_domain virtual function takes a parameter of the bke::CurvesGeometry type. Let's keep that in mind.
Now, take a look at the child:
namespace blender::nodes::node_geo_input_curve_handles_cc
{
class HandlePositionFieldInput final : public bke::CurvesFieldInput
{
....
std::optional<AttrDomain> preferred_domain(
const CurvesGeometry & /*curves*/) const;
};
}
Have you found an issue? If you haven't, don't worry, I didn't understand it at first either :) Let's look into it.
In the base class, the virtual function accepts a parameter with an unqualified name — CurvesGeometry. When the compiler searches for this type, it starts with the scope of the CurvesFieldInput class and looks in all framed scopes until it finds the type. As a result, the bke::CurvesGeometry type is found.
Now let's look at the derived classes. They're defined in a different namespace from the one in which the base class is located. The compiler also starts searching for the needed CurvesGeometry name, doesn't find it in the framed scopes, and reaches the global one. And the global namespace also has CurvesGeometry, just not the one we need to override the function :)
We can fix it by specifying a qualified type name. Let's use the C++11 (override) capabilities and protect future generations from errors:
namespace blender::nodes::node_geo_input_curve_handles_cc
{
class HandlePositionFieldInput final : public bke::CurvesFieldInput
{
....
std::optional<AttrDomain> preferred_domain(
const bke::CurvesGeometry & /*curves*/) const override;
};
}
There are also two other classes nearby that contain the same error:
I have mixed feelings about the check. On the one hand, we encountered many additional defensive measures — often clever, crafty, and overall beautiful. On the other hand, even the PVS-Studio analyzer sometimes can't understand the cause-and-effect relations of this code labyrinth where the floor is lava.
I hope you learned something interesting or even helpful from the analyzer warnings, code errors, and our examination of code fragments.
As usual, I suggest you trying our PVS-Studio analyzer. We offer a free license for open-source projects.
Take care and have a great day!
A little bonus for those who read to the end =).
"Banishing the god of undefined behavior(Cthulhu-bug)"
0