Webinar: C++ semantics - 06.11
GTK is a popular open-source framework for creating graphical user interfaces. The project is interesting to check with the PVS-Studio analyzer. Especially since the last time we checked it was about 3 years ago. So, we'll probably find new errors in it.
I would prefer not to repeat the introduction from the previous article: "Finding typos in the GTK 4 project by PVS-Studio". However, I guess that not all of our readers are familiar with GTK. I'll make it short: GTK is a framework that enables you to implement a cross-platform GUI. It's a completely free and open-source framework licensed under the GNU GPL, which allows you to use it in any project (even a commercial one).
gboolean
gtk_builder_value_from_string_type (....)
{
g_set_error (error,
GTK_BUILDER_ERROR,
GTK_BUILDER_ERROR_INVALID_VALUE,
"Object named \"%s\" is of type \"%s\" which is not compatible "
"with expected type \%s\"",
string, G_OBJECT_TYPE_NAME (object), g_type_name (type));
}
The PVS-Studio warning: V1094 The representation of conditional escape sequence '\%' is implementation-defined. Using this sequence in the string literal can lead to unexpected results. gtkbuilder.c 2674
A typo lurks in the string literal. Developers wanted to use an escape sequence for double quotes:
\"%s\"
But the quote character was left out, and a non-existent control sequence containing a slash and a percent sign appeared:
\%s\"
The way the sequence is processed depends on the compiler.
Value* op_color_number(enum Sass_OP op, ....)
{
double rval = rhs.value();
if ((op == Sass_OP::DIV || op == Sass_OP::DIV) && rval == 0) {
// comparison of Fixnum with Float failed?
throw Exception::ZeroDivisionError(lhs, rhs);
}
....
}
The PVS-Studio warning: V501 [CWE-570] There are identical sub-expressions to the left and to the right of the '||' operator: op == Sass_OP::DIV || op == Sass_OP::DIV operators.cpp 250
The variable is compared twice to the Sass_OP::DIV named constant. I am pretty sure that the second constant should be Sass_OP::MOD.
It would be easier to notice the error if all blocks of the same type were formatted in a column in the conditions. Like this:
if (( op == Sass_OP::DIV
|| op == Sass_OP::DIV)
&& rval == 0) {
// comparison of Fixnum with Float failed?
throw Exception::ZeroDivisionError(lhs, rhs);
}
By the time I was done writing it, I realized that the code looked messy and even worse than it did before. Even though I prefer table-style formatting, it is somewhat irrelevant in this case. If you don't know what "table-style formatting" is, you may want to read about it in a mini-book "60 terrible tips for a C++ developer". Look for the terrible tip N20 — "Compact code".
gtk_css_parser_resolve_url (GtkCssParser *self,
const char *url)
{
char *scheme;
scheme = g_uri_parse_scheme (url);
if (scheme != NULL)
{
GFile *file = g_file_new_for_uri (url);
g_free (scheme);
return file;
}
g_free (scheme); // <=
if (self->directory == NULL)
return NULL;
return g_file_resolve_relative_path (self->directory, url);
}
The PVS-Studio warning: V575 [CWE-628, CERT-EXP37-C] The null pointer is passed into 'g_free' function. Inspect the first argument. gtkcssparser.c 189
The second call to the g_free function is unnecessary here because NULL is always passed to it. Nothing bad will happen because of it. You can pass a null pointer to g_free. It's just a redundant line of code.
This is the case when a static analyzer helps shorten or simplify the code. The next example is similar.
static gboolean
is_codepoint (const char *str)
{
int i;
/* 'U' is not code point but 'U00C0' is code point */
if (str[0] == '\0' || str[0] != 'U' || str[1] == '\0')
return FALSE;
for (i = 1; str[i] != '\0'; i++)
{
if (!g_ascii_isxdigit (str[i]))
return FALSE;
}
return TRUE;
}
The PVS-Studio warning: V590 [CWE-571, CERT-MSC01-C] Consider inspecting the 'str[0] == '\0' || str[0] != 'U'' expression. The expression is excessive or contains a misprint. gtkcomposetable.c 119
The condition:
if (str[0] == '\0' || str[0] != 'U' || str[1] == '\0')
return FALSE;
It can be shortened:
if (str[0] != 'U' || str[1] == '\0')
return FALSE;
Nothing has changed in terms of the code logic. So, should we simplify the code? In my opinion, yes. The more complex the code is, the more errors it "attracts".
static void
notify_observers_added (GtkActionMuxer *muxer,
GtkActionMuxer *parent)
{
....
Action *action;
....
while (....)
{
....
if (!action->watchers)
continue;
for (node = action ? action->watchers : NULL; node; node = node->next)
gtk_action_observer_primary_accel_changed (node->data,
GTK_ACTION_OBSERVABLE (muxer),
action_name, NULL);
....
}
The PVS-Studio warning: V595 [CWE-476, CERT-EXP12-C] The 'action' pointer was utilized before it was verified against nullptr. Check lines: 449, 452. gtkactionmuxer.c 449
The action pointer is used and only then checked if it's NULL.
if (!action->watchers)
....
for (node = action ? ....)
The check indicates that the pointer may be null. It's better to rewrite the code in this way, for example:
if (action == NULL || action->watchers == NULL)
continue;
for (node = action->watchers; node; node = node->next)
gtk_action_observer_primary_accel_changed (node->data,
GTK_ACTION_OBSERVABLE (muxer),
action_name, NULL);
This is not the only such error in the code. It is present in the following fragments:
static char *
strip_ulines (const char *text,
guint *accel_key)
{
....
new_text = malloc (strlen (text) + 1); // <=
....
return new_text;
}
static void
finish_text (UriParserData *pdata)
{
....
char *text;
....
if (pdata->strip_ulines && strchr (pdata->text_data->str, '_'))
{
text = strip_ulines (pdata->text_data->str, &pdata->accel_key); // <=
text_len = strlen (text);
}
else
{
text = pdata->text_data->str;
text_len = pdata->text_data->len;
}
....
if (text != pdata->text_data->str)
g_free (text); // <=
....
}
The PVS-Studio warning: V611 [CWE-762, CERT-MEM51-CPP] The memory was allocated using 'malloc' function but was released using the 'g_free' function. Consider inspecting operation logics behind the 'text' variable. gtklabel.c 3383
The memory is allocated using the malloc function and freed using the g_free function. The "Memory Allocation" section of Glib says that it's not allowed:
It's important to match g_malloc() (and wrappers such as g_new()) with g_free(), g_slice_alloc() (and wrappers such as g_slice_new()) with g_slice_free(), plain malloc() with free(), and (if you're using C++) new with delete and new[] with delete[]. Otherwise bad things can happen, since these allocators may use different memory pools (and new/delete call constructors and destructors).
void overwrite_and_free (gpointer data)
{
char *password = (char *) data;
if (password != NULL)
{
memset (password, 0, strlen (password));
g_free (password);
}
}
The PVS-Studio warning: V597 [CWE-14, CERT-MSC06-C] The compiler could delete the 'memset' function call, which is used to flush 'password' object. The memset_s() function should be used to erase the private data. gtkprintbackendcups.c 855
I described this code snippet in my previous article in early 2021. However, no changes have been made to the code yet. The code is dangerous because the compiler may remove the call to the memset function for optimization purposes, since this buffer is no longer in use and is immediately freed.
Perhaps it's worth to examine this case separately. I'm curious to reproduce it and see if I can actually get the library code to compile in such a way that the memset function call disappears. I decided to experiment with it later.
For those unfamiliar with this error pattern, here are the links:
Offset Offset::init(const char* beg, const char* end)
{
Offset offset(0, 0);
if (end == 0) {
end += strlen(beg);
}
offset.add(beg, end);
return offset;
}
The PVS-Studio warning: V769 [CWE-119, CERT-EXP08-C] The 'end' pointer in the 'end += strlen(beg)' expression equals nullptr. The resulting value is senseless and it should not be used. position.cpp 36
If the end pointer is null, the length of the string is added to it. The result is an invalid pointer that can't be used. Most likely, the developers wanted to write the following:
end = beg + strlen(beg);
static gboolean
query_tooltip (....)
{
....
char *s, *s2;
....
do {
s = g_strdup_printf ("%.*f", precision, self->scale);
l = strlen (s) - 1;
while (s[l] == '0')
l--;
if (s[l] == '.')
s[l] = '\0';
precision++;
} while (strcmp (s, "0") == 0);
label = gtk_label_new (s);
g_free (s);
....
}
The PVS-Studio warning: V773 [CWE-401, CERT-MEM31-C, CERT-MEM51-CPP] The 's' pointer was assigned values twice without releasing the memory. A memory leak is possible. demo3widget.c 79
The g_strdup_printf function is similar to the standard C function — sprintf. However, g_strdup_printf is safer, as it calculates the maximum required space and allocates the correct amount of memory to which it writes the resulting string. The returned string should be freed when no longer needed.
There is no buffer release in this code. Therefore, there will be a memory leak at each loop iteration. To avoid it, call g_free at each loop iteration:
s = NULL;
do {
g_free(s); // It is safe to pass NULL to g_free.
s = g_strdup_printf ("%.*f", precision, self->scale);
l = strlen (s) - 1;
while (s[l] == '0')
l--;
if (s[l] == '.')
s[l] = '\0';
precision++;
} while (strcmp (s, "0") == 0);
label = gtk_label_new (s);
g_free (s);
I can't tell if there is a way of writing code to avoid such errors in C programs where you have to manually manage memory allocation and deallocation. I guess only a careful code review, static, and dynamic analyzers will help.
The PVS-Studio analyzer detects other memory leaks in the GTK library. However, they are not as interesting. The errors boil down to the fact that there is an exit from a function where the developers forgot to free some memory. Look at the example:
static void
update_columns (GtkTreeView *view, ViewColumnModel *view_model)
{
....
int *new_order;
new_order = g_new (int, length); // <=
....
while (a->data == b->data)
{
a = a->next;
b = b->next;
if (a == NULL)
return; // <=
m++;
}
....
g_free (new_order);
....
}
The PVS-Studio warning: V773 [CWE-401, CERT-MEM31-C, CERT-MEM51-CPP] The function was exited without releasing the 'new_order' pointer. A memory leak is possible. testtreecolumns.c 441
In this case, if lists differ in length, a premature exit from the function occurs. At the same time, the memory, the pointer to which is stored in the new_order variable, is not freed.
static void
gtk_section_model_default_get_section (GtkSectionModel *self,
guint position,
guint *out_start,
guint *out_end)
{
guint n_items = g_list_model_get_n_items (G_LIST_MODEL (self));
if (position >= n_items)
{
*out_start = n_items;
*out_end = G_MAXUINT;
}
*out_start = 0;
*out_end = n_items;
}
The PVS-Studio warnings:
Something is definitely wrong with the code. Overwriting variable values is pointless. Perhaps the else branch is missing from the code:
if (position >= n_items)
{
*out_start = n_items;
*out_end = G_MAXUINT;
}
else
{
*out_start = 0;
*out_end = n_items;
}
I'm not sure if this is the way to fix the code, as I'm not familiar with the GTK project structure. I can say that the code contains an anomaly, but I can only guess how to fix it correctly.
static int
gtk_tree_list_row_sort_keys_compare (gconstpointer a,
gconstpointer b,
gpointer data)
{
....
gpointer *keysa = (gpointer *) a;
gpointer *keysb = (gpointer *) b;
....
resa = unpack (keysa, &keysa, &sizea);
resb = unpack (keysb, &keysb, &sizeb);
if (!resa)
return resb ? GTK_ORDERING_LARGER :
(keysa[2] < keysb[2] ? GTK_ORDERING_SMALLER :
(keysb[2] > keysa[2] ? GTK_ORDERING_LARGER : GTK_ORDERING_EQUAL));
else if (!resb)
return GTK_ORDERING_SMALLER;
....
}
Try to find the error without reading the warning. Can you see it? Our unicorn will be waiting for you while he drinks his coffee.
Anyway, I think looking for the error took some effort. So, it's good to have tools like static analyzers at your disposal.
The PVS-Studio warning: V547 [CWE-570] Expression 'keysb[2] > keysa[2]' is always false. gtktreelistrowsorter.c 158
The error is here:
.... keysa[2] < keysb[2] ? ....
.... keysb[2] > keysa[2] ? ....
The conditions are identical. The developers swapped the arrays, they also replaced < with >. The result is the same.
A very nice example of "code smell" to end the article. It can knock you off your feet :)
The PVS-Studio warning: V581 [CWE-670] The conditional expressions of the 'if' statements situated alongside each other are identical. Check lines: 674, 675. sass2scss.cpp 675
Do you see double space characters? Well, they are not spaces but the tab character!
Instead of typing '\t', someone hit TAB without thinking. This is a very bad idea for a number of reasons:
In "60 terrible tips for a C++ developer" I described the use of tabs in string literals. I invite you to read it if you'd like to learn even more about how not to write code.
Note. You may not quite understand why the analyzer issued the warning about identical checks. The thing is that the analyzer detected this error indirectly. To simplify the lexer and parser, we interpret tabs and spaces in the same way within the analyzer. In fact, all tabs are replaced with space characters. That's why the analyzer thinks that the conditions are the same.
typedef enum {
GDK_MEMORY_B8G8R8A8_PREMULTIPLIED,
GDK_MEMORY_A8R8G8B8_PREMULTIPLIED,
GDK_MEMORY_R8G8B8A8_PREMULTIPLIED,
....
// There's a lot cut out here.
// There are 27 named constants in total.
// GDK_MEMORY_N_FORMATS is the twenty-eighth constant,
// but it equals 27 because the values start from 0.
....
GDK_MEMORY_A16 GDK_AVAILABLE_ENUMERATOR_IN_4_12,
GDK_MEMORY_A16_FLOAT GDK_AVAILABLE_ENUMERATOR_IN_4_12,
GDK_MEMORY_A32_FLOAT GDK_AVAILABLE_ENUMERATOR_IN_4_12,
GDK_MEMORY_N_FORMATS
} GdkMemoryFormat;
static const char *format_name[] = {
"BGRAp", "ARGBp", "RGBAp",
"BGRA", "ARGB", "RGBA", "ABGR",
"RGB", "BGR", NULL
};
static const char *
format_to_string (GdkMemoryFormat format)
{
if (format < GDK_MEMORY_N_FORMATS)
return format_name[format];
else
return "ERROR";
}
The PVS-Studio warning: V557 [CWE-119, CERT-ARR30-C] Array overrun is possible. The value of 'format' index could reach 27. testupload.c 13
The GDK_MEMORY_N_FORMATS constant has nothing to do with the format_name array. The array clearly has less than 27 elements. So, the check doesn't protect against an array overrun. It's an interesting typo, and it's not even clear why it happened. I couldn't find suitable constants to check. So, this is definitely something for the developers to look into.
I hope this article will help fix some errors in the GTK library and give readers more insight into the methodology of static code analysis.
Please note that such checks are not the most efficient way to use the analyzer. They only demonstrate its capability to find bugs and help with code reviews. A static analyzer should be used on a regular basis, not just occasionally.
I invite you to try PVS-Studio on your work or home projects.
0