Hello there! In this article, we'll look at the free version (available to the developers of free and open-source software) of the PVS-Studio static analyzer in action. What we are going to check today is the source code of the Reiser4 file system and its utilities.
This article was originally posted on the Habrahabr website and reposted here by permission of the author.
I hope all of you who are about to read this article have heard, if only in passing, about the static code analyzer PVS-Studio. If you haven't, follow this link to read a brief product description.
The developer company also runs an official blog on Habrahabr where they frequently post reports with the analysis results of various open-source projects.
More information on Reiser4 can be found on the kernel wiki page.
Let's start with the Reiser4 utilities, specifically the libaal library. Then we'll check the reiser4progs tools and round off with a review of the defects found in the kernel code.
We need to install PVS-Studio to get started. The official website provides deb and rpm packages alongside an ordinary installation archive. Choose whatever option is best for you.
Next, we need to activate the free license. Open-source software developers have to insert the following lines in the beginning of each source file (it's not necessary to add them to header files):
// This is an open source non-commercial project. Dear PVS-Studio, please check it.
// PVS-Studio Static Code Analyzer for C, C++ and C#: http://www.viva64.com
Let's write a small bash script so that we don't have to repeat that process by hand for each file. I'll use the sed stream editor to write the script (the following instruction is written in one line):
#!/usr/bin/bash
for str in $(find $1 -name '*.c'); do
sed -i -e '1 s/^/\/\/ This is an open source non-commercial project.
Dear PVS-Studio, please check it.\n\/\/ PVS-Studio Static Code
Analyzer for C, C++ and C\#: http:\/\/www.viva64.com\n\n/;' $str
done
In addition, let's write another script to facilitate project building and PVS-Studio launch:
#!/usr/bin/bash
pvs-studio-analyzer trace -- make -j9 || exit 1
pvs-studio-analyzer analyze -o log.log -j9 || exit 1
plog-converter -a GA:1,2 -t tasklist log.log || exit 1
We are ready to go. The libaal library comes first.
libaal is a library that provides abstraction of Reiser4 structures and is used by reiser4progs.
Analysis log: log1.txt
If we agree to ignore the warnings dealing with redefinition of the standard data types, then possible bugs are found only in lines 68, 129, and 139 of the src/bitops.c file:
V629 Consider inspecting the 'byte_nr << 3' expression. Bit shifting of the 32-bit value with a subsequent expansion to the 64-bit type.
Lines 129 and 139 contain the following code:
bit_t aal_find_next_set_bit(void *map, bit_t size, bit_t offset)
{
....
unsigned int byte_nr = offset >> 3;
....
unsigned int nzb = aal_find_nzb(b, bit_nr);
....
if (nzb < 8)
return (byte_nr << 3) + nzb;
....
}
This defect can be easily fixed by replacing the unsigned int type with bit_t in the variable declarations.
As for line 68:
bit_t aal_find_first_zero_bit(void *map, bit_t size)
{
....
unsigned char *p = map;
unsigned char *addr = map;
....
return (p - addr) << 3;
....
}
it's a mystery for me why PVS-Studio believes the value of (p-addr) to be 32-bit. Even sizeof() yields the proper 8 bytes (I'm working on amd64).
Analysis log: log2.txt
Now, reiser4progs has much more interesting, and sometimes sadder, things to show. By the way, here's what Edward Shishkin said about these tools: "The author left right after these progs were written, and no one has since looked into that code (except for a couple of times when I was asked to fix fsck). So I'm not surprised by that pile of bugs." Indeed, it's no surprise that such specific bugs are still there after so many years.
The first serious error is found in the plugin/key/key_short/key_short_repair.c file:
V616 The 'KEY_SHORT_BAND_MASK' named constant with the value of 0 is used in the bitwise operation.
errno_t key_short_check_struct(reiser4_key_t *key)
{
....
if (oid & KEY_SHORT_BAND_MASK)
key_short_set_locality(key, oid & !KEY_SHORT_BAND_MASK);
....
}
KEY_SHORT_BAND_MASK is the constant 0xf000000000000000ull, which means the Boolean NOT operation produces false here (in C, all values other than 0 are considered true), i.e., in fact, 0. However, the programmer obviously meant the bitwise NOT (~) operation rather than the Boolean NOT. This warning was triggered several times by different files.
Next comes plugin/hash/tea_hash/tea_hash.c with errors such as this:
V547 Expression 'len >= 16' is always false.
Wait... It's not really an error - it's some sort of black magic or a dirty trick (if you don't believe in magic). Why? Well, would you call the code below clear and straightforward without a deep understanding of the processor's and operating system's inner workings and the programmer's idea?
uint64_t tea_hash_build(unsigned char *name, uint32_t len)
{
....
while(len >= 16)
{
....
len -= 16;
....
}
....
if (len >= 12)
{
if (len >= 16)
*(int *)0 = 0;
....
}
....
}
What'd you say? This is not an error, but we'd better leave this code alone unless we know what goes on here. Let's try to figure it out.
The line *(int *)0 = 0; would trigger a SIGSEGV in a regular program. If you search for information about the kernel, you'll find that this statement is used to make the kernel generate an oops. This subject was discussed in the kernel developers' newsgroup (here), and Torvalds himself mentioned it too. So, if an assignment like that happens, in some mysterious ways, to execute inside the kernel code, you'll get an oops. Why check for the "impossible" condition is something that only the author himself knows, but, as I said, we'd better let the thing be unless we know how it works.
The only thing we can safely investigate is why the V547 warning was triggered. The len >= 16 expression is always false. The while loop is executed as long as the value of len is greater than or equal to 16, while the value 16 is subtracted at the end of the loop body at every iteration. This means the variable can be represented as len = 16*n+m, where n and m are integers and m<16. It's obvious that once the loop is over, all the 16*n's will have been subtracted, leaving only m.
The other warnings here follow the same pattern.
The following error is found in the plugin/sdext/sdext_plug/sdext_plug.c file: V595 The 'stat' pointer was utilized before it was verified against nullptr. Check lines: 18, 21.
static void sdext_plug_info(stat_entity_t *stat)
{
....
stat->info.digest = NULL;
if (stat->plug->p.id.id != SDEXT_PSET_ID || !stat)
return;
....
}
Either it's a banal typo or the author intended to write something else. The !stat check looks as if it were a nullptr check, but it doesn't make sense for two reasons. Firstly, the stat pointer has been already dereferenced. Secondly, this expression is evaluated from left to right, in accordance with the standard, so if it's really a nullptr check, it should be moved to the beginning of the condition since the pointer is originally dereferenced earlier in that same condition.
The plugin/item/cde40/cde40_repair.c file triggered a number of warnings such as this:
V547 Expression 'pol == 3' is always true.
static errno_t cde40_pair_offsets_check(reiser4_place_t *place,
uint32_t start_pos,
uint32_t end_pos)
{
....
if (end_offset == cde_get_offset(place, start_pos, pol) +
ENTRY_LEN_MIN(S_NAME, pol) * count)
{
return 0;
}
....
}
The programmer must have meant a construct of the A == (B + C) pattern but inadvertently wrote it as (A == B) + C.
upd1. It's my mistake; I confused the precedence of + and ==
The plugin/object/sym40/sym40.c file contains a typo:
V593 Consider reviewing the expression of the 'A = B < C' kind. The expression is calculated as following: 'A = (B < C)'.
errno_t sym40_follow(reiser4_object_t *sym,
reiser4_key_t *from,
reiser4_key_t *key)
{
....
if ((res = sym40_read(sym, path, size) < 0))
goto error;
....
}
This issue is similar to the previous one. The res variable is assigned the result of a Boolean expression. The programmer is obviously using a C "trick" here, so the expression should be rewritten as (A = B) < C.
Another typo or mistake made from inattention. File libreiser4/flow.c:
V555 The expression 'end - off > 0' will work as 'end != off'.
int64_t reiser4_flow_write(reiser4_tree_t *tree, trans_hint_t *hint)
{
....
uint64_t off;
uint64_t end;
....
if (end - off > 0)
{
....
}
....
}
There are two integer variables here. Their difference is ALWAYS greater than or equal to zero because, from the viewpoint of how integers are represented in computer memory, subtraction and addition are, in effect, the same operation to the processor (Two's complement). The condition was more likely meant to check if end > off.
Another probable typo:
V547 Expression 'insert > 0' is always true.
errno_t reiser4_flow_convert(reiser4_tree_t *tree,
conv_hint_t *hint)
{
....
for (hint->bytes = 0; insert > 0; insert -= conv)
{
....
if (insert > 0)
{
....
}
....
}
}
The code is contained in a loop, and the loop body is executed only when insert > 0, so the condition is always true. It's either a mistake, and therefore something else is missing, or a pointless check.
V547 Expression 'ret' is always false.
static errno_t repair_node_items_check(reiser4_node_t *node,
place_func_t func,
uint8_t mode,
void *data)
{
....
if ((ret = objcall(&key, check_struct) < 0))
return ret;
if (ret)
{
....
}
....
}
The first condition contains a construct of the A = ( B < 0 ) pattern, but what was more likely meant is (A = B) < C.
The librepair/semantic.c file seems to house another "black magic" thing:
V612 An unconditional 'break' within a loop.
static reiser4_object_t *cb_object_traverse(reiser4_object_t *parent,
entry_hint_t *entry,
void *data)
{
....
while (sem->repair->mode == RM_BUILD && !attached)
{
....
break;
}
....
}
The while loop here is used as an if statement because the loop body will be executed only once (since there's a break at the end) if the condition is true or will be skipped otherwise.
Now guess what comes next?
Exactly - a typo! The code is still looking like it was "abandoned at birth". This time, the problem is in the file libmisc/profile.c:
V528 It is odd that pointer to 'char' type is compared with the '\\0' value. Probably meant: *c + 1 == '\\0'.
errno_t misc_profile_override(char *override)
{
....
char *entry, *c;
....
if (c + 1 == '\0')
{
....
}
....
}
Comparing a pointer with a terminal null character is a brilliant idea, no doubt, but the programmer more likely meant the check *(c + 1) == '\0', as the *c + 1 == '\0' version doesn't make much sense.
Now let's discuss a couple of warnings dealing with the use of fprintf(). The messages themselves are straightforward, but we'll need to look in several files at once to understand what's going on.
First we'll peek into the libmisc/ui.c file.
V618 It's dangerous to call the 'fprintf' function in such a manner, as the line being passed could contain format specification. The example of the safe code: printf("%s", str);
Here's what we see:
void misc_print_wrap(void *stream, char *text)
{
char *string, *word;
....
for (line_width = 0; (string = aal_strsep(&text, "\n")); )
{
for (; (word = aal_strsep(&string, " ")); )
{
if (line_width + aal_strlen(word) > screen_width)
{
fprintf(stream, "\n");
line_width = 0;
}
fprintf(stream, word);
....
}
....
}
}
Let's find the code using this function. Here it is, in the same file:
void misc_print_banner(char *name)
{
char *banner;
....
if (!(banner = aal_calloc(255, 0)))
return;
aal_snprintf(banner, 255, BANNER);
misc_print_wrap(stderr, banner);
....
}
Now we're looking for BANNER - it's in include/misc/version.h:
#define BANNER \
"Copyright (C) 2001-2005 by Hans Reiser, " \
"licensing governed by reiser4progs/COPYING."
So, no injection danger.
Here's another issue of the same kind, this time in the files progs/debugfs/browse.c and progs/debugfs/print.c. They employ the same code, so we'll discuss only browse.c:
static errno_t debugfs_reg_cat(reiser4_object_t *object)
{
....
char buff[4096];
....
read = reiser4_object_read(object, buff, sizeof(buff));
if (read <= 0)
break;
printf(buff);
....
}
Looking for the reiser4_object_read() function:
int64_t reiser4_object_read(
reiser4_object_t *object, /* object entry will be read from */
void *buff, /* buffer result will be stored in */
uint64_t n) /* buffer size */
{
....
return plugcall(reiser4_psobj(object), read, object, buff, n);
}
Finding out what plugcall() does - it turns out to be a macro:
/* Checks if @method is implemented in @plug and calls it. */
#define plugcall(plug, method, ...) ({ \
aal_assert("Method \""#method"\" isn't implemented " \
"in "#plug"", (plug)->method != NULL); \
(plug)->method(__VA_ARGS__); \
})
Again, we need to find out what method() does, and it, in its turn, depends on plug, and plug is reiser4_psobj(object):
#define reiser4_psobj(obj) \
((reiser4_object_plug_t *)(obj)->info.pset.plug[PSET_OBJ])
If we dig a bit deeper, we'll find that all of these are constant strings too:
char *pset_name[PSET_STORE_LAST] = {
[PSET_OBJ] = "object",
[PSET_DIR] = "directory",
[PSET_PERM] = "permission",
[PSET_POLICY] = "formatting",
[PSET_HASH] = "hash",
[PSET_FIBRE] = "fibration",
[PSET_STAT] = "statdata",
[PSET_DIRITEM] = "diritem",
[PSET_CRYPTO] = "crypto",
[PSET_DIGEST] = "digest",
[PSET_COMPRESS] = "compress",
[PSET_CMODE] = "compressMode",
[PSET_CLUSTER] = "cluster",
[PSET_CREATE] = "create",
};
Again, no injections possible.
The remaining issues are either errors of the same patterns as discussed above or defects that I don't think to be relevant.
We have finally reached the Reiser4 code in the kernel. To avoid building the entire kernel, let's modify the script we've written for launching PVS-Studio to build only the code of Reiser4:
#!/usr/bin/bash
pvs-studio-analyzer trace -- make SUBDIRS=fs/reiser4 -j9 || exit 1
pvs-studio-analyzer analyze -o log.log -j9 || exit 1
plog-converter -a GA:1,2 -t tasklist log.log || exit 1
Thus we can have it build only the source code located in the folder fs/reiser4.
Analysis log: log3.txt
We'll ignore the warnings dealing with redefinition of the standard types in the headers of the kernel itself since the standard headers are not used in the build; and we are not interested in the kernel code anyway.
The first file to examine is fs/reiser4/carry.c.
V522 Dereferencing of the null pointer 'reference' might take place. The null pointer is passed into 'add_op' function. Inspect the third argument. Check lines: 564, 703.
static carry_op *add_op(carry_level * level, /* &carry_level to add
* node to */
pool_ordering order, /* where to insert:
* at the beginning of @level;
* before @reference;
* after @reference;
* at the end of @level */
carry_op * reference /* reference node for insertion */)
{
....
result =
(carry_op *) reiser4_add_obj(&level->pool->op_pool, &level->ops,
order, &reference->header);
....
}
reference must be checked for NULL because later in the code, you can see the following call to the function declared above:
carry_op *node_post_carry(carry_plugin_info * info /* carry
* parameters
* passed down to node
* plugin */ ,
carry_opcode op /* opcode of operation */ ,
znode * node /* node on which this
* operation will operate */ ,
int apply_to_parent_p /* whether operation will
* operate directly on @node
* or on it parent. */ )
{
....
result = add_op(info->todo, POOLO_LAST, NULL);
....
}
where add_op() is explicitly called with the value of reference set to NULL, which results in an oops.
Next error:
V591 Non-void function should return a value.
static cmp_t
carry_node_cmp(carry_level * level, carry_node * n1, carry_node * n2)
{
assert("nikita-2199", n1 != NULL);
assert("nikita-2200", n2 != NULL);
if (n1 == n2)
return EQUAL_TO;
while (1) {
n1 = carry_node_next(n1);
if (carry_node_end(level, n1))
return GREATER_THAN;
if (n1 == n2)
return LESS_THAN;
}
impossible("nikita-2201", "End of level reached");
}
This warning tells us that the function is non-void and, therefore, must return some value. The last line proves this is not an error because the case when while stops executing is an error.
V560 A part of conditional expression is always true: (result == 0).
int lock_carry_node(carry_level * level /* level @node is in */ ,
carry_node * node /* node to lock */)
{
....
result = 0;
....
if (node->parent && (result == 0))
{
....
}
}
This is straightforward: the value of result doesn't change, so it's okay to omit the check.
V1004 The 'ref' pointer was used unsafely after it was verified against nullptr. Check lines: 1191, 1210.
carry_node *add_new_znode(znode * brother /* existing left neighbor
* of new node */ ,
carry_node * ref /* carry node after which new
* carry node is to be inserted
* into queue. This affects
* locking. */ ,
carry_level * doing /* carry queue where new node is
* to be added */ ,
carry_level * todo /* carry queue where COP_INSERT
* operation to add pointer to
* new node will ne added */ )
{
....
/* There is a lot of possible variations here: to what parent
new node will be attached and where. For simplicity, always
do the following:
(1) new node and @brother will have the same parent.
(2) new node is added on the right of @brother
*/
fresh = reiser4_add_carry_skip(doing,
ref ? POOLO_AFTER : POOLO_LAST, ref);
....
while (ZF_ISSET(reiser4_carry_real(ref), JNODE_ORPHAN))
{
....
}
....
}
What happens in this check is that ref is checked for nullptr by the ternary operator and then passed to the reiser4_carry_real() function, where null-pointer dereferencing may take place with no prior nullptr check. However, that never happens. Let's look into the reiser4_carry_real() function:
znode *reiser4_carry_real(const carry_node * node)
{
assert("nikita-3061", node != NULL);
return node->lock_handle.node;
}
As you can see, the node pointer is checked for nullptr inside the function body, so everything is OK.
Next comes a probably incorrect check in the file fs/reiser4/tree.c:
V547 Expression 'child->in_parent.item_pos + 1 != 0' is always true.
int find_child_ptr(znode * parent /* parent znode, passed locked */ ,
znode * child /* child znode, passed locked */ ,
coord_t * result /* where result is stored in */ )
{
....
if (child->in_parent.item_pos + 1 != 0) {
....
}
We need to find the declaration of item_pos to find out what exactly it is. After searching in a few files we get the following:
struct znode
{
....
parent_coord_t in_parent;
....
} __attribute__ ((aligned(16)));
....
typedef struct parent_coord
{
....
pos_in_node_t item_pos;
} parent_coord_t;
....
typedef unsigned short pos_in_node_t;
In the comments, Andrey Karpov explained what this error is about. The expression is cast to type int in the if statement, so no overflow will occur even if item_pos is assigned the maximum value since casting the expression to int results in the value 0xFFFF + 1 = 0x010000 rather than 0.
All the other bugs either follow one of the patterns discussed above or are false positives, which we have also talked about.
They're simple.
Firstly, PVS-Studio is cool. A good tool helps you do your job better and faster if you know how to handle it. As a static analyzer, PVS-Studio has more than once proved to be a top-level tool. It provides you with the means to detect and solve hidden problems, typos, and mistakes.
Secondly, be careful writing code. Don't use C "tricks" unless it's the only legal way to implement some feature. Always use additional parentheses in conditions to explicitly state the desired order of calculations because even if you are a super-duper hacker and C ace, you may simply confuse the operators' precedence and make a bunch of mistakes, especially when writing large portions of code at a time.
I'd like to thank the developers for such a wonderful tool! They have done a really great job adapting PVS-Studio to GNU/Linux systems and carefully designing the analyzer's implementation (see the details here). It elegantly integrates into build systems and generates logs. If you don't need integration, you can simply "intercept" compiler launches by running make.
And above all, thank you so much for giving students, open-source projects, and single developers the opportunity to use your tool for free! That's amazing!
0