Webinar: C++ semantics - 06.11
It's interesting to analyze projects and doubly so to analyze well-known projects, especially when you use them yourself. Even more interesting it would be to analyze a project with high-quality code - it would let you kill two birds with one stone: scan the project itself, proving or disproving the declared quality, and also see how well the analyzer has done its job. Having pondering this a bit, I decided the popular messenger Telegram would suit us best for this task.
Telegram is a free instant messaging system targeted at the international market and allowing its users to exchange both text messages and media files of various types. Telegram clients exist for Android, iOS, Windows Phone, OS X, Windows, and Linux.
The authors of this project are Pavel and Nikolai Durov, known as the authors of the popular Russian social network "Vkontakte" (VK). Special emphasis in Telegram is placed on the communication security and enhanced protection (which enables the users to create private self-destructing chats and so on). Messages are encrypted through the MTProto protocol developed by Nikolai Durov.
For our analysis, I took the desktop Windows client, whose source code can be found in the repository at GitHub.
I should warn you that the application employs quite a number of third-party libraries, so if you want to build it by yourself, it'll take some time and effort. However, the authors provide intelligible documentation on third-party software building and installation, so it shouldn't be a problem.
You may now be wondering about the title of this article. "How come?" you may ask. Scanning a project's source code with an analyzer is okay, but what do we mean by the "vice versa" analysis?
As I already said, we had anticipated the high quality of this project from the beginning. I am totally frank telling you that Telegram is developed by pros who do know their job and who, moreover, put application security in the first place. It would be just strange to find lots of errors there. Besides, they regularly run contests challenging people to try breaking their cryptography, which also helps to keep the code quality at the high level. So analysis of this project would be a nice experiment to test our analyzer's capabilities. Read on to find out more.
I used the PVS-Studio static code analyzer to scan the project, paying attention to the general-analysis (GA) and optimization (OP) warnings of the first and second severity levels.
I can actually give an assessment of the code quality in advance since we all know how good the quality of the VK social network was back in times when Pavel was its CEO. So let me assure you right off that Telegram is fine too. We have found pretty few errors in it, which was determined by 2 factors:
So we can say for sure the guys are doing their job excellently. However, our code analyzer still has managed to find a few pretty interesting issues we'll discuss further.
For this article, I picked only the most interesting samples from the total number of issues detected.
For some fragments, an exact estimate of whether or not they are errors and how they should be fixed is impossible, for it requires much more detailed study of the source code. This, by the way, is one more argument for how important it is that static analyzers be used by code authors themselves.
I'd also like to say a few words about the analysis procedure itself. Since we have a .sln-file, analysis launch is pretty easy. After all the third-party libraries are built and installed, you just need to make sure the solution itself is built without errors, and then start the project analysis in a few mouse clicks. Once it is over, you will only have to study the log file with the diagnostic messages.
Note. Since the moment of the source code check, the developer team has released several updates. So some code fragments may differ from the ones given in the article.
Let's examine the following code fragment. This fragment being singled out from the whole text, it's not difficult to spot the error here:
void Window::placeSmallCounter(.... int size, int count, ....)
{
....
QString cnt = (count < 100) ? QString("%1").arg(count) :
QString("..%1").arg(count % 10, 1, 10, QChar('0'));
int32 cntSize = cnt.size();
....
int32 fontSize;
if (size == 16) {
fontSize = 8;
} else if (size == 32) {
fontSize = (cntSize < 2) ? 12 : 12;
} else {
fontSize = (cntSize < 2) ? 22 : 22;
}
....
}
PVS-Studio's diagnostic message: V583 The '?:' operator, regardless of its conditional expression, always returns one and the same value: 12. Telegram window.cpp 1607
It's easy to find the error (two errors, to be more exact) now that the code with it is viewed separately from the rest of the program. When using the ternary operator, regardless of the logical result in the condition, the 'fontSize' variable will be assigned one and the same value. Most likely, instead of the repeating values '12' and '22' in each of the ternary operators respectively, like in the original code, there should have been the pairs of the values '12' and '22', without being repeated, in each.
The error is evident, isn't it? You may wonder how one could make it at all. Well, we are all humans and it's in our nature to make mistakes, and while it can be easily spotted in a small code fragment like that, it gets lost among the 1700+ lines of code in this file.
Pretty frequent is the error when a pointer is first dereferenced and only then is checked for being equal to nullptr. Telegram is no exception:
void DialogsWidget::dialogsReceived(....)
{
const QVector<MTPDialog> *dlgList = 0;
....
unreadCountsReceived(*dlgList);
....
if (dlgList)
....
}
PVS-Studio's diagnostic message: V595 The 'dlgList' pointer was utilized before it was verified against nullptr. Check lines: 1620, 1626. Telegram dialogswidget.cpp 1620
You can see in this fragment that the 'dlgList' pointer is checked only after it has been dereferenced. Null pointer dereferencing is undefined behavior, which means that your program may run well, or crash, or send all your passwords to Chinese hackers, or something worse may happen. So pointers must be checked for null before being dereferenced.
I found 14 more issues of this kind. In some cases, it's not that bad and there's really no error. It's just that checks there are repeated (check->dereferencing->check, the pointer remaining unchanged), but we are not going to harp on it. Let's go on.
The next suspicious code fragment:
bool psShowOpenWithMenu(....)
{
....
IEnumAssocHandlers *assocHandlers = 0;
....
if (....)
{
....
IEnumAssocHandlers *assocHandlers = 0;
....
}
....
}
PVS-Studio's diagnostic message: V561 It's probably better to assign value to 'assocHandlers' variable than to declare it anew. Previous declaration: pspecific_wnd.cpp, line 2031. Telegram pspecific_wnd.cpp 2107
Again, with the piece of code singled out and stripped of irrelevant details, it's easy to see a variable redefinition. In a method which is too long to fit into the screen, it's not that easy.
In the beginning, the 'assocHandlers' variable is defined, after which it undergoes some operations, but then another variable is defined bearing the same type and name (and in absolutely the same way), this second variable not being used in any way. You may argue it's not an error at all. Yes - for now. But the traps are already sprung and waiting for you to step into. The programmer to maintain this code in future may overlook this redefinition and that's when the error will show up. But, as we already mentioned many times, the earlier a bug is eliminated, the better. Try to avoid issues like that.
There was another similar code fragment. Here's the diagnostic message for it:
V561 It's probably better to assign value to 'ms' variable than to declare it anew. Previous declaration: window.cpp, line 1371. Telegram window.cpp 1467
The next example:
void HistoryImageLink::getState(.... const HistoryItem *parent, ....)
const
{
....
int skipx = 0, skipy = 0, height = _height;
const HistoryReply *reply = toHistoryReply(parent);
const HistoryForwarded *fwd = reply ? 0 :
toHistoryForwarded(parent);
....
if (reply) {
skipy = st::msgReplyPadding.top() + st::msgReplyBarSize.height() +
st::msgReplyPadding.bottom();
} if (fwd) {
skipy = st::msgServiceNameFont->height + st::msgPadding.top();
}
....
}
PVS-Studio's diagnostic message: V646 Consider inspecting the application's logic. It's possible that 'else' keyword is missing. Telegram history.cpp 5181
As the analyzer's warning suggests, the keyword 'else' should have been used, not a new condition. I can't say for sure how to fix this code. It might be that it shouldn't be fixed at all.
These are the only two branches where the 'skipy' variable is initialized to some value. You can see in this fragment that it is initially set to 0, after which (I'm not citing the source code, for it is too long) it is being incremented.
From that we conclude that the second 'if' condition might be redundant or even incorrect (if both conditions are true). The programmer may have meant to use an 'else-if' construct (judging by the format) - you can't say for sure looking from outside. Nevertheless, it might be a potential error.
The next suspicious code fragment:
void DialogsListWidget::addDialog(const MTPDdialog &dialog)
{
History *history = App::history(App::peerFromMTP(dialog.vpeer),
dialog.vunread_count.v, dialog.vread_inbox_max_id.v);
....
SavedPeersByTime &saved(cRefSavedPeersByTime());
while (!saved.isEmpty() && history->lastMsg->date < saved.lastKey())
{
History *history = App::history(saved.last()->id);
....
}
....
}
PVS-Studio's diagnostic message: V711 It is dangerous to create a local variable within a loop with a same name as a variable controlling this loop. Telegram dialogswidget.cpp 949
The warning says it all: in the loop body, a variable is declared, coinciding with the one used as a loop counter. What's dangerous about that? Changing the variable in the loop body won't affect the loop termination condition in any way (since it's another variable that changes), which may render some part of the termination condition incorrect (causing an infinite loop, for instance).
Even if it's not an error, it's still a hidden, tricky trap you might get into.
Here's another issue:
bool update()
{
....
wstring fname = from[i], tofname = to[i];
....
WCHAR errMsg[2048];
....
wsprintf(errMsg, L"Failed to update Telegram :(\n%s is not
accessible.", tofname);
....
}
PVS-Studio's diagnostic message: V510 The 'wsprintfW' function is not expected to receive class-type variable as third actual argument. Updater updater.cpp 255
The problem is with the third argument of the function - the object of the wstring type. Since the list of formal parameters of the wsprintf function ends with an ellipsis, it enables you to pass arguments of any types into it, which poses a certain danger. Only POD-types can be used as actual arguments of the ellipsis. As seen from the format string, the function is awaiting an argument of the 'wchar_t *' type but we pass an object instead, which may cause forming rubbish in the buffer or a program crash.
There was another code fragment with an excessive subexpression in the conditional statement:
QImage imageBlur(QImage img)
{
....
const int radius = 3;
....
if (radius < 16 && ....)
....
}
PVS-Studio's diagnostic message: V560 A part of conditional expression is always true: radius < 16. Telegram images.cpp 241
The meaning of the warning is crystal clear: a variable (moreover, a constant) is declared and immediately initialized, its value being compared to a numerical literal in the condition. Since neither the constant, nor the numerical literal (naturally) change, the condition will always be either true or false (true in this case).
I also came across code where a variable was assigned values twice, this variable not being used in any way between the assignments. It may indicate the presence of an error in case a different variable was meant. In this case, there's no danger (at least no explicit evidence of it), but it's still no good:
bool eBidiItemize(....)
{
....
dir = QChar::DirON; status.eor = QChar::DirEN;
dir = QChar::DirAN;
....
}
PVS-Studio's diagnostic message: V519 The 'dir' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 2084, 2085. Telegram text.cpp 2085
Fragments where variables are declared without being used afterwards do look strange. Obviously, there's nothing good about unused variables scattered all about the code. Here's an example:
void extractMetaData(AVDictionary *dict)
{
....
for (....)
{
....
QString tmp = QString::fromUtf8(value);
}
}
PVS-Studio's diagnostic message: V808 'tmp' object of 'QString' type was created but was not utilized. Telegram audio.cpp 2296
The 'tmp' variable is declared but not used anywhere afterwards. To initialize it, the programmer uses a method call. Moreover, it all happens inside the loop body, which only aggravates the problem.
It's not the only warning of this type; there were 16 more of them.
Scanning the Telegram project was pretty interesting and also helped us dot some i's.
First, we had wished to scan this project for a long time and finally managed to do that. Despite it requiring some time and effort to install the third-party software before the analysis, it didn't pose any serious problems thanks to intelligible manuals from the authors.
Second, the project code has proved to be very high-quality, which is pleasant. In their messenger, the authors put the main focus on ensuring correspondence confidentiality, and it would have been strange if we had found lots of errors in it.
Third, PVS-Studio has still managed to find a few interesting suspicious fragments (remember that I have discussed not all but only the most interesting of the detected issues in this article) despite the code being written by real pros and regular encryption breaking contests they run. It proves the high quality of our analyzer and reminds that such tools are vitally necessary for programmers.
0