Webinar: Parsing C++ - 10.10
The linking concept of today's article differs from usual. This time it is not one project, the source code of which was analyzed, but a number of warnings related to one and the same diagnostic rule in several projects. What's interesting about this? The point is that some of considered code fragments contain errors reproduced when working with the application and other fragments even represent vulnerabilities (CVE). In addition, at the end of the article you will find a small talk on security defects.
All errors, which will be regarded today in the article, have a similar pattern:
However, all fragments that will be considered, contain errors and are vulnerable to the intentionally malformed input. As data is received from a user, who can disrupt the logic of application execution, it was extremely tempting to try breaking something. That's what I did.
All problems listed below were detected by a PVS-Studio static analyzer that searches for errors in code, not only for C and C++ languages, but also for C# and Java ones.
It's already great to find a problem by a static analyzer, but to find and reproduce it - that's a totally different level of delight. :)
The first suspicious code fragment was detected in the fs_cli.exe module code, included in the FreeSWITCH distribution:
static const char *basic_gets(int *cnt)
{
....
int c = getchar();
if (c < 0) {
if (fgets(command_buf, sizeof(command_buf) - 1, stdin)
!= command_buf) {
break;
}
command_buf[strlen(command_buf)-1] = '\0'; /* remove endline */
break;
}
....
}
PVS-Studio warning: V1010 CWE-20 Unchecked tainted data is used in index: 'strlen(command_buf)'.
The analyzer warns about suspicious access to the command_buf array by an index. It is considered suspicious because unchecked external data is used as an index. Data is external as it was received through the fgets function from the stdin. Data is unchecked as there was no check before using. The expression fgets(command_buf, ....) != command_buf doesn't count as in this case we check only the fact of receiving data, not its content.
The problem of this code is that under certain circumstances there will be a recording '\0' outside the array, which will lead to the undefined behavior. For this, it is enough to just enter a zero-length string (a zero-length string in terms of the C language, i.e. the one in which the first character will be '\0').
Let's get a rough estimate of what will happen when feeding a zero-length string to the function:
Ooops!
What is interesting here is that this analyzer warning can be pretty "grasped between fingers". In order to reproduce the problem, you need to:
Digging in sources for a while, I have formed a specific sequence of the problem reproducing:
You can find a video of reproducing the problem below:
A similar problem has been detected in the NcFTP project, but only it occurred already in two places. As the code looks similar, we'll consider only one problem case:
static int NcFTPConfirmResumeDownloadProc(....)
{
....
if (fgets(newname, sizeof(newname) - 1, stdin) == NULL)
newname[0] = '\0';
newname[strlen(newname) - 1] = '\0';
....
}
PVS-Studio warning: V1010 CWE-20 Unchecked tainted data is used in index: 'strlen(newname)'.
Here, unlike the example from FreeSWITCH, the code is worse and more prone to problems. For example, the recording '\0' occurs regardless of the fact whether the reading using fgets happened successfully or not. So here there are even more possibilities how to disrupt the normal execution logic. Let's follow the proven way of zero-length strings.
The problem is reproduced less harder, than in case of FreeSWITCH. The sequence of steps is described below:
Reproducing of a problem is also available on a video:
In the OpenLDAP project (more precisely - in one of related utilities) developers make the same errors, as in FreeSWITCH. Attempt to delete newline character occurs only if a string was read successfully but there also is no protection from zero-length strings.
Code fragment:
int main( int argc, char **argv )
{
char buf[ 4096 ];
FILE *fp = NULL;
....
if (....) {
fp = stdin;
}
....
if ( fp == NULL ) {
....
} else {
while ((rc == 0 || contoper)
&&
fgets(buf, sizeof(buf), fp) != NULL) {
buf[ strlen( buf ) - 1 ] = '\0'; /* remove trailing newline */
if ( *buf != '\0' ) {
rc = dodelete( ld, buf );
if ( rc != 0 )
retval = rc;
}
}
}
....
}
PVS-Studio warning: V1010 CWE-20 Unchecked tainted data is used in index: 'strlen(buf)'.
Let's omit redundant part so that the point of the problem was more obvious:
while (.... && fgets(buf, sizeof(buf), fp) != NULL) {
buf[ strlen( buf ) - 1 ] = '\0';
....
}
This code is better than in NcFTP, but is still vulnerable. If you feed in a zero-length string when calling fgets:
The errors, reviewed above, are quite juicy, they can be consistently reproduced, you can "touch" them. Unless, I just didn't get around to reproducing problems on OpenLDAP. Nevertheless, you cannot call them vulnerabilities at least for the reason that these problems aren't assigned CVE-IDs.
However, some real vulnerabilities have the same problem pattern. Both of the code fragments given below, relate to the libidn project.
Code fragment:
int main (int argc, char *argv[])
{
....
else if (fgets (readbuf, BUFSIZ, stdin) == NULL)
{
if (feof (stdin))
break;
error (EXIT_FAILURE, errno, _("input error"));
}
if (readbuf[strlen (readbuf) - 1] == '\n')
readbuf[strlen (readbuf) - 1] = '\0';
....
}
PVS-Studio warning: V1010 CWE-20 Unchecked tainted data is used in index: 'strlen(readbuf)'.
The situation is similar, except that unlike the previous examples, where a recording by index -1 took place, a reading is happening here. However, it is still undefined behavior. This error was given its own CVE identifier (CVE-2015-8948).
After the problem has been discovered, the code was changed as follows:
int main (int argc, char *argv[])
{
....
else if (getline (&line, &linelen, stdin) == -1)
{
if (feof (stdin))
break;
error (EXIT_FAILURE, errno, _("input error"));
}
if (line[strlen (line) - 1] == '\n')
line[strlen (line) - 1] = '\0';
....
}
A bit surprised? Well, it happens. A new vulnerability, here is the corresponding CVE: CVE-2016-6262.
PVS-Studio warning: V1010 CWE-20 Unchecked tainted data is used in index: 'strlen(line)'.
After another attempt, the problem was fixed by adding a check of the length of the input string:
if (strlen (line) > 0)
if (line[strlen (line) - 1] == '\n')
line[strlen (line) - 1] = '\0';
Let's take a look at the dates. Commit 'closing' CVE-2015-8948 - 10.08.2015. Commit closing CVE-2016-62-62 - 14.01.2016. So the difference between given fixes is 5 months! Here's the moment when you remind yourself about such an advantage of a static analyzer, as a detection of errors at the early stages of writing code...
I'm not going to give code examples from now on, instead of this - statistics and reasoning. In this section the author's opinion might be much more different than earlier in this article. : )
Note. I recommend checking out another article on the similar topic -"How Can PVS-Studio Help in the Detection of Vulnerabilities?". There are interesting examples of vulnerabilities that look like simple errors. Additionally, in that article, I talked a bit about terminology and the question why static analysis is a must have if you care about security issue.
Let's take a look at statistics about the number of detected vulnerabilities over the past 10 years to assess the situation. I took this data from the CVE Details site.
We've got an interesting situation here. Until 2014, the number of reported CVE hadn't not exceed 6000 units, and since than it hasn't been less. The most interesting thing here is, of course, statistics for the year 2017, which is the absolute leader (14714 units). With regard to the current year 2018, it hasn't ended yet, but already beats records - 15310 units.
Does this mean that all new soft is leaky as a sieve? I don't think so, and here's why:
So the emerging trend cannot be described as purely negative - vendors are more concerned about information security, problem searching tools are improving, undoubtedly, in a positive way.
Does this mean that you we relax and take it easy? I think not. If you are concerned about the security of your applications you should take as many security measures as possible. This is especially true when the source code is publicly available, because it:
I don't want to say that you do not need to open source your projects. Just be mindful of proper quality control measures / security.
Is static analysis an additional measure in this regard? Yes! Static analysis is good at finding potential vulnerabilities that can later become quite real.
It seems to me (admittedly, that wrong) that many consider vulnerabilities as a fairly high-level phenomenon. Well, yes and no. Problems in code that seem to be simple programming errors, may well be serious vulnerabilities. Again, some examples of such vulnerabilities are listed in the article mentioned earlier. We shouldn't underestimate 'simple' errors.
Don't forget that input data can have a zero length, it's necessary to take it into account.
Draw your own conclusions whether all this hype about vulnerabilities is just a fuss or there is a real problem.
On my part, I'll just suggest trying PVS-Studio on your project if you haven't done so already.
All the best!
0