Three Interviews About Static Code Analyzers

Interviewees and article structure
Interview with Acronis
Interview with AlternativaPlatform
Interview with Echelon
Conclusion

Hello, dear readers!

The author invites you to read three interviews with representatives of three large, modern and interesting projects to learn about their software development methodologies and about how they use static code analyzers in particular. The author hopes that you will find this article interesting. The following companies took part as interviewees: Acronis, AlternativaPlatform, Echelon Company.

Sincerely yours, Aleksandr Timofeev

Interviewees and article structure

The author approached three companies for interviews:

— Acronis, developer of the Acronis Backup product designed for data backup and subsequent recovery

— AlternativaPlatform, developer of the "Tanki Online" project, a multiplayer browser game

— Echelon Company, developer of a series of products for code revision in the field of information security

All the companies were asked the same questions save Echelon – the questions were changed a bit for this company to better reflect the specifics of their work.

Interview with Acronis

The interviewee is Kirill Korotaev, Acronis Backup product development vice-president

Please give us an overview of your company's or project's primary, largest-scale product (its main purpose, the language its code is written in, the size of the team working on it, the usual pace of commits, for example in lines of code or kilobytes per day, week or month, and the VCS you use).

The main purpose of the Acronis Backup product we develop is to create backup copies of users' data on their computers, laptops and servers so that they can use these copies to recover the data later. Recovery may be needed when the computer starts malfunctioning, for example, or when one needs an earlier version of some file or document, or when a file has been lost.

99% of our entire project is written in C++. There are about 70 developers working on it. On average, we make 100 to 300 commits per week. We use SVN (Subversion).

Who analyzes the project code, and how? How is the testing cycle organized? Is the tester team large? How does the company respond to error reports? Do you have an established protocol for handling such situations?

We have architects and leads who are very familiar with the code of the project parts they are responsible for; therefore, they analyze this code and know how to improve it. Every commit passes through the code review system, that is, any change is first analyzed by the programmers responsible for the corresponding code fragment.

Presently, the number of our testers is comparable to the number of developers. We employ both automatic and manual tests. For example, we have build validation tests, i.e. a set of tests to verify every new build. Ideally, a new build should be compiled after every commit into the code and tested immediately.

The process of addressing a detected issue is as follows. Any issue found by the testing department is registered in Jira (a more advanced paid counterpart of Bugzilla). All of that is integrated with SVN: when, for example, a commit is made that addresses a particular issue, we add a reference to this commit in Jira. We may also learn about an issue from our users. They first contact our technical support service, and if they report bugs that need to be analyzed, the information again goes to Jira first, and we release bug fixes in the next few updates.

Do you use static code analysis tools? If so, which ones? Could you please give an example of the most remarkable and interesting issue found by analyzers? What results and statistics do you usually get when using analyzers? How often do you run checks, and according to what scheme? How do you respond to an issue found by an analyzer?

We have used, and still use, various analyzers, for example the free open-source Cppcheck and PVS-Studio. Of course, code analyzers should be used in any project. But they are all very different, each being good at catching a certain type of bug, which is why I'm all for employing a wide variety of development tools.

We do find interesting potential bugs every now and then. For example, one of the hardest bugs to find was one detected by PVS-Studio, where standard auto pointers from the STL were used incorrectly. Here is another interesting error: when you multiply the sizeof of one structure or parameter by another sizeof, PVS-Studio reasonably notes that this is pretty strange, to put it mildly, because the operation logically yields a quadratic quantity.
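To make the sizeof case concrete, here is a minimal sketch of a pattern-based check written in Python. It is only an illustration of the idea and not how PVS-Studio or any other analyzer is implemented: it applies a regular expression to raw text instead of parsing C++, and the names `SIZEOF_PRODUCT`, `find_sizeof_products` and the sample code are all hypothetical.

```python
import re

# Toy pattern: two sizeof expressions joined directly by '*'.
# This is a regex over raw text, not a real C++ parser, so it misses
# nested parentheses and would also fire inside comments or strings.
SIZEOF_PRODUCT = re.compile(
    r"sizeof\s*\([^()]*\)\s*\*\s*sizeof\s*\([^()]*\)"
)


def find_sizeof_products(source):
    """Return (line number, stripped line) for each suspicious line."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if SIZEOF_PRODUCT.search(line):
            hits.append((number, line.strip()))
    return hits


# Hypothetical C++ fragment: the first line is a normal size computation,
# the second multiplies bytes by bytes.
SAMPLE = """\
size_t bytes = count * sizeof(Header);
size_t wrong = sizeof(Header) * sizeof(Record);
"""

print(find_sizeof_products(SAMPLE))
```

Only the second line of the sample is reported. A real analyzer works on the parsed program, which is what lets it handle macros, nested expressions and comments correctly.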

Sometimes static analyzers can figure out that a pointer is not checked for null before being used. But these are more complex checks, as it is not always obvious whether a pointer can be null in a given code fragment. It is good practice to run static analyzers over the code once a day. We also have the bugs they find recorded automatically in that same Jira system, which is very useful for the product under development.

What is your opinion on future methodologies of large-scale software development? And, as a separate question, what do you expect and what would you like to get from static code analysis tools in the future?

Automated tools are developing and will continue to do so. For example, there is not a single automated system today that can pick tests based on the modifications made to the code, that is, select only those tests that need to be run for a particular modification.
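The idea of change-based test selection can be sketched in a few lines of Python. This is an illustration under strong assumptions, not a description of any existing system: it presumes that a map from each test to the source files it exercises is already available, and the map, the file names and the function `select_tests` are hypothetical. Obtaining and maintaining an accurate map is the hard part, and the sketch does not address it.

```python
# Hypothetical coverage map: which source files each test exercises.
COVERAGE = {
    "test_backup": {"backup.cpp", "archive.cpp"},
    "test_restore": {"restore.cpp", "archive.cpp"},
    "test_ui": {"ui.cpp"},
}


def select_tests(changed_files, coverage):
    """Return the tests whose covered files intersect the changed files."""
    changed = set(changed_files)
    return sorted(
        test for test, files in coverage.items() if files & changed
    )


# A commit touching archive.cpp needs the backup and restore tests,
# but not the UI test.
print(select_tests(["archive.cpp"], COVERAGE))
```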

As far as the future of static analyzers is concerned, I think they will gradually learn to handle more and more issues. At the same time, they will shift towards more complex analysis and perhaps even become a guarantee of the code's compliance with some protocol, for instance.

A few words for your colleagues and readers?

Write high-quality code, test it and don't forget to use a wide variety of methodologies – including static analyzers.

Interview with AlternativaPlatform

The interviewee is Aleksey Kviring, CTO of "Tanki Online" LLC

Please give us an overview of your company's or project's primary, largest-scale product (its main purpose, the language its code is written in, the size of the team working on it, the usual pace of commits, for example in lines of code or kilobytes per day, week or month, and the VCS you use).

Currently we have only one such product, the Tanki Online game. The server part is written in Java, the client part in AS3. We have about 20 programmers. We add approximately 5K lines of code per week. We use Git as our VCS.

Who analyzes the project code, and how? How is the testing cycle organized? Is the tester team large? How does the company respond to error reports? Do you have an established protocol for handling such situations?

We use an approach typical of Git. All code goes through a mandatory code review. We also use continuous integration, and the build server regularly checks the code and runs tests on it.

Testing is done in a number of stages: first automatic testing, then manual testing by the developers themselves (by playing the game), then testing by the tester team. If everything is all right, community testers join the process. Only after that do the changes get into production. Our tester team is small, only three people. But we make intensive use of community testers: there are a few dozen volunteers.

If some bug still gets into production somehow, it is fixed right after we detect it. Usually all such errors are fixed in a couple of days.

Do you use static code analysis tools? If so, which ones? Could you please give an example of the most remarkable and interesting issue found by analyzers? What results and statistics do you usually get when using analyzers? How often do you run checks, and according to what scheme? How do you respond to an issue found by an analyzer?

We don't use such tools at the company level. In the past, I ran a couple of static analyzers out of curiosity (the JetBrains IDEA checker), but they found nothing serious.

I think static analysis is very useful for complex languages such as C and C++. But for simpler languages like Java, it's not that relevant. Java is not subject to memory-related issues as a class. Its syntax is plain and clear, no alternative interpretations are allowed, many issues are caught by the compiler at the compilation stage. Development environments provide convenient refactoring tools, which excludes accidental errors resulting from manual code modifications.

There is one area where I would use static analysis when working with Java: checking a program for correct multithreaded execution. But there are simply no tools capable of that at present. Generally speaking, if a static analyzer is of good quality and can find real bugs, it will be useful for one's project.

What is your opinion on future methodologies of large-scale software development? And, as a separate question, what do you expect and what would you like to get from static code analysis tools in the future?

The future belongs to automated testing systems, continuous integration systems, and code analyzers. What I expect from static analysis is the ability to analyze multithreaded applications and architectural solutions.

A few words for your colleagues and readers?

Don't be afraid of incorporating new technologies into your development cycle. Learn from more experienced fellow programmers. Revise your old solutions. And then you certainly will succeed.

Interview with Echelon

The interviewee is Andrey Fadin (a.fadin@cnpo.ru), chief designer of Echelon Company

Please give us an overview of your company and its business related to software security.

Echelon Company is both a developer of information security analysis tools and an active user of these products in the certification of information security tools and in commercial code audit projects.

The information security analysis tools developed by our company include the following:

AK-VS2, a cloud environment for certification testing of source code for compliance with the requirements on control of the absence of undocumented capabilities (up to and including Level 1);
AppChecker, a product that performs signature-based and heuristic analysis of program code aimed at detecting backdoors, critical software vulnerabilities, and other issues related to code defects;
PIK, a tool for recording and comparing checksums of files, folders and physical digital media;
Skaner-VS, a toolkit and environment for network and local security audits, including security scanners, traffic analysis tools, tools for finding residual information on physical media, and a few other components.
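The general idea of recording and comparing checksums, as mentioned for PIK above, can be sketched as follows. This is a minimal illustration and says nothing about how PIK itself works: the choice of SHA-256, the function names and the demo files are all assumptions made for the example.

```python
import hashlib
import os
import tempfile


def file_checksum(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root):
    """Map each file path (relative to root) to its checksum."""
    result = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(folder, name)
            result[os.path.relpath(full, root)] = file_checksum(full)
    return result


def changed_paths(before, after):
    """Paths that were added, removed, or whose checksum differs."""
    return sorted(p for p in set(before) | set(after)
                  if before.get(p) != after.get(p))


# Demo on a throwaway directory: record a baseline, modify one file, compare.
with tempfile.TemporaryDirectory() as root:
    for name, text in (("a.txt", "one"), ("b.txt", "two")):
        with open(os.path.join(root, name), "w") as handle:
            handle.write(text)
    baseline = snapshot(root)
    with open(os.path.join(root, "b.txt"), "w") as handle:
        handle.write("changed")
    demo_changes = changed_paths(baseline, snapshot(root))

print(demo_changes)
```

Only the modified file is reported, since the checksum of the untouched file matches the baseline.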

The Echelon team managing code security analysis and penetration testing is an association of highly skilled IT and information security specialists established on the personnel, research, and engineering bases of Echelon Company and the leading technical university of Russia, Bauman Moscow State Technical University.

We work with most of the popular programming languages such as PHP, Java, C#, C/C++, Perl, Python, JavaScript, including their most recent standards.

Program code audit conducted by Echelon Company specialists allows us to solve the following tasks:

control of in-house and outsourced code's quality, detection of typical defects (coding or designing errors);
detection of backdoors intentionally planted in code;
borrowed code control (analysis of software's external dependencies on open-source and other external components)

Software that has successfully passed the audit can be certified according to information security requirements in Echelon's test laboratory.

Please give us an overview of your experts' work (as far as it does not involve classified information): Who analyzes project code, and how? How is the testing cycle organized? What is the regular protocol for addressing an important issue found in code?

The code audit team is formed from specialists of two basic types:

Specialists of the first type are Echelon test laboratory's experts experienced in establishing cooperation with developers of large-scale software projects (operating systems, firewalls) and also in team review of large amounts of code.

Specialists of the second type are developers (personnel of Echelon's Research & Development departments) who are highly qualified in various programming languages, their frameworks and typical libraries. Whenever possible, we try to involve the developers of static analysis tools themselves in code audits, which allows them to appreciate the convenience of our analysis tools directly from their own experience. Besides, since developers are more skilled at implementing new signatures for static analyzers, it makes sense to employ them for timely updates of the defect base when the specifics of a software project under test require it.

Speaking generally, the process of software development and testing is made up of the following stages:

1. Decomposing project code into components (when analyzing a third-party project).
2. Building a threat model, analyzing these components and their interaction interfaces for severe information security issues.
3. Running static and dynamic analysis tools taking into account the results of Stage 2.
4. Selective code review based on the results of Stages 3 and 2.
5. Preparing a report of potentially dangerous constructs we have detected and discussing the results with the project's developer team.

Stages 3, 4 and 5 are usually repeated 3-4 times because, depending on the analysis results for each potentially dangerous construct, either the software project is revised to eliminate the defect (which is followed by repetition of stages starting with Stage 3) or the issue is found to be an expert's incorrect assumption or a false positive from a static analyzer (which is followed by repetition of stages starting with Stage 4).

A few words about static analysis tools you use: What tools do you use? Could you give an example of the most remarkable and interesting error found by analyzers? What results and statistics do you usually get when using analyzers? How do you respond to an issue found by an analyzer?

In their work, auditors use both our own solutions (AK-VS2, AppChecker) and open-source tools (Cppcheck, PMD) as well as purchased third-party commercial tools.

The procedure for addressing issues was described in the previous answer. As for the statistics of using analyzers, the ratio of false positives in large projects is usually above 50%, so in any case we have to employ an expert to compile the final list of potentially dangerous constructs found in project code. However, since the expert does not review the entire code but only a few critical parts of it, which on average make up no more than 5% of the total code size, we save a considerable amount of time on code analysis.

To avoid breaching any non-disclosure agreements, we unfortunately cannot tell you about errors found in particular products. But as our experience shows, most of the interesting errors are related to:

use of hard-coded passwords (Use of Hard-coded Password, CWE-259) and other authentication data (Use of Hard-coded Credentials, CWE-798);
"easter eggs" and other hidden functionality (Hidden Functionality, CWE-912);
rather common errors related to race conditions and shared resources (Race Condition, CWE-362).
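As an illustration of the first category in the list above, here is a minimal sketch of a signature-style check for hard-coded passwords (CWE-259). It is a toy example and not how AppChecker or any other product mentioned here works: the regular expression, the function `find_hardcoded_secrets` and the sample code are hypothetical, and a text-level pattern like this produces both false positives and false negatives.

```python
import re

# Toy signature: a password-like identifier assigned a non-empty string
# literal. Real analyzers work on parsed code and data flow, not raw text.
HARDCODED_SECRET = re.compile(
    r"""
    \b\w*(?:password|passwd|pwd)\w*   # identifier that looks like a password
    \s*=\s*                           # simple assignment
    (["'])[^"']+\1                    # non-empty string literal
    """,
    re.IGNORECASE | re.VERBOSE,
)


def find_hardcoded_secrets(source):
    """Return the 1-based numbers of lines that match the signature."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if HARDCODED_SECRET.search(line)
    ]


# Hypothetical code under review: only the first line embeds a password.
SAMPLE = '''\
String dbPassword = "s3cret!";
String password = readFromVault();
char *pwd = "";
'''

print(find_hardcoded_secrets(SAMPLE))
```

The second line is not reported because the value comes from a function call, and the third because the literal is empty. Every hit from a check like this would still need expert review.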

What is your opinion on future software development methodologies and, as a separate question, what do you expect and what would you like to get from static code analysis tools in the future?

In our opinion, software verification will be getting more tightly connected with development processes, both within the framework of continuous integration systems and continuous delivery systems.

In the future, tight integration with these systems will allow developers to fully control software development and delivery; that is, static analyzers will serve as a kind of IPS within these processes, blocking code that fails to pass the quality gate at the level of commits and releases. From this viewpoint, any CI/CD system is also an interesting source of events for SIEM systems.
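A quality gate of the kind described here could, in its simplest form, look like the sketch below. Everything in it is an assumption made for illustration: the `Finding` structure, the three-level severity scale, and the rule that only confirmed high-severity findings block a commit are not taken from any particular CI/CD system or analyzer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    rule: str                      # e.g. a CWE identifier
    severity: str                  # assumed scale: "low", "medium", "high"
    false_positive: bool = False   # set by an expert during triage


def passes_quality_gate(findings, blocking=("high",)):
    """Return True when no confirmed finding has a blocking severity."""
    return not any(
        f.severity in blocking and not f.false_positive for f in findings
    )


# Hypothetical analyzer output for one commit.
commit_findings = [
    Finding("CWE-259", "high"),
    Finding("CWE-362", "medium"),
]

# The gate rejects this commit because of the confirmed high-severity finding.
print(passes_quality_gate(commit_findings))
```

Marking a finding as a false positive lets it through the gate, which reflects the expert triage step described earlier in this interview.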

Rich prospects are also opened up by the introduction of static analyzers into the model-driven development paradigm; tight integration with CASE tools will allow developers to reveal errors at the levels of syntax, software components and their interfaces, and even at the level of business requirements, so that an analyst, for instance, could justify to customers already at the system design stage why adding a certain access control role is necessary.

A few words for your colleagues and readers?

Dear colleagues, during the last decade, emphasis in the field of enterprise information security was put on network security as well as endpoint security.

However, when dealing with such tasks as detection of zero-day exploits, backdoors and "implants" (code fragments and configurations planted into software for the purposes of state and industrial espionage), we find that classic network- and node-level information security tools (intrusion detection systems, antivirus software) cannot handle such threats efficiently.

To solve these issues, we need a comprehensive approach that on the one hand implies centralization of enterprise information security management (SIEM systems), and on the other makes use of structural decomposition of software into components, with control over their origin, as well as static analysis of their contents and of the materials used to produce them (including source code).

Conclusion

The author thanks the press services and experts of the companies that took part in these interviews for their prompt and detailed answers to the interview questions. The author is also thankful to OOO "Program Verification Systems", developer of the PVS-Studio static code analyzer and sponsor of this article. Without their support, it could hardly have been published at all.