Static code analysis
- Code review
- Static analysis as automated code review
- Why to introduce static analysis into the development process?
- Static Code Analysis Tools
- Static Code Analysis Advantages
- Static Code Analysis Disadvantages
- Static Application Security Testing (SAST)
- How to choose and introduce a static code analyzer?
- Examples of errors detected by static code analysis
- Other resources
Static code analysis is a method of detecting flaws, errors, and potential vulnerabilities in source code without executing the program. It can be viewed as an automated code review process.
Code review
Code review is one of the oldest and most effective methods of defect detection. Developers attentively read the source code and give recommendations for improving it. While reading the code, they detect errors, or code fragments that may become errors in the future. It is generally held that the code author should not explain how different parts of the program work during the review. The reviewer should understand the program's execution algorithm from its text and comments. If the code is incomprehensible, it must be improved.
Code review usually works well, since programmers notice someone else's errors much more easily than their own. You can learn more about the code review method in Steve McConnell's wonderful book "Code Complete".
Static analysis as automated code review
The only major disadvantage of code review is its extremely high cost. You need to regularly gather several programmers to review new code, or to re-review code after the recommended changes have been applied. The programmers also need regular rest: if you try to review large code fragments at once, attention quickly dulls and the benefits of the review fade away.
So, you want to review the code regularly, but it's too expensive. A compromise is to use static code analysis tools. They tirelessly analyze the source code and point the programmer to code fragments that deserve special attention. Of course, this kind of software will never replace a proper code review performed by a team of programmers. However, the benefit-to-cost ratio makes static analysis a handy practice used by many companies.
Why to introduce static analysis into the development process?
By implementing static code analysis, you get:
- Detection of errors and code smells (for example, non-portable or hard-to-read code).
- Detection of potential vulnerabilities.
- Code formatting recommendations. Some static analyzers allow you to check whether the source code complies with the company's coding standard (controlling the number of indents in various constructs, using spaces/tabs, and so on).
- Metrics calculation. A software metric is a measure that lets you get a numerical value of some software property or its specifications.
- Code compliance with certain coding standards (MISRA, CWE, SEI CERT, etc.).
- Continuous code quality control. By collecting statistics, you can find out whether the error density increases or decreases over time. This makes clear which changes in the project development process were beneficial and which were not.
There are other ways to use static code analysis tools. For example, static analysis can be used to train and monitor new employees who are not yet familiar enough with the company's coding standards.
Static Code Analysis Tools
There are a lot of commercial and free static code analyzers. The Wikipedia website provides a large list of static analyzers: List of tools for static code analysis. Such tools support quite a lot of languages (C, C++, C#, Java, Ada, Fortran, Perl, Ruby, etc.).
If you wonder how code analyzers detect errors, check out the following example: PVS-Studio: static code analysis technology.
Static Code Analysis Advantages
Like any other error detection method, static analysis has its strengths and weaknesses. It is important to understand that there is no ideal testing method. Different methods give different results for different types of software. You can achieve high software quality only by combining several methods.
The main advantage of static analysis is that it can significantly reduce the cost of eliminating software defects. The earlier an error is detected, the lower the cost of fixing it. For example, according to McConnell's "Code Complete", fixing an error at the testing stage costs ten times more than at the construction (coding) stage:
Figure 1. The average cost of fixing defects depending on the time they have been made and detected (the data are taken from the book "Code Complete" by S. McConnell).
Static analysis tools allow you to identify numerous errors at the coding stage, which significantly reduces the overall project development cost. For example, the PVS-Studio static code analyzer can run in the background immediately after compilation is done and notify the programmer if a potential error is found (see incremental analysis mode).
There are other advantages of static code analysis:
- Full code coverage. Static analyzers check even those code fragments that are rarely executed. Such fragments are usually hard to test with other methods. This allows you to find defects in exception and error handlers, or in the logging system.
- Static analysis doesn't depend on the compiler used or on the environment where the compiled program will be executed. This allows you to find hidden errors that may reveal themselves only years after they were introduced, for instance, errors caused by undefined behavior. Such errors can show up when you switch to another compiler version or use different optimization switches. Another interesting example of hidden errors is discussed in the article "Overwriting memory - why?".
- You can easily and quickly detect typos and the consequences of using Copy-Paste. Detecting these errors with other methods is usually extremely inefficient and wastes time and effort. It's a pity to spend an hour debugging your code, just to find out that the error is in an expression of the "strcmp(A, A)" kind. People usually don't remember such troubles when discussing typical errors, but practice shows that it takes a lot of time to detect them.
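To illustrate the "strcmp(A, A)" kind of typo mentioned in the last item, here is a minimal C++ sketch. The function names `same_name_buggy` and `same_name_fixed` are hypothetical and exist only for this example. The first version passes the same argument twice, so it always reports equality. Static analyzers typically flag calls with identical arguments of this kind.

```cpp
#include <cstring>

// Hypothetical function with a copy-paste typo: 'a' is passed twice,
// so strcmp always returns 0 and the function always returns true.
bool same_name_buggy(const char* a, const char* b) {
    return std::strcmp(a, a) == 0;  // typo: the second argument should be 'b'
}

// Corrected version: compares the two different strings.
bool same_name_fixed(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}
```

The buggy version compiles and runs without crashing, which is why such errors are easy to miss during debugging and testing.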
Static Code Analysis Disadvantages
- Static analysis is usually bad at finding memory leaks and concurrency errors. To detect such errors, the analyzer actually needs to execute a part of the program virtually. This is extremely difficult to implement, and such algorithms take too much memory and processor time. Static analyzers usually limit themselves to finding simple cases. A more efficient way to detect memory leaks and concurrency errors is to use dynamic analysis tools.
- A static analysis tool warns you about odd fragments, which means that the flagged code may actually be quite correct. These are so-called 'false positive' reports. Only a programmer can tell whether the analyzer points to a real error or whether it is just a false positive. Reviewing false positives takes working time and weakens attention to those code fragments that really do contain errors.
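As an illustration of how an odd-looking fragment can be correct, consider the following sketch. It assumes IEEE 754 floating-point arithmetic and a build without aggressive floating-point optimization flags. The function names are hypothetical. Comparing a variable with itself looks like a typo, yet for floating-point values it is a well-known way to test for NaN, so a warning here may be a false positive.

```cpp
#include <cmath>

// Looks like a typo (a variable compared with itself), but it is intentional:
// under IEEE 754 rules, NaN is the only value that is not equal to itself.
bool is_nan_by_self_compare(double x) {
    return x != x;
}

// The same check written with the standard library, which states the intent
// explicitly and is less likely to be reported as suspicious.
bool is_nan_explicit(double x) {
    return std::isnan(x);
}
```

Rewriting the check as in the second function is one way to remove the suspicious pattern instead of suppressing the warning.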
Errors detected by static analyzers are rather diverse. For example, here is the list of diagnostics implemented in the PVS-Studio tool. Some analyzers focus on a certain area, or certain types of defects. Others support certain coding standards, such as MISRA-C:1998, MISRA-C:2004, Sutter-Alexandrescu Rules, Meyers-Klaus Rules, etc.
Static Application Security Testing (SAST)
Static Application Security Testing (SAST) is a form of static analysis. SAST analyzers focus on identifying potential vulnerabilities in order to protect application code from zero-day vulnerabilities. In other words, the development team must find and fix security defects at the coding stage, so that attackers cannot exploit them later.
The PVS-Studio analyzer is also a SAST solution.
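The kind of defect a SAST tool looks for can be sketched as follows. This is only an illustrative example, not the output or the rule set of any particular analyzer. The names `kNameSize`, `copy_name_unsafe`, and `copy_name_bounded` are hypothetical. The first function copies external input into a fixed-size buffer without checking its length, which is a classic buffer overflow pattern. The second one bounds the copy.

```cpp
#include <cstddef>
#include <cstring>

constexpr std::size_t kNameSize = 16;  // hypothetical fixed buffer size

// Risky: strcpy does not check the destination size, so input longer than
// kNameSize - 1 characters writes past the end of 'dst'.
void copy_name_unsafe(char (&dst)[kNameSize], const char* input) {
    std::strcpy(dst, input);
}

// Safer: copies at most kNameSize - 1 characters and always terminates
// the string. Returns false if the input had to be truncated.
bool copy_name_bounded(char (&dst)[kNameSize], const char* input) {
    const std::size_t len = std::strlen(input);
    const std::size_t n = len < kNameSize - 1 ? len : kNameSize - 1;
    std::memcpy(dst, input, n);
    dst[n] = '\0';
    return len == n;
}
```

Finding the first pattern at the coding stage is much cheaper than dealing with an exploited overflow in a released product.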
How to choose and introduce a static code analyzer?
The static analysis field is actively developing: new tools and new coding standards keep emerging. Static analyzers implement new diagnostic rules, and some rules become obsolete. Different analyzers integrate into different IDEs, CI tools, cloud CI services, and so on.
As a result, there is no way to comprehensively compare static analyzers and choose the "best" one. A web search on static analyzers mostly returns overview articles, which are not enough to make a decision. Therefore, it is reasonable to pick several tools that meet your requirements (language support, integration into CI/CD, IDE plugins, supported coding standards, etc.), try them, and then select the one that suits you best and has found real bugs in your project.
The next step is to introduce the tool into your development process. There are two excellent articles covering this topic:
- How to introduce a static code analyzer in a legacy project and not to discourage the team.
- Introduce Static Analysis in the Process, Don't Just Search for Bugs with It.
Examples of errors detected by static code analysis
- Zero, one, two, Freddy's coming for you.
- The Evil within the Comparison Functions.
- The Last Line Effect.
- Espressif IoT Development Framework: 71 shots in the foot.
- Date processing attracts bugs or 77 defects in Qt 6.
- Examples of errors that PVS-Studio found in LLVM 15.0.
- Errors and suspicious code fragments in .NET 6 sources.
Other resources
- Wikipedia. Static program analysis.
- Coverity. A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World.
- John Carmack. Static Code Analysis.
- Andrey Karpov. C++ tools evolution: static code analyzers.
- Andrey Karpov, Victoria Khanieva. Machine learning in static analysis of program source code.
- Sergey Vasiliev. SAST in Secure SDLC: 3 reasons to integrate it in a DevSecOps pipeline.