Myths about static analysis. The fourth myth - programmers want to add their own rules into a static analyzer

Nov 04 2011

Author: Andrey Karpov

While communicating with people on forums, I noticed there are a few lasting misconceptions concerning the static analysis methodology. I decided to write a series of brief articles where I want to show you the real state of things.

The fourth myth is: "A static analyzer must enable users to add user-made rules. Programmers want to add their own rules."

No, they don't. They actually want to solve some tasks of searching for particular language constructs. It is not the same thing as creating diagnostic rules.

I have always answered that implementation of own rules is not the thing programmers actually want. And I never saw any other alternative than implementing diagnostic rules by the analyzer's developers at the request of programmers (an article on the subject). I have had a fruitful conversation with Dmitry Petunin recently. He is the director of an Intel department of compiler testing and software verification tool development. He enlarged my understanding of this subject and voiced the idea I had been pondering over but failed to give the final formulation of.

Dmitry confirmed my belief that programmers wouldn't write diagnostic rules. The reason is very simple - it is very hard. Some static analysis tools enable users to extend the rule set. But it is done rather as a pure formality or for convenience of the tool's developers themselves. You need to know the subject very deeply to be able to develop new diagnostic rules. If an enthusiast without skill starts creating them, they will be of little use.

My understanding of the issue was over at this point. Dmitry, being more skilled than I, helped me learn more. In brief, this is how the situation looks.

Indeed, programmers want to be able to search for some particular patterns/errors in their code. They really need it. For example, someone needs to find all the explicit conversions of the int type to float. This task cannot be solved by such tools as grep, since it is unknown what type the FOO() function will return in a "float(P->FOO())" -like construct. At this moment the programmer comes to the idea that he/she can implement search of such constructs by adding his/her own check into the static analyzer.

This is where the key point lies. The programmer does not need to create his/her analysis rules. He/she needs to solve a particular issue. What he/she wants is a very small task from the viewpoint of static analysis mechanisms. It is like using a car to light cigarettes with its cigarette lighter.

That's why both Dmitry and I don't support the idea of providing users with API to handle the analyzer. It is an extremely difficult task from the viewpoint of development. Besides, people will hardly use more than 1% of it. So, it's irrational. It is easier and cheaper for a developer to implement users' wishes than create a complex API for add-ons or a special language of rule description.

The readers may say: "then make only 1% of the functionality in API available, and everyone will be happy". Yes, right. But look where the emphasis has moved: from developing own rules we have come to the idea that we just need a tool similar to grep but possessing some additional information about program code.

There is no such a tool yet. If you want to solve some task, write to me, and we will try to implement it in the PVS-Studio analyzer. For example, we have recently implemented several requests on searching for explicit type conversions: V2003, V2004, V2005. It is much easier for us to implement such wishes than create and maintain an open interface. It's also easier for users themselves.

By the way, such a tool might appear some time later within the scope of Intel C++. Dmitry Petunin said they had discussed a probability of creating a grep-like tool possessing knowledge about code structure and variable types. But it was discussed just in theory. I don't know whether or not they really intend to create this tool.

#StaticAnalysis