How often does your static analyzer struggle to identify the source code nuances? It probably happens more often than you'd like, doesn't it? In the article, our team shares how we've dealt with it. Let us introduce you to our new user annotation mechanism!
This is a good question. It seems like the analyzer should scan the entire source code without any hints. And you're 100% right. We're aligned on this—if the analyzer can handle something by itself, it's better not to burden the user with it. That's why the analyzer already knows about the most frequently used libraries and patterns. Unfortunately, it's not always possible to reliably and automatically spot any facts about the code.
Author's note. Of course, we have relevant tasks in our TODOs: to tackle the halting problem, to analyze code comments on kludges minor architecture solutions, and to read developers' minds. For obvious reasons, progress on these is moving extremely slowly.
In reality, we have to make a few compromises to boost the analysis. The manual annotation is the easiest and most reliable way to "introduce" your code to the analyzer.
For example, to handle one of the most frequent requests on how to correctly handle the nullable classes (various wrapper classes, std::optional, etc.), the analyzer needs to deduce the following facts:
The issue gets worse if:
A lot of logic, heuristics, and the analysis time with a non-guaranteed result. But we can instead just tell the analyzer directly what we want!
Starting with versions 7.31 (for C and C++) and 7.32 (for C#), PVS-Studio has introduced the option to manually annotate functions and classes via separate JSON files.
You could ask why we made the annotation mechanism in separate files and in JSON format when we could have just made them directly in the code. And it's more convenient to keep the code and its annotations together, isn't it? Yes, we agree, but as always there are reasons for this:
Of course, we plan to enhance the annotation mechanism in code, but that's another story. We've dug up many interesting insights while designing this feature, so let us share them with you! Stay tuned for updates :)
Even now, the new custom annotation system enables developers to set the following parameters:
For types:
For functions:
Let's see what the annotation of the nullable-type template looks like. Yes, it's not the simplest case but shows the power of the annotation mechanism.
Let's say we have some code that looks like this:
constexpr struct MyNullopt { /* .... */ } my_nullopt;
template <typename T>
class MyOptional
{
public:
MyOptional();
MyOptional(MyNullopt);
template <typename U>
MyOptional(U &&val);
public:
bool HasValue() const;
T& Value();
const T& Value() const;
private:
/* implementation */
};
Here are some clarifications on the code:
Now we can write the following annotation:
{
"version": 2,
"annotations": [
{
"type": "class",
"name": "MyOptional",
"attributes": [ "nullable" ],
"members": [
{
"type": "ctor",
"attributes": [ "nullable_uninitialized" ]
},
{
"type": "ctor",
"attributes": [ "nullable_uninitialized" ],
"params": [
{
"type": "MyNullopt"
}
]
},
{
"type": "ctor",
"template_params": [ "typename U" ],
"attributes": [ "nullable_initialized" ],
"params": [
{
"type": "U &&val"
}
]
},
{
"type": "function",
"name": "HasValue",
"attributes": [ "nullable_checker", "pure", "nodiscard" ]
},
{
"type": "function",
"name": "Value",
"attributes": [ "nullable_getter", "nodiscard" ]
}
]
}
]
}
Now the analyzer can warn us about the use of the dangerous nullable type:
We acknowledge that annotations in separate files cause issues. So, we also suggest a few enhancements to mitigate them.
To help you understand how to use the user annotation mechanism, we've prepared a list of examples for the most common scenarios—the markup of the format output functions (C++), marking functions as dangerous/deprecated (C++), etc. You can learn more in the documentation section.
We support the versioning of the JSON schemas for each available language. These schemas let modern text editors and IDEs validate, suggest possible values, and show hints while editing.
When you write your own annotation file, add the $schema field to it and set value to schema for the required language. For example, the value for the C++ analyzer looks like this:
{
"version": 2,
"$schema": "https://files.pvs-studio.com/media/custom_annotations/v2/cpp-annotations.schema.json",
"annotations": [
{ .... }
]
}
This enables Visual Studio Code to provide hints when creating annotations:
You can find the up-to-date list of languages available for annotation and their schemas in the documentation.
Not all issues can be detected when validating the JSON schemas. So, we created a special diagnostic rule, V019, that helps if something is wrong. This may be missing annotation files, parsing and annotation errors, etc.
We've designed the annotation mechanism, so that you can write as little as possible. For example, we've simplified the selection of function overloads in C++.
If you want the annotation to apply to all functions with this name, you can omit the params field when describing the function.
// Code
void foo(); // dangerous
void foo(int); // dangerous
void foo(float); // dangerous
// Annotation
{
....
"type": "function",
"name": "foo",
"attributes": [ "dangerous" ]
....
}
If you need to annotate the parameter-free function, just explicitly specify the following:
// Code
void foo(); // dangerous
void foo(int); // ok
void foo(float); // ok
// Annotation
{
....
"type": "function",
"name": "foo",
"attributes": [ "dangerous" ],
"params": []
....
}
In addition, you can also use wildcard characters. For instance, they help you omit any function parameters if they aren't relevant to the annotation. The following wildcard characters are available now:
Take a look how you can apply this, for example, to mark up the formatted I/O functions. Let's say you have a set of functions for outputting text:
namespace Foo
{
void LogAtExit(const char *fmt, ...);
void LogAtExit(const char8_t *fmt, ...);
void LogAtExit(const wchar_t *fmt, ...);
void LogAtExit(const char16_t *fmt, ...);
void LogAtExit(const char32_t *fmt, ...);
}
No need to annotate each function in this case. Just write one and replace the changing type of the first parameter with a wildcard character:
{
"version": 2,
"annotations": [
{
"type": "function",
"name": "Foo::LogAtExit",
"attributes": [ "noreturn" ],
"params": [
{
"type": "?",
"attributes" : [ "format_arg", "not_null", "immutable" ]
},
{
"type": "...",
"attributes": [ "immutable" ]
}
]
}
]
}
We're sure that our new user annotation mechanism simplifies developers' workflows and boosts the accuracy of the static code analysis! We'd love for you to experience it firsthand and see how it works. To get started, just download the analyzer using the link. A picture is worth a thousand words, as they say :)