Annotating Java entities in JSON format
- Structure of the annotation file
- Annotations for taint analysis
- Method annotations
- JSON Schema
- Examples
How to enable annotation files
You can learn more about how to enable the annotation file in this documentation.
Structure of the annotation file
The file content is a JSON object consisting of three mandatory fields: "language", "version", and "annotations".
The "language" field must have the "java" value. The "version" field takes an integer-type value and specifies the mechanism version. Depending on the value, the markup file can be processed differently. Currently, only the value 1 is supported.
The "annotations" field is an array of "annotation" objects:
{
  "language": "java",
  "version": 1,
  "annotations":
  [
    {
      ...
    },
    {
      ...
    }
  ]
}
For now, method annotations and constructor annotations are available.
Annotations for taint analysis
The analyzer provides a range of annotations for taint analysis, which can be used to mark taint sources and sinks. You can also mark up methods and constructors that validate tainted data: if tainted data has been validated, the analyzer does not issue a warning when the data reaches the sink.
A different diagnostic rule is used to handle each type of vulnerability. At the moment, the analyzer provides the following diagnostic rules for identifying tainted data:
- V5309 — SQL injection;
- V5310 — OS command injection;
- V5311 — OS argument injection;
- V5312 — XPath injection;
- V5319 — Log injection;
- V5320 — Configuration injection;
- V5321 — LDAP injection;
- V5322 — Reflection injection;
- V5323 — Potentially tainted data is used to define CORS policy;
- V5327 — Regex injection;
- V5330 — XSS injection;
- V5332 — Path traversal vulnerability.
How taint annotations work
Each diagnostic rule has special annotations to mark taint sinks and methods/constructors that validate tainted data.
Tainted data sources are common to all diagnostic rules, and they can be annotated as well.
Note. The attributes for taint annotations are detailed in the following documentation sections.
Note that, in addition to user annotations, the analyzer already provides a range of taint annotations for various libraries. For example, passing the result of the Console.readLine method to the Statement.execute method can potentially lead to SQL injection. The analyzer provides annotations that mark Console.readLine as a source of tainted data and Statement.execute as a sink where this tainted data could lead to SQL injection.
Thus, if we annotate a new source of tainted data, the analyzer tracks its data to the existing sinks. Conversely, if we annotate a new sink, the analyzer issues a warning when data from previously marked sources, such as Console.readLine, reaches it.
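As a minimal sketch of the second case, the annotation below marks a method as an SQL injection sink, so that values coming from built-in sources such as Console.readLine would be reported when they reach it. The package org.example.db, the class QueryRunner, and the method run are hypothetical names chosen for illustration; the field and attribute names are the ones described in the "Method annotations" section.

```json
{
  "language": "java",
  "version": 1,
  "annotations": [
    {
      "type": "method",
      "package": "org.example.db",
      "type_name": "QueryRunner",
      "method_name": "run",
      "attributes": [ "sql_injection_sink" ]
    }
  ]
}
```

Because the "params" field is absent here, the annotation is expected to apply to every overload of run.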
Method annotations
Note. The method annotation object must contain at least one optional field.
The method annotation object consists of the following fields:
The "type" field
The mandatory field. It takes a string with the "method" value.
The "package" field
The mandatory field. It takes a string specifying the name of the package that contains the method.
The "type_name" field
The mandatory field. It takes a string that specifies the name of the class where the method is defined.
The "method_name" field
The mandatory field. It takes a string with the name of the method.
Note. Use <init> as the method name to annotate a constructor.
The "attributes" field
The optional field. The string array that specifies the properties of an entity.
Possible method attributes
# | Attribute name | Attribute description |
---|---|---|
1 | sql_injection_sink | Passing tainted data to this method's parameters results in SQL Injection |
2 | os_command_injection_sink | Passing tainted data to this method's parameters results in OS Command Injection |
3 | xpath_injection_sink | Passing tainted data to this method's parameters results in XPath Injection |
4 | configuration_injection_sink | Passing tainted data to this method's parameters results in Configuration Injection |
5 | ldap_injection_sink | Passing tainted data to this method's parameters results in LDAP Injection |
6 | reflection_injection_sink | Passing tainted data to this method's parameters results in Reflection Injection |
7 | regex_sink | Passing tainted data to this method's parameters results in Regex Injection |
8 | xss_injection_sink | Passing tainted data to this method's parameters results in XSS Injection |
9 | path_traversal_sink | Passing tainted data to this method's parameters results in a Path Traversal Vulnerability |
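To illustrate the <init> convention together with a method attribute, here is a sketch that marks a constructor as a path traversal sink. The class com.example.io.ReportFile is an assumption made for this example and does not come from any real library.

```json
{
  "language": "java",
  "version": 1,
  "annotations": [
    {
      "type": "method",
      "package": "com.example.io",
      "type_name": "ReportFile",
      "method_name": "<init>",
      "attributes": [ "path_traversal_sink" ]
    }
  ]
}
```

The object keeps "type" set to "method" even though it describes a constructor; only the "method_name" value changes.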
The "params" field
The optional field.
Parameter annotations are used to select the right method overload for the markup. If this field is absent, the annotation applies to a method with zero or more parameters of any type. An empty parameter array ([ ]) means the method has no parameters.
The parameter annotation object consists of the following fields:
The "package" field
The mandatory field. It takes a string specifying the name of the package that contains the parameter type.
The "type_name" field
The mandatory field. It takes a string specifying the class name of the parameter type.
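The difference between an absent "params" field and an empty array can be sketched as follows. The class com.example.Session and its readToken method are hypothetical. The annotation selects only the overload without parameters and uses the "returns" field described in the next section to mark its result as a web source.

```json
{
  "language": "java",
  "version": 1,
  "annotations": [
    {
      "type": "method",
      "package": "com.example",
      "type_name": "Session",
      "method_name": "readToken",
      "params": [],
      "returns": {
        "attributes": [ "web_source" ]
      }
    }
  ]
}
```

Under the rules above, an overload such as readToken(String) would not match this annotation, because the empty array requires a method with no parameters.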
The "returns" field
The optional field. The return value object consists of a single field:
The "attributes" field
The optional field. The array of strings that specifies the properties of the method return value.
Possible return value attributes
# | Attribute name | Attribute description |
---|---|---|
1 | common_source | The return value is a source of tainted data |
2 | web_source | The return value is a web source of tainted data |
3 | potential_sql_sanitization | The return value is sanitized from SQL Injection |
4 | potential_os_command_sanitization | The return value is sanitized from OS Command Injection |
5 | potential_xpath_sanitization | The return value is sanitized from XPath Injection |
6 | potential_log_sanitizaton | The return value is sanitized from Log Injection |
7 | potential_configuration_sanitization | The return value is sanitized from Configuration Injection |
8 | potential_ldap_sanitization | The return value is sanitized from LDAP Injection |
9 | potential_reflection_sanitization | The return value is sanitized from Reflection Injection |
10 | regex_sanitization | The return value is sanitized from Regex Injection |
11 | xss_input_sanitization | The return value is sanitized from XSS Injection |
12 | potential_path_traversal_sanitization | The return value is sanitized from Path Traversal Vulnerability |
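A validating method can be described in the same way. The sketch below assumes a hypothetical method com.example.security.SqlEscaper.escape(String) whose result is considered safe for SQL queries; only the field names and the attribute name come from this documentation.

```json
{
  "language": "java",
  "version": 1,
  "annotations": [
    {
      "type": "method",
      "package": "com.example.security",
      "type_name": "SqlEscaper",
      "method_name": "escape",
      "params": [
        {
          "package": "java.lang",
          "type_name": "String"
        }
      ],
      "returns": {
        "attributes": [ "potential_sql_sanitization" ]
      }
    }
  ]
}
```

Under this assumption, a value that passes through escape before reaching an SQL injection sink should no longer trigger V5309.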
JSON Schema
JSON Schema is included in the distribution kit or can be accessed via the link.
Examples
Code annotation
Look at the following method:
package com.example;

public class MyClass {
    public String getUserInput() {
        ....
    }
}
Assume this method returns user input that may include tainted data. To inform the analyzer of this, we can use the following annotation:
{
  "version": 1,
  "language": "java",
  "annotations": [
    {
      "type": "method",
      "package": "com.example",
      "type_name": "MyClass",
      "method_name": "getUserInput",
      "returns": {
        "attributes": [ "common_source" ]
      }
    }
  ]
}
Annotation for a method/constructor where the parameter type is disregarded
Here are two overloads of the getUserInput method:
package com.example;

public class MyClass {
    public String getUserInput(String str) {
        ....
    }

    public String getUserInput(int index) {
        ....
    }
}
Assume both overloads return user input that may include tainted data, regardless of the parameter type. To inform the analyzer of this, we can use the following annotation:
{
  "version": 1,
  "language": "java",
  "annotations": [
    {
      "type": "method",
      "package": "com.example",
      "type_name": "MyClass",
      "method_name": "getUserInput",
      "returns": {
        "attributes": [ "common_source" ]
      }
    }
  ]
}
In this case, the "params" field is omitted. The parameter types are not important when selecting the method, so the annotation applies to both overloads.
Annotation for a method or constructor with certain parameters
Let's look at methods with different parameters:
package org.example;

public class Sink {
    public void sink(String input) {
        ....
    }

    public void sink(String input, String sanitization) {
        ....
    }
}
If we consider only the first version of the method to be a sink, its annotation looks as follows:
{
  "language": "java",
  "version": 1,
  "annotations": [
    {
      "type": "method",
      "package": "org.example",
      "type_name": "Sink",
      "method_name": "sink",
      "params": [
        {
          "package": "java.lang",
          "type_name": "String"
        }
      ],
      "attributes": [
        "sql_injection_sink"
      ]
    }
  ]
}
The analyzer warns about only one method call in the following code:
package org.example;

public class Main {
    public static void main(String[] args) {
        var sinkObj = new Sink();
        sinkObj.sink(args[0]);              // V5309
        sinkObj.sink(args[0], "sanitized"); // no warning is issued
    }
}