Webinar: Parsing C++ - 10.10
In the seventh version of the PVS-Studio static analyzer, we added support of the Java language. It's time for a brief story of how we've started making support of the Java language, how far we've come, and what is in our further plans. Of course, this article will list first analyzer trials on open source projects.
Here is a brief description of PVS-Studio for Java developers who have not previously heard of it.
This tool is designed to detect errors and potential vulnerabilities in the source code of programs, written in C, C++, C#, and Java. It works in Windows, Linux, and macOS environment.
PVS-Studio performs static code analysis and generates a report that helps a developer find and eliminate defects. For those who are interested in how exactly PVS-Studio searches for errors, I suggest having a look at the article "Technologies used in the PVS-Studio code analyzer for finding bugs and potential vulnerabilities".
I could have come up with a clever story of how we've been speculating about what next language to support in PVS-Studio. About a sensible choice of Java, which is based on a high popularity of this language, and so on.
However, as it happens in life, the choice was made not by deep analysis, but by an experiment :). Yes, we pondered the direction of the PVS-Studio analyzer further development. We considered such languages, as: Java, PHP, Python, JavaScript, IBM RPG. We were even inclined to the Java language but the final choice wasn't made. For those whose glance rested on unfamiliar IBM RPG, I'd like to direct you to this note, from which everything will become clear.
At the end of 2017, my colleague Egor Bredikhin reviewed off-the-shelf libraries of parsing code (in other words - parsers) for new development directions, interesting for us. Eventually, he came across several projects for parsing Java code. He managed to quickly make an analyzer prototype with a couple of diagnostics based on Spoon. Moreover, it has become clear that we would be able to use in the Java analyzer some mechanisms of the C++ analyzer using SWIG. We looked at what we'd got and realized that our next analyzer would be for Java.
We'd like to thank Egor for his undertaking and hard work he has done over the Java analyzer. The development process itself was described by him in the article "Development of a new static analyzer: PVS-Studio Java".https://www.viva64.com/en/b/0572/
There are many free and commercial static code analyzers for Java across the globe. There is no point to list them all in the article. I'll just leave the link to the "List of tools for static code analysis" (see the Java and Multi-language section).
However, I know that first and foremost we'll be asked about IntelliJ IDEA, FindBugs, and SonarQube (SonarJava).
IntelliJ IDEA
A very powerful static code analyzer is built in IntelliJ IDEA. What is more, the analyzer is evolving, its authors closely follow our activities. So, IntelliJ IDEA is a tough cookie for us. We'll not be able to surpass IntelliJ IDEA in diagnostic abilities, at least for now. Therefore, we will concentrate on our other advantages.
Static analysis in IntelliJ IDEA is primarily one of the features of the environment, which imposes certain limitations on it. As for us, we have freedom in what we can do with our analyzer. For example, we can quickly adapt it to specific customer needs. Fast and deep support is our competitive advantage. Our clients directly communicate with developers, working on one or another part of PVS-Studio.
In PVS-Studio, there are many opportunities to integrate it into a cycle of developing large old projects. For example, it's our integration with SonarQube. It also includes mass suppression of analyzer warnings, which allows you to immediately start using the tool in a large project for tracking bugs only in the new or modified code. PVS-Studio can be built in a continuous integration process. I think these and other features will help our analyzer to find a place under the Sun in the Java world.
FindBugs
The FindBugs project is abandoned. Nevertheless, we should mention it by reason of the fact that, perhaps, it is the most famous free static analyzer of Java code.
SpotBugs might be called the successor of FindBugs. However, it is less popular, and it is not clear yet what will happen with it.
Generally speaking, we think that even though FindBugs has been and still remains extremely popular, and in addition to that a free analyzer, we shouldn't dwell on it. This project will just quietly become a history.
P.S. By the way, now PVS-Studio can also be used for free when working with open projects.
SonarQube (SonarJava)
We believe that we don't compete with SonarQube, but complement it. PVS-Studio integrates in SonarQube, which allows developers to find more bugs and potential security vulnerabilities in their projects. We regularly tell how to integrate the PVS-Studio tool and other analyzers in SonarQube on master classes which we hold in terms of different conferences.
We made available the most popular ways of the analyzer integration in the build system for the users:
During the testing phase, we met lots of users who have self-written build systems, especially in the sphere of mobile development. They enjoyed the opportunity to run the analyzer directly, listing the sources and classpath.
You can find detailed information about all the ways to run the analyzer on the documentation page "How to Run PVS-Studio Java".
We could not shy away from the SonarQube platform of code quality control, which is so popular among Java developers, so we added support of the Java language in our plugin for SonarQube.
We have lots of ideas that might require further investigation, but some specific plans, inherent for any of our analyzers, are as follows:
Perhaps, we will find time to adapt the IntelliJ IDEA plugin for CLion. Hi to C++ developers who read about the Java analyzer:-)
Shiver my timbers if I'm not showing in the article some bugs found with the new analyzer! Well, we could have taken a big open source Java-project and write a classic article reviewing errors, as we usually do.
However, I immediately anticipate questions about what we can find in projects such as IntelliJ IDEA, FindBugs, and so on. So I just have no way out, other than start with these projects. So, I decided to quickly check and write out several interesting examples of errors from the following projects:
Writing about bugs of these projects is a challenge. The fact of the matter is that these projects are of very high quality. Actually, it's not surprising. Our observations show that static code analyzers are always well tested and verified using other tools.
Despite all this, I'll have to start exactly with these projects. I won't have the second chance to write about them. I'm sure that after the release of PVS-Studio for Java, developers of the projects listed will take PVS-Studio on board and begin using it for regular or, at least, occasional checks of their code. For example, I know that Tagir Valeev, one of the developers of JetBrains, working on the IntelliJ IDEA static code analyzer, at the moment, when I'm writing the article is already playing with the Beta version of PVS-Studio. He wrote to us about 15 e-mails with bug reports and recommendations. Thanks, Tagir!
Fortunately, I don't need to find as many bugs in one particular project. At the moment my task is to show that the PVS-Studio analyzer for Java appeared not in vain, and will be able to fill up a line of other tools designed to improve code quality. I just looked through the analyzer reports and listed some errors that seemed interesting. If possible, I tried to cite different types of errors. Let's see how it turned out.
private static boolean checkSentenceCapitalization(@NotNull String value) {
List<String> words = StringUtil.split(value, " ");
....
int capitalized = 1;
....
return capitalized / words.size() < 0.2; // allow reasonable amount of
// capitalized words
}
PVS-Studio warning: V6011 [CWE-682] The '0.2' literal of the 'double' type is compared to a value of the 'int' type. TitleCapitalizationInspection.java 169
The point was that the function should return true if less than 20% of the words begin with a capital letter. Actually, the check is not working, because integer division occurs. As a result of division we can obtain only two values: 0 or 1.
The function will return false, only if all words begin with a capital letter. In all other cases, division operation will result in 0 and the function will return true.
public int findPreviousIndex(int current) {
int count = myPainter.getErrorStripeCount();
int foundIndex = -1;
int foundLayer = 0;
if (0 <= current && current < count) {
current--;
for (int index = count - 1; index >= 0; index++) { // <=
int layer = getLayer(index);
if (layer > foundLayer) {
foundIndex = index;
foundLayer = layer;
}
}
....
}
PVS-Studio warning: V6007 [CWE-571] Expression 'index >= 0' is always true. Updater.java 184
First, look at the condition (0 <= current && current < count). It is executed only in case if the count variable value is greater than 0.
Now look at the loop:
for (int index = count - 1; index >= 0; index++)
The variable index is initialized with an expression count - 1. As the count variable is greater than 0, the initial value of the index variable will always be greater or equal to 0. It turns out that the loop will be executed until an overflow of the index variable occurs.
Most likely, it's just a typo, and a decrement, not an increment of a variable has to be executed:
for (int index = count - 1; index >= 0; index--)
@NonNls public static final String BEFORE_STR_OLD = "before:";
@NonNls public static final String AFTER_STR_OLD = "after:";
private static boolean isBeforeOrAfterKeyword(String str, boolean trimKeyword) {
return (trimKeyword ? LoadingOrder.BEFORE_STR.trim() :
LoadingOrder.BEFORE_STR).equalsIgnoreCase(str) ||
(trimKeyword ? LoadingOrder.AFTER_STR.trim() :
LoadingOrder.AFTER_STR).equalsIgnoreCase(str) ||
LoadingOrder.BEFORE_STR_OLD.equalsIgnoreCase(str) || // <=
LoadingOrder.BEFORE_STR_OLD.equalsIgnoreCase(str); // <=
}
PVS-Studio warning: V6001 [CWE-570] There are identical sub-expressions 'LoadingOrder.BEFORE_STR_OLD.equalsIgnoreCase(str)' to the left and to the right of the '||' operator. Check lines: 127, 128. ExtensionOrderConverter.java 127
Good old effect of the last line. A developer jumped the gun and having multiplied the line of code, forgot to fix it. As a result, a str string is compared with BEFORE_STR_OLD twice. Most likely, one of the comparisons must be with AFTER_STR_OLD.
public synchronized boolean isIdentifier(@NotNull String name,
final Project project) {
if (!StringUtil.startsWithChar(name,'\'') &&
!StringUtil.startsWithChar(name,'\"')) {
name = "\"" + name;
}
if (!StringUtil.endsWithChar(name,'"') &&
!StringUtil.endsWithChar(name,'\"')) {
name += "\"";
}
....
}
PVS-Studio warning: V6001 [CWE-571] There are identical sub-expressions '!StringUtil.endsWithChar(name,'"')' to the left and to the right of the '&&' operator. JsonNamesValidator.java 27
This code fragment checks that the name is enclosed in either single or double quotation marks. If it's not so, double quotation marks are added automatically.
Due to a typo, the end of the name is checked only for the presence of double quotation marks. As a result, the name in single quotation marks will be processed incorrectly.
The name
'Abcd'
due to adding extra double quotes will turn into:
'Abcd'"
static Context parse(....) {
....
for (int i = offset; i < endOffset; i++) {
char c = text.charAt(i);
if (c == '<' && i < endOffset && text.charAt(i + 1) == '/'
&& startTag != null
&& CharArrayUtil.regionMatches(text, i + 2, endOffset, startTag))
{
endTagStartOffset = i;
break;
}
}
....
}
PVS-Studio warning: V6007 [CWE-571] Expression 'i < endOffset' is always true. EnterAfterJavadocTagHandler.java 183
The subexpression i < endOffset in the condition of the if operator doesn't make sense. The i variable is always less than endOffset in any case, which follows from the condition of the loop execution.
Most likely, a developer wanted to protect from a string overrun when calling functions:
In this case, the subexpression for checking the index must be: (i) < endOffset-2.
public static String generateWarningMessage(....)
{
....
if (buffer.length() > 0) {
if (buffer.length() > 0) {
buffer.append(" ").append(
IdeBundle.message("prompt.delete.and")).append(" ");
}
}
....
}
PVS-Studio warning: V6007 [CWE-571] Expression 'buffer.length() > 0' is always true. DeleteUtil.java 62
This can be either an innocuous redundant code or a crucial error.
If a duplicate check appeared accidentally, e.g. during refactoring, there's nothing wrong with that. You can simply delete the second check.
Another scenario is also possible. The second check must be quite different and the code behaves not as intended. Then it's a real error.
Note. By the way, there are plenty of various redundant checks. Well, often it is clear that it is not an error. However, we cannot consider the analyzer warnings as false positives. For an explanation, I'd like to cite such an example, also taken from IntelliJ IDEA:
private static boolean isMultiline(PsiElement element) {
String text = element.getText();
return text.contains("\n") || text.contains("\r") || text.contains("\r\n");
}
The analyzer says that the function text.contains("\r\n") always returns false. Indeed, if the character "\n" and "\r" isn't found, there is no point to search for "\r\n". It's not a bug, and the code is bad only because it works slightly slower, performing a meaningless search for a substring.
How to deal with such code, in each case, is a question for developers. When writing articles, I usually don't pay attention to such code.
public boolean satisfiedBy(@NotNull PsiElement element) {
....
@NonNls final String text = expression.getText().replaceAll("_", "");
if (text == null || text.length() < 2) {
return false;
}
if ("0".equals(text) || "0L".equals(text) || "0l".equals(text)) {
return false;
}
return text.charAt(0) == '0';
}
PVS-Studio warning: V6007 [CWE-570] Expression '"0".equals(text)' is always false. ConvertIntegerToDecimalPredicate.java 46
The code contains a logical error for sure. I find it difficult to say what exactly the programmer wanted to check and how to correct the defect. So here, I'll just point at a meaningless check.
At the beginning it has to be checked that the string contains at least two symbols. If it isn't so, then the function returns false.
Next comes the check "0".equals(text). It is meaningless, because no string can contain only one character.
So, something is wrong here, and the code should be fixed.
public static String getXMLType(@WillNotClose InputStream in) throws IOException
{
....
String s;
int count = 0;
while (count < 4) {
s = r.readLine();
if (s == null) {
break;
}
Matcher m = tag.matcher(s);
if (m.find()) {
return m.group(1);
}
}
throw new IOException("Didn't find xml tag");
....
}
PVS-Studio warning: V6007 [CWE-571] Expression 'count < 4' is always true. Util.java 394
In theory, a search of the xml tag must be carried out only in the first four lines of the file. But due to the fact that one forgot to increment the count variable, the entire file will be read.
Firstly, this can be a very slow operation, and secondly, somewhere in the middle of the file, something might be found that would be perceived as an xml tag, not being it.
private void reportBug() {
int priority = LOW_PRIORITY;
String pattern = "NS_NON_SHORT_CIRCUIT";
if (sawDangerOld) {
if (sawNullTestVeryOld) {
priority = HIGH_PRIORITY; // <=
}
if (sawMethodCallOld || sawNumericTestVeryOld && sawArrayDangerOld) {
priority = HIGH_PRIORITY; // <=
pattern = "NS_DANGEROUS_NON_SHORT_CIRCUIT";
} else {
priority = NORMAL_PRIORITY; // <=
}
}
bugAccumulator.accumulateBug(
new BugInstance(this, pattern, priority).addClassAndMethod(this), this);
}
PVS-Studio warning: V6021 [CWE-563] The value is assigned to the 'priority' variable but is not used. FindNonShortCircuit.java 197
The value of the priority variable is set depending on the value of the variable sawNullTestVeryOld. However, it doesn't matter at all. After that, the priority variable will be assigned another value in any case. An obvious error in the function's logic.
public class RuleDto {
....
private final RuleDefinitionDto definition;
private final RuleMetadataDto metadata;
....
private void setUpdatedAtFromDefinition(@Nullable Long updatedAt) {
if (updatedAt != null && updatedAt > definition.getUpdatedAt()) {
setUpdatedAt(updatedAt);
}
}
private void setUpdatedAtFromMetadata(@Nullable Long updatedAt) {
if (updatedAt != null && updatedAt > definition.getUpdatedAt()) {
setUpdatedAt(updatedAt);
}
}
....
}
PVS-Studio: V6032 It is odd that the body of method 'setUpdatedAtFromDefinition' is fully equivalent to the body of another method 'setUpdatedAtFromMetadata'. Check lines: 396, 405. RuleDto.java 396
A definition field is used in the method setUpdatedAtFromMetadata. Most likely, the metadata field should be used. This is very similar to the effects of a failed Copy-Paste.
private final Map<JavaPunctuator, Tree.Kind> assignmentOperators =
Maps.newEnumMap(JavaPunctuator.class);
public KindMaps() {
....
assignmentOperators.put(JavaPunctuator.PLUSEQU, Tree.Kind.PLUS_ASSIGNMENT);
....
assignmentOperators.put(JavaPunctuator.PLUSEQU, Tree.Kind.PLUS_ASSIGNMENT);
....
}
PVS-Studio warning: V6033 [CWE-462] An item with the same key 'JavaPunctuator.PLUSEQU' has already been added. Check lines: 104, 100. KindMaps.java 104
The same key-value pair is set in map twice. Most likely, it happened inadvertently and actually there is no real error. However, this code has to be checked in any case, because, perhaps, one forgot to add any other pair.
Why write conclusion when it's so obvious?! I suggest all of you download PVS-Studio right now and try checking your work projects on the Java language! Download PVS-Studio.
Thank you all for your attention. I hope that soon we will please our readers with a series of articles on checking various open source Java projects.
0