Collecting, processing, and transferring data are key processes in IT. What if they break due to some tricky bugs in the code, though? In this article, we'll talk about errors detected by a static analyzer in the Apache NiFi project.
Apache NiFi is a free, open-source tool for collecting, processing, and moving data. It was initially developed by the US National Security Agency. In 2014, it became open source and entered the Apache ecosystem. NiFi is commonly used in big data projects today, especially in combination with Hadoop.
Note. We have already checked the Apache Hadoop project. You can read this article on our blog.
The main advantage of NiFi is its simple visual interface, where data can be moved around like on a diagram: you drag and drop blocks (processors) and connect them with arrows. This allows administrators, analysts, and developers to configure data flows without requiring in-depth knowledge of the system. NiFi can work with various sources: SFTP files, syslog logs, databases (via JDBC), Kafka, HDFS, Elasticsearch, and many others.
Today, we'll take a look at some interesting bugs that PVS-Studio static analyzer detected in this project. We used the d924680 commit to check the project.
By the way, NiFi already uses a code quality control tool called PMD. Let's see whether PVS-Studio can detect errors in code that has already been checked :)
Since the project uses the Maven build system, we used the PVS-Studio plugin to analyze the code.
We had to add a few lines to the pom.xml
file to analyze the source code. First, we enabled the PVS-Studio plugin repository.
....
<pluginRepositories>
<pluginRepository>
<id>pvsstudio-maven-repo</id>
<url>https://wcdn.pvs-studio.com/java/pvsstudio-maven-repository/</url>
</pluginRepository>
</pluginRepositories>
....
Then, we applied the plugin and configured the settings necessary for the analysis:
....
<pluginManagement>
<plugins>
<plugin>
<groupId>com.pvsstudio</groupId>
<artifactId>pvsstudio-maven-plugin</artifactId>
<version>7.38.96564</version>
<configuration>
<analyzer>
<outputType>json</outputType>
<outputFile>PVS-Studio.json</outputFile>
<analysisMode>GA,OWASP</analysisMode>
</analyzer>
</configuration>
</plugin>
....
</plugins>
</pluginManagement>
....
We used the following settings in the analyzer configuration:
<outputType>
is the format the analyzer uses to save the analysis results (in our case, it's a report in json format);<outputFile>
is the path where the analyzer report is located (in our case, it's the root project directory).<analysisMode>
are enabled diagnostic rule groups for the project analysis (in our case, these are rules from the General Analysis and OWASP groups).After making changes to pom.xml
and building the project, all that's left to do is run the following command:
mvn pvsstudio:pvsAnalyze
Note. For more information about the PVS-Studio plugin for the Maven build system, please refer to the documentation.
Analyzer warnings issued for logically incorrect expressions are common in articles on checking open-source projects. Why should this one be any different? :)
Fragment 1
public void communicate() throws IOException {
final String line = reader.readLine();
final String[] splits = line.split(" ");
if (splits.length < 0) {
throw new IOException(....);
....
}
The PVS-Studio warning: V6007 Expression 'splits.length < 0' is always false. BootstrapCodec.java 45
Buckle up! We're about to cross the boundaries of time and space. In this code fragment, an exception is thrown if the length of the resulting array is... less than zero.
The developers probably wanted to check whether the array length is zero, but this scenario will also never happen... Why?
First, we get a string from the user via reader
, and then split it with spaces using line.split()
. The thing is, even if the string has no spaces, String.split()
will return an array that contains only the original string.
Fragment 2
protected Map<String, String> getAttributes(
final TarArchiveInputStream stream
) throws IOException {
....
for (final Entry<Object, Object> entry : props.entrySet()) {
final Object keyObject = entry.getKey();
final Object valueObject = entry.getValue();
if (!(keyObject instanceof String)) { // <=
throw new IOException(
"Flow file attributes object contains key of type "
+ keyObject.getClass().getCanonicalName()
+ " but expected java.lang.String");
} else if (!(keyObject instanceof String)) { // <=
throw new IOException(
"Flow file attributes object contains value of type "
+ keyObject.getClass().getCanonicalName()
+ " but expected java.lang.String");
}
final String key = (String) keyObject;
final String value = (String) valueObject;
result.put(key, value);
}
}
The PVS-Studio warning: V6003 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. FlowFileUnpackagerV1.java 80
Have you ever felt hopeless?
In this fragment, the analyzer indicates that both conditions are identical. However, if we look closer, we'll see that not only the conditions are the same but also the actions performed under these conditions.
What happened? The developers wrote the first block of this branch, copied it, and pasted into the else if
statement. However, they forgot to change keyObject
to valueObject
. Moreover, they forgot to do it everywhere, including the message returned by IOException
.
Fragment 3
public void setOutputPorts(
final Set<RemoteProcessGroupPortDescriptor> ports,
final boolean pruneUnusedPorts
) {
....
final Iterator<StandardRemoteGroupPort> itr = outputPorts.values().iterator();
int prunedCount = 0;
while (itr.hasNext()) {
final StandardRemoteGroupPort port = itr.next();
if (....) {
port.setTargetExists(false);
port.setTargetRunning(false);
// If port has connections,
// it will be cleaned up
// when connections are removed
if (port.getConnections().isEmpty()) {
itr.remove();
logger.info(
"Pruning unused Output Port {} from {}", port, this
);
}
}
}
if (prunedCount == 0) { // <=
logger.debug(
"There were no Output Ports to prune from {}",
this
);
} else {....}
}
....
}
The PVS-Studio warning: V6007 Expression 'prunedCount == 0' is always true. StandardRemoteProcessGroup.java 640
The developers' comments help us understand why the analyzer issues a warning for this code snippet.
Ports with no active connections are being cleaned up here. There's also a prunedCount
counter that should count the number of deleted ports.
The analyzer notes that the prunedCount == 0
condition is always true. That's correct because the counter doesn't increase anywhere; it simply remains unchanged throughout the entire code.
However, if we look at the logic and comments, we can see that the port is deleted after the itr.remove()
line. This is even written in the log. So, this is where the prunedCount
counter should be increased. Most likely, they simply forgot to add it.
Have you ever caught an exception while handling an exception?
Fragment 4
public void revertReceivedTo(Relationship r, Throwable t) {
....
String errorMessage = Throwables.getMessage(t, null, 950);
String stackTrace = Throwables.stringStackTrace(t);
for (FlowFile f : toFail) {
if (t != null && r != null) {
....
}
....
}
The PVS-Studio warning: V6060 The 't' reference was utilized before it was verified against null. ProcessSessionWrap.java 210
Here, the analyzer points out that the t
object was used before it was checked for null
. To make sure this object is indeed being dereferenced, let's take a look at the Throwables.stringStackTrace()
method:
public static String stringStackTrace(Throwable e) {
StringWriter sw = new StringWriter(500);
PrintWriter pw = new PrintWriter(sw);
e.printStackTrace(pw);
pw.flush();
sw.flush();
return sw.toString();
}
Here, the object we pass is actually being dereferenced. In other words, when attempting to handle an exception object, we may encounter an exception! Or can we?
In reality, the t != null
check in this code snippet is just redundant. A constant created using new
is passed in the only method call for which the analyzer issued a warning:
public class ExecuteGroovyScript extends AbstractProcessor {
....
public static final Relationship REL_FAILURE = new Relationship
.Builder()
.name("failure")
.description("FlowFiles that failed to be processed")
.build();
....
public void onTrigger(
final ProcessContext context,
final ProcessSession _session
) throws ProcessException {
....
if (toFailureOnError) {
session.revertReceivedTo(
REL_FAILURE,
StackTraceUtils.deepSanitize(t)
);
} else {....}
....
}
So, the analyzer just detected a flaw in the method's logic. This condition is really confusing. We didn't spend so much time digging into it for nothing :)
Fragment 5
public void finishTransferFlowFiles(
final CommunicationsSession commSession
) throws IOException {
if (postResult == null) {
new IllegalStateException(....);
}
....
}
The PVS-Studio warning: V5303 The object was created but it is not being used. The 'throw' keyword could be missing. SiteToSiteRestApiClient.java 919
A three-pointer!
Devs created an exception object containing all the necessary information. It seems like everything is ready—they could've thrown it but... oh. They created it and then just forgot to issue it...
It's so hard to look at an Exception you can't throw...
© An unpublished collection of programmer quotes
Let's not overlook another tradition: null
dereference errors.
Fragment 6
private boolean replaceNodeStatus(
final NodeIdentifier nodeId,
final NodeConnectionStatus currentStatus,
final NodeConnectionStatus newStatus
) {
if (newStatus == null) { // <=
logger.error("....", nodeId, currentStatus, newStatus);
logger.error("", new NullPointerException());
}
....
if (newStatus.getState() == // <=
NodeConnectionState.REMOVED) {
....
} else {....}
}
The PVS-Studio warning: V6008 Potential null dereference of 'newStatus'. NodeClusterCoordinator.java 448
So, what's going on here? If the newStatus
variable is null
, we log it as "Everything is under control. Noted," and then simply move on.
But what happens when "move on"? Just a few lines down, this very newStatus
is dereferenced, regardless of whether it's null or not. The developers may simply have forgotten about return
in the condition.
Fragment 7
public void addFileStatus(
final FileStatus parent,
final FileStatus child
) {
Set<FileStatus> children = fileStatuses.computeIfAbsent(
parent.getPath(), k -> new HashSet<>()
);
if (child != null) { // <=
children.add(child);
if (
child.isDirectory() &&
!fileStatuses.containsKey(child.getPath())
) {
fileStatuses.put(child.getPath(), new HashSet<>());
}
}
pathToStatus.put(parent.getPath(), parent);
pathToStatus.put(child.getPath(), child); // <=
}
The PVS-Studio warning: V6008 Potential null dereference of 'child'. MockFileSystem.java 265
Here's another classic example. First, developers checked that child
wasn't null
, and then they dereferenced it outside the condition.
As in the previous example, such tricks result in a NullPointerException
if child
is equal to null
.
We found a lot of warnings issued by PVS-Studio analyzer in Apache NiFi for cases where something went wrong when comparing objects.
Fragment 8
public void onTrigger(
final ProcessContext context,
final ProcessSession session
) {
final String listingStrategy = context.getProperty(
LISTING_STRATEGY
).getValue();
if (BY_TIMESTAMPS.equals(listingStrategy)) { // <=
listByTrackingTimestamps(context, session);
} else if (BY_ENTITIES.equals(listingStrategy)) { // <=
listByTrackingEntities(context, session);
} else if (NO_TRACKING.equals(listingStrategy)) { // <=
listNoTracking(context, session);
} else {
throw new ProcessException(....);
}
}
The PVS-Studio warnings:
In this fragment, the analyzer warns that the devs compared two incompatible types: AllowableValue
and String
. Out of context, it sounds pretty scary, but it's actually simple. The AllowableValue
type has the getValue()
method that returns a string value suitable for comparison in this case:
public class AllowableValue implements DescribedValue {
private final String value;
....
public String getValue() {
return this.value;
}
}
So, they just needed to add a call to this method before calling equals()
:
....
if (BY_TIMESTAMPS.getValue().equals(listingStrategy)) {
listByTrackingTimestamps(context, session);
} else if (BY_ENTITIES.getValue().equals(listingStrategy)) {
listByTrackingEntities(context, session);
} else if (NO_TRACKING.getValue().equals(listingStrategy)) {
listNoTracking(context, session);
} else {
throw new ProcessException(....);
}
....
V6058 The 'equals' function compares objects of incompatible types: AllowableValue, String. AbstractListProcessor.java 419
V6058 The 'equals' function compares objects of incompatible types: AllowableValue, String. AbstractListProcessor.java 422
V6058 The 'equals' function compares objects of incompatible types: AllowableValue, String. AbstractListProcessor.java 425
V6058 The 'equals' function compares objects of incompatible types: AllowableValue, String. AbstractListProcessor.java 428
V6058 The 'equals' function compares objects of incompatible types: AllowableValue, String. ListS3.java 484
V6058 The 'equals' function compares objects of incompatible types: AllowableValue, String. ListS3.java 486
V6058 The 'equals' function compares objects of incompatible types: AllowableValue, String. ListS3.java 488
Fragment 9
public synchronized void addFlowSnapshot(
final VersionedExternalFlow versionedExternalFlow
) {
final String version;
if (metadata == null) {
bucketId = DEFAULT_BUCKET_ID;
flowId = "flow-" + flowIdGenerator.getAndIncrement();
version = "1";
} else {
bucketId = metadata.getBucketIdentifier();
flowId = metadata.getFlowIdentifier();
version = metadata.getVersion();
}
....
final Optional<VersionedExternalFlow> optionalSnapshot =
snapshots.stream().filter(
snapshot ->
snapshot.getMetadata().getVersion() == version // <=
).findAny();
....
}
The PVS-Studio warning: V6013 Strings 'snapshot.getMetadata().getVersion()' and 'version' are compared by reference. Possibly an equality comparison was intended. InMemoryFlowRegistry.java 165
This fragment contains the following lines: snapshot.getMetadata().getVersion()
and version
. They're compared using the ==
operator. This leads to a situation where the references to these objects are compared instead of their contents. So, this condition is true only if both references point to the same object.
This error is also tricky because it may not always be reproducible due to the String Pool.
To fix the issue and compare strings by content, we'll use the equals()
method instead of the ==
operator.
Fragment 10
public boolean equals(final Object o) {
if (this == o) {
return true;
}
if (o == null
|| getClass() != o.getClass()
|| !Arrays.equals(
o.getClass().getGenericInterfaces(), // <=
o.getClass().getGenericInterfaces() // <=
)
) {
return false;
}
....
}
The PVS-Studio warning: V6009 Function 'equals' receives an odd argument. The 'o.getClass().getGenericInterfaces()' argument was passed several times. EqualsWrapper.java 126
There's an error when using equals()
inside the override of equals()
. C-C-C-COMBO!
The devs call equals()
here to compare two arrays, but... the treacherous copy-paste ruins everything. In this equals
override, they compare the current object with the one passed to the method. In reality, however, they're comparing the results of executing getClass().getGenericInterfaces()
for the same object.
So, it's just a typo that might be fixed like this:
public boolean equals(final Object o) {
if (this == o) {
return true;
}
if (o == null
|| getClass() != o.getClass()
|| !Arrays.equals(
getGenericInterfaces(),
o.getClass().getGenericInterfaces()
)
) {
return false;
}
....
}
Fragment 11
static Properties load(File file, String propertiesType) {
....
try (
InputStream inputStream = new BufferedInputStream(
new FileInputStream(file)
)
) {
rawProperties.load(inputStream);
} catch (Exception e) {
throw new RuntimeException(
String.format(
"Loading {} Properties [%s] failed",
propertiesType,
file
), e);
}
....
}
The PVS-Studio warning: V6046 Incorrect format. A different number of format items is expected. Arguments not used: 2. PropertiesLoader.java 43
The developers passed more arguments than could be substituted into a string when calling String.format()
. But wait! The string has two spaces for substitution!
That's true, but the developers mixed the syntax for String.format()
and the logger in this code snippet in an interesting way. This means that one of the arguments in the String.format()
call always remains empty.
V6046 Incorrect format. A different number of format items is expected. Arguments not used: 2. PropertiesLoader.java 34
V6046 Incorrect format. A different number of format items is expected. Arguments not used: 1. PropertiesPersister.java 110
So, we're almost at the end of the article. We'll create an issue in the GitHub repository to notify the project developers of all errors detected by the analyzer.
Each of our blog articles about checking projects confirms that everyone makes mistakes and, most importantly, that there's nothing wrong with that. The most crucial part is to promptly identify and address these issues.
So, if the examples of bugs detected by PVS-Studio in Apache NiFi caught your attention, you can analyze your own project for free by getting a license here.
Clean code to you folks!
0