Dirty code: trusted keeper of errors. Broken windows theory

Code smells
Broken windows theory
Solution
Wrap-up

Many people know that code cleanliness affects its maintainability and stability, but what about bugs? In this article, we'll explore real examples to see how dirty code can lead to problems—and how we can address them.

Clean code is a set of rules and principles for writing code that make it easier to read and maintain. However, today we're going to talk about not only this but also other things.

We're exploring how dirty code contributes to software bugs. Who will help us on this journey? The PVS-Studio static analyzer will reveal all the dangers lurking in code.

Code smells

Let's start with code smells that signal the need for refactoring. While there are many types, we'll focus only on the ones that hide bugs.

Big City Life methods

Many code lines don't necessarily mean better code: developers add code but don't decompose it. The excessive length of methods can increase the likelihood of errors.

Let's take a step back and look at a code snippet from Akka.NET:

Yeah, that's a single method... I could paste it here as a text and challenge you to find the error—because there is one. But let's be honest, you'd probably give up before even finishing the sentence.

As I mentioned, there is indeed an error among those 102 lines. Let's examine it:

....
MinNrOfMembersOfRole = clusterConfig.GetConfig("role")
                        .Root.GetObject().Items
                        .ToImmutableDictionary(
                         kv => kv.Key, 
                         kv => kv.Value.GetObject()  
                               .GetKey("....").GetInt()); <=
....

The PVS-Studio warnings:

V3080 [CWE-476] Possible null dereference of method return value. Consider inspecting: GetObject().

The error is quite simple: the GetObject method can return a null value, but we don't check for it. Well, the devs just forgot to add a single ?. However, detecting this error before it occurs is unlikely—the method size almost guarantees that the bug will remain buried under layers of unrefactored code.

Tangled switch statements

This code has a distinct smell of object-oriented design issues. Robert Martin and Casey Muratori have extensively discussed the use of switch in relation to clean code.

Note: We've posted several articles about this discussion. You can read them at this link.

From a clean code perspective, using huge switch instead of polymorphism can lead to issues because maintaining bloated switch or if statements becomes problematic.

Finding errors in such structures—whether large or small—can be equally difficult, and code flaws can be related not only to object-oriented design.

A quite common problem is omitting cases in a switch, for example, when using enum.

Let's take a peek at the real-world example from CUBA Platform:

enum Operation {
    REFRESH,
    CLEAR,
    ADD,
    REMOVE,
    UPDATE
}

@Override
public void setDatasource(final CollectionDatasource datasource) {
  ....
  collectionChangeListener = e -> {
    switch (e.getOperation()) {
      case CLEAR:
      case REFRESH:
        fieldDatasources.clear();
        break;

      case UPDATE:
      case REMOVE:
        for (Object entity : e.getItems()) {
          fieldDatasources.remove(entity);
        }
        break;
    }
  };
  ....
}

The PVS-Studio warning: V6002 The switch statement does not cover all values of the 'Operation' enum: ADD.

We have some enum with variants, which we further enumerate using switch. When changing this enumeration, we must also update all switch statements that apply to it. This wouldn't be as concerning if not for one crucial detail: the switch statement lacks the default branch, so no actions will be performed if the unknown variant is encountered.

The example from .NET 9:

public static void SetAsIConvertible(this ref ComVariant variant,
                                     IConvertible value)
{
  TypeCode tc = value.GetTypeCode();
  CultureInfo ci = CultureInfo.CurrentCulture;

  switch (tc)
  {
    case TypeCode.Empty: break;
    case TypeCode.Object: 
      variant = ComVariant.CreateRaw(....); break;
    case TypeCode.DBNull: 
      variant = ComVariant.Null; break;
    case TypeCode.Boolean: 
      variant = ComVariant.Create<bool>(....)); break;
    case TypeCode.Char: 
      variant = ComVariant.Create<ushort>(value.ToChar(ci)); break;
    case TypeCode.SByte: 
      variant = ComVariant.Create<sbyte>(value.ToSByte(ci)); break;
    case TypeCode.Byte: 
      variant = ComVariant.Create<byte>(value.ToByte(ci)); break;
    case TypeCode.Int16: 
      variant = ComVariant.Create(value.ToInt16(ci)); break;
    case TypeCode.UInt16: 
      variant = ComVariant.Create(value.ToUInt16(ci)); break;
    case TypeCode.Int32: 
      variant = ComVariant.Create(value.ToInt32(ci)); break;
    case TypeCode.UInt32: 
      variant = ComVariant.Create(value.ToUInt32(ci)); break;
    case TypeCode.Int64: 
      variant = ComVariant.Create(value.ToInt64(ci)); break;
    case TypeCode.UInt64: 
      variant = ComVariant.Create(value.ToInt64(ci)); break;
    case TypeCode.Single: 
      variant = ComVariant.Create(value.ToSingle(ci)); break;
    case TypeCode.Double: 
      variant = ComVariant.Create(value.ToDouble(ci)); break;
    case TypeCode.Decimal: 
      variant = ComVariant.Create(value.ToDecimal(ci)); break;
    case TypeCode.DateTime: 
      variant = ComVariant.Create(value.ToDateTime(ci)); break;
    case TypeCode.String: 
      variant = ComVariant.Create(....); break;

    default:
      throw new NotSupportedException();
  }
}

I'd challenge you to find the error, but honestly, it would be such a torturous endeavor—so let's skip the guessing games:

....
case TypeCode.Int64: 
  variant = ComVariant.Create(value.ToInt64(ci)); break;
case TypeCode.UInt64: 
  variant = ComVariant.Create(value.ToInt64(ci)); break; // <=
....

The PVS-Studio warning:

V3139 Two or more case-branches perform the same actions.

If you look closely, you'll notice the ToInt64 method being used for the UInt64 type. It's most likely a copy-paste error. But see how effectively the massive switch statement hides it from view...

Dead code

Dead code is a program fragment that can be executed but won't affect the program execution result in any way. Such code can introduce dead variables—declared but never used.

Let me give you a simple example:

int sum (int a, int b) {
  int result = a + b;
  return a + b;
}

Here, the definition of the result variable appears dead code, created without purpose. As a result, the value isn't used, making the assignment redundant.

Let's take a peek at the example from Universal:

template<typename Scalar>
vector<Scalar> operator*(double scalar, const vector<Scalar>& v) {
  vector<Scalar> scaledVector(v);
  scaledVector *= scalar;
  return v;
}

The PVS-Studio warning: V1001 The 'scaledVector' variable is assigned but is not used by the end of the function.

In this example, the scaledVector is created and even modified within the method, but then... the developer simply forgets about it and returns the v argument. The developer might have intended a different behavior, so the dead code helped us reveal the error here.

Robert Martin extends the concept of dead code to include unreachable code, i.e. the code fragment that will never be executed. This difference in concepts also affects the possible consequences of such a situation in code.

Let's look at the example from Apache Dubbo:

@Override
public NextAction handleRead(FilterChainContext context) throws IOException {
  .... 
  do {
    savedReadIndex = frame.readerIndex();
    try {
      msg = codec.decode(channel, frame);
    } catch (Exception e) {
      previousData = ChannelBuffers.EMPTY_BUFFER;
      throw new IOException(e.getMessage(), e);
    }
    if (msg == Codec2.DecodeResult.NEED_MORE_INPUT) {
      frame.readerIndex(savedReadIndex);
        return context.getStopAction();     // <=
    } else {
      if (savedReadIndex == frame.readerIndex()) {
        previousData = ChannelBuffers.EMPTY_BUFFER;
        throw new IOException("Decode without read data.");   // <=
      }
      if (msg != null) {
        context.setMessage(msg);
        return context.getInvokeAction();     // <=
      } else {
        return context.getInvokeAction();    // <=
      }
    }
  } while (frame.readable());     // <=
  ....
}

The PVS-Studio warning:

V6019 Unreachable code detected. It is possible that an error is present.

In this do ̶ while loop, the method exits within the do block, causing the entire while condition to become redundant. This code fragment illustrates the real case of unreachable code. The issue here isn't just about clean code—it's a fundamental flaw in program logic.

If the while statement was written, it was probably meant to perform some actions, but in this case, it never gets executed.

The real-world example from IntelliJ IDEA Community Edition:

private static void doTest(@Nullable JBCefClient client) {
  ....
  try {
    browser = new JCEFHtmlPanel(client, "about:blank");
  } catch (RuntimeException ex) {
    fail("Exception occurred: " + ex.getMessage());
    ex.printStackTrace();         <=
  }
  ....
}

The PVS-Studio warning: V6019 Unreachable code detected. It is possible that an error is present.

Why does the analyzer issue a warning for this line? Patience, folks, just a little patience. To uncover this mysterious case, we need to peek into the fail method:

public static void fail(String message) {
    if (message == null) {
        throw new AssertionError();
    }
    throw new AssertionError(message);
}

Eureka! The method throws an exception with every call, so in doTest, it'll never reach the ex.printStackTrace method. Even though this problem occurs in a unit test, it would be helpful to see detailed error information when it fails.

Message chains

To describe a method chain, imagine a situation where a client requests an object from another object, which in turn requests yet another, and so on.

This approach comes with several drawbacks:

changing intermediate connections modifies the client as well;
the chain of calls takes more time to read and understand;
it wastes processing power.

I suggest looking at the example of such a chain from ReactOS:

....
w.NumChannels = AUD_BUF->audinfo().channels();
w.SampleRate = AUD_BUF->audinfo().sample_rate();
w.ByteRate = AUD_BUF->audinfo().byte_rate();
w.BlockAlign = AUD_BUF->audinfo().block_align();
w.BitsPerSample = AUD_BUF->audinfo().bits();
....

The PVS-Studio warning: V807 Decreased performance. Consider creating a reference to avoid using the 'AUD_BUF->audinfo()' expression repeatedly.

We repeatedly access the same chain! It could be easily fixed with a simple change:

....
auto audinfo = AUD_BUF->audinfo();

w.NumChannels = audinfo.channels();
w.SampleRate = audinfo.sample_rate();
w.ByteRate = audinfo.byte_rate();
w.BlockAlign = audinfo.block_align();
w.BitsPerSample = audinfo.bits();
....

This is the kind of beauty we found in Tizen:

for (size_t keypointNum = 0u;
    keypointNum < numberOfKeypoints; ++keypointNum)
{
    os << obj.__features.__objectKeypoints[keypointNum].pt.x << ' ';
    os << obj.__features.__objectKeypoints[keypointNum].pt.y << ' ';
    os << obj.__features.__objectKeypoints[keypointNum].size << ' ';
    os << obj.__features.__objectKeypoints[keypointNum].response << ' ';
    os << obj.__features.__objectKeypoints[keypointNum].angle << ' ';
    os << obj.__features.__objectKeypoints[keypointNum].octave << ' ';
    os << obj.__features.__objectKeypoints[keypointNum].class_id << '\n';
}

The PVS-Studio warning: V807 Decreased performance. Consider creating a reference to avoid using the same expression repeatedly.

By performing the same "refactoring ritual" as last time, we'll get this code:

for (....) {
  auto &keypoint = obj.__features.__objectKeypoints[keypointNum];
  os << keypoint.pt.x << ' '
     << keypoint.pt.y << ' '
     << keypoint.size << ' '
     << keypoint.response << ' '
     << keypoint.angle << ' '
     << keypoint.octave << ' '
     << keypoint.class_id << '\n';
}

This code fragment that might have looked scary now looks quite neat, with every shortened line free of unnecessary details.

It's worth noting that the examples provided above had relatively simple solutions, but that's not always the case. However, there are still refactoring options for more complex situations:

Fixing such issues may not seem like a huge optimization but can contribute to the overall performance of a project.

Wow, clean code actually improves performance! Who would've thought?

Broken windows theory

There's a well-known social psychology theory suggesting that even minor violations of public order can signal to others that rules and norms can be ignored. A single infraction can trigger a chain reaction, leading to more serious violations. That's why it's crucial to address small issues proactively and respond to them immediately.

I believe this principle applies equally to software development.

Imagine a failing build in a CI/CD system—one among many. If we ignore it, more failures will accumulate over time, and soon, broken builds will become the norm. "Yeah, my commit broke the build, but look at all the others!"

The same goes for code quality. When we allow seemingly insignificant mistakes, we set a precedent that tells us—and everyone else—that such mistakes are OK. From that moment on, it no longer matters how critical they are.

Look at the example from Keycloak. In the check report, I found many seemingly minor warnings, such as this one:

public KeycloakUriBuilder schemeSpecificPart(String ssp)
  throws IllegalArgumentException {
    if (ssp == null) 
      throw new IllegalArgumentException(....);
    ....
    if (ssp != null)         // <=
      sb.append(ssp);
    ....
}

The PVS-Studio warning: V6007 Expression 'ssp != null' is always true.

It's pretty straightforward: if the ssp variable is null, the method throws an IllegalArgumentException; otherwise, execution continues. In this fragment, ssp != null is redundant.

As it turns out, similar redundant checks appeared frequently throughout the project. But, as you might expect, the analyzer also detects more intriguing issues like these:

public boolean equals(Object o) {
  if (this == o) return true;
  if (o == null || getClass() != o.getClass()) return false;

  Key key = (Key) o;

  if (action != key.action) return false;         // <=
  ....
}

The PVS-Studio warning: V6013 Strings 'action' and 'key.action' are compared by reference. Possibly an equality comparison was intended.

In the warning highlighted by the analyzer, the check doesn't actually compare the contents of two strings—it compares object references. This means the check is faulty, potentially causing the method to behave differently than the developer intended.

Beyond that, when a project accumulates numerous warnings, it's easy to stop taking them seriously. Even worse, we might overlook a critical issue—one that gets lost among redundant checks and unused variables that were never cleaned up.

Solution

Hmm, let's think... what kind of solution could a static analyzer developer possibly suggest, huh?

As you might have guessed, static analysis itself is one of the best ways to prevent the issues described above. Even better, integrating static analysis into development can even nudge us to refactor our code. One of my colleagues recently shared an example of this in another article. In that case, a static analyzer helped spot and remove a redundant comment—one that cluttered the code without adding any real value.

That said, let me be clear: I'm not Morpheus from The Matrix. Static analysis isn't some magical pill that will instantly fix all your code issues just by existing. It's a tool, and it only works if you use it right. That means running it regularly and ensuring that faulty code never reaches production. In a recent article, I talked about setting up regular static analysis.

Wrap-up

C'est fini! If you see things differently, I'd love to hear your thoughts in the comments. I'm always up for a good discussion on this topic.

In this article, I've used examples from various open-source projects, some of which we've covered before. Here's a list of the mentioned articles (in order of their appearance):

Sviatoslav Razmyslov. Analyzing the Code of CUBA Platform with PVS-Studio
Artem Rovenskii. Bugs and suspicious places in .NET 9 source code
Valerii Filatov Authorization pitfalls: what does Keycloak cloak?
Valery Komarov. Analysis of the Apache Dubbo RPC Framework by the PVS-Studio Static Code Analyzer
Kirill Epifanov. Operation K. Looking for bugs in the IntelliJ IDEA code
Taras Shevchenko. Eleven moments of ReactOS: user mode gets better?
Andrey Karpov. 27 000 Errors in the Tizen Operating System

Clean code to you folks!