An unusual bug in Lucene.Net


Mar 14 2016

Whenever static analysis comes up, some programmers say they don't really need it: their code is fully covered by unit tests, and that's enough to catch all the bugs. Recently I found a bug that unit tests could catch in theory, but if you don't already know it's there, writing a test that exposes it is next to impossible.



Lucene.Net is a C# port of the Lucene search engine library, aimed at .NET runtime users. The source code is open and available on the project website.

The analyzer detected only 5 suspicious fragments, which can be explained by the slow pace of development, the small size of the project, and the fact that it is widely used in other projects for full-text search [1].

To be honest, I didn't expect to find more bugs. One of these errors seemed especially interesting to me, so I decided to tell our readers about it in our blog.

About the bug found

We have a diagnostic, V3035, that catches cases where a programmer mistakenly writes =+ instead of +=, so that + becomes a unary plus. When I was writing it by analogy with the V588 diagnostic for C++, I wondered whether a programmer could really make the same mistake in C#. In C++ it is understandable: people use various text editors instead of an IDE, and a typo can easily go unnoticed. But is it possible to overlook such a misprint in Visual Studio, which automatically reformats the code as soon as you type a semicolon? It turns out that it is. Such a bug was found in Lucene.Net. It is of great interest to us, mostly because it is rather hard to detect by any means other than static analysis. Let's take a look at the code:

protected virtual void Substitute( StringBuilder buffer )
{
    substCount = 0;
    for ( int c = 0; c < buffer.Length; c++ ) 
    {
        // ...
        // Take care that at least one character
        // is left left side from the current one
        if ( c < buffer.Length - 1 ) 
        {
            // Masking several common character combinations
            // with an token
            if ( ( c < buffer.Length - 2 ) && buffer[c] == 's' &&
                buffer[c + 1] == 'c' && buffer[c + 2] == 'h' )
            {
                buffer[c] = '$';
                buffer.Remove(c + 1, 2);
                substCount =+ 2;   // V3035: assigns +2 instead of adding 2
            }
            else if ( buffer[c] == 's' && buffer[c + 1] == 't' ) 
            {
                buffer[c] = '!';
                buffer.Remove(c + 1, 1);
                // ... (the rest of the method is omitted in this excerpt)
            }
        }
    }
}

This method belongs to the GermanStemmer class, which cuts suffixes off German words to extract a common root. It works as follows. First, the Substitute method replaces certain letter combinations with other symbols so that they are not mistaken for a suffix: 'sch' becomes '$' and 'st' becomes '!' (you can see both in the code example). The number of characters by which these replacements shorten the word is stored in the substCount variable. Next, the Strip method cuts off the suffixes. Finally, the Resubstitute method does the reverse substitution: '$' to 'sch', '!' to 'st'. For instance, for the word "kapitalistischen" (capitalistic), the stemmer does the following: kapitalistischen => kapitali!i$en (Substitute) => kapitali!i$ (Strip) => kapitalistisch (Resubstitute).
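The round trip described above can be sketched in a few lines. This is a simplified illustration, not the Lucene.Net implementation: the StemmerSketch class is hypothetical, it handles only the two substitutions mentioned in this article, and the Strip step is replaced by a stand-in that drops the two-letter "en" suffix of this one word.

```csharp
using System;
using System.Text;

public static class StemmerSketch
{
    // Replaces "sch" with '$' and "st" with '!' in place and
    // returns the number of characters the buffer lost.
    public static int Substitute(StringBuilder buffer)
    {
        int removed = 0;
        for (int c = 0; c < buffer.Length; c++)
        {
            if (c < buffer.Length - 2 && buffer[c] == 's'
                && buffer[c + 1] == 'c' && buffer[c + 2] == 'h')
            {
                buffer[c] = '$';
                buffer.Remove(c + 1, 2);
                removed += 2;
            }
            else if (c < buffer.Length - 1
                     && buffer[c] == 's' && buffer[c + 1] == 't')
            {
                buffer[c] = '!';
                buffer.Remove(c + 1, 1);
                removed += 1;
            }
        }
        return removed;
    }

    // Reverse substitution: '$' back to "sch", '!' back to "st".
    public static void Resubstitute(StringBuilder buffer)
    {
        buffer.Replace("$", "sch").Replace("!", "st");
    }

    public static void Main()
    {
        var word = new StringBuilder("kapitalistischen");
        int removed = Substitute(word);
        Console.WriteLine($"{word} (shortened by {removed})"); // kapitali!i$en (shortened by 3)

        word.Length -= 2;   // stand-in for Strip: drops "en" for this word only
        Resubstitute(word);
        Console.WriteLine(word);                               // kapitalistisch
    }
}
```

Note that the correct count for this word is 3: one character for 'st' plus two for 'sch'.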

Because of this typo, replacing 'sch' with '$' assigns 2 to substCount instead of adding 2 to it. This error is really hard to find by any means other than static analysis. That's the answer to those who ask, "Do I need static analysis if I have unit tests?" To catch such a bug with unit tests, one would have to test Lucene.Net on German texts using GermanStemmer; the tests would have to index a word that contains 'sch' and one more letter combination that gets substituted. Moreover, that combination has to come before 'sch' in the word, so that substCount is already non-zero by the time substCount =+ 2 executes. If 'sch' is the first substitution in the word, substCount is still 0, and assigning 2 gives the same result as adding 2. Quite an unusual combination for a test, especially if you don't see the bug.
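To make the condition concrete, here is a minimal sketch that counts the removed characters with both operators. The SubstCountDemo class and its buggy flag are hypothetical and exist only for this comparison; the loop handles just the two substitutions discussed in this article, and it assumes, following the description above, that the 'st' replacement adds 1 to the counter.

```csharp
using System;
using System.Text;

public static class SubstCountDemo
{
    // Counts the characters removed by the "sch" and "st" substitutions.
    // With buggy == true the "sch" branch uses '=+' as in the original code.
    public static int CountRemoved(string word, bool buggy)
    {
        var buffer = new StringBuilder(word);
        int substCount = 0;
        for (int c = 0; c < buffer.Length; c++)
        {
            if (c < buffer.Length - 2 && buffer[c] == 's'
                && buffer[c + 1] == 'c' && buffer[c + 2] == 'h')
            {
                buffer[c] = '$';
                buffer.Remove(c + 1, 2);
                if (buggy)
                    substCount =+ 2;   // plain assignment of (+2)
                else
                    substCount += 2;   // intended compound assignment
            }
            else if (c < buffer.Length - 1
                     && buffer[c] == 's' && buffer[c + 1] == 't')
            {
                buffer[c] = '!';
                buffer.Remove(c + 1, 1);
                substCount += 1;
            }
        }
        return substCount;
    }

    public static void Main()
    {
        foreach (var word in new[] { "schule", "kapitalistischen" })
        {
            Console.WriteLine(
                $"{word}: buggy={CountRemoved(word, true)}, fixed={CountRemoved(word, false)}");
        }
        // schule: buggy=2, fixed=2
        // kapitalistischen: buggy=2, fixed=3
    }
}
```

A test on "schule" passes with either operator; only a word such as "kapitalistischen", where 'st' comes before 'sch', makes the two versions disagree.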


Unit tests and static analysis do not exclude each other; they are complementary software development methods [2]. I suggest downloading the PVS-Studio static analyzer and finding the bugs that unit testing missed.
