
Andrey Karpov

Avoid adding a new library to the project

Suppose you need to implement some functionality X in your project. Software development theorists will say that you should take an existing library Y and use it to implement what you need. Indeed, this is the classic approach in software development: reusing libraries that you or others created earlier (third-party libraries). And most programmers go this way.

However, in their articles and books those theorists forget to mention what hell it becomes to support several dozen third-party libraries that have been living in your project for about 10 years.

I strongly recommend that you avoid adding a new library to a project, so as not to choke the project with libraries. Please don't get me wrong. I am not saying that you should stop using libraries altogether and write everything yourself. That would be stupid, of course. But sometimes a new library is added at the whim of some developer who wants to add a small, cool "feature" to the project. It's not hard to add a new library. But then the whole team will have to carry the burden of supporting it for many years.

Tracking the evolution of several large projects, I have seen quite a lot of problems caused by a large number of third-party libraries. I will list only some of the issues, but even this list should provoke some thought:

  • Adding new libraries quickly increases the project size. In our era of fast Internet and large SSDs this is not a big problem, of course. But it's rather unpleasant when fetching the project from the version control system takes 10 minutes instead of 1.
  • Even if you use just 1% of a library's capabilities, it is usually included in the project as a whole. As a result, if libraries are used as prebuilt modules (for example, DLLs), the distribution size grows very fast. If you use a library as source code, the compile time increases significantly.
  • The infrastructure for building the project becomes more complicated. Some libraries require additional components. A simple example: a library needs Python to build. As a result, after a while you need a whole set of additional programs to build the project, so the probability that something will not work increases. It's hard to explain; you have to experience it. In big projects something fails all the time, and you have to put in effort to make everything work and compile.
  • If you care about vulnerabilities, you must update third-party libraries regularly. Attackers have good reason to study library code in search of vulnerabilities. Firstly, many libraries are open source, and secondly, a weak point found in one library is a master key to the many applications that use it.
  • You will have trouble upgrading to a new version of the compiler. There will definitely be a few libraries that are not ready for the new compiler, and you'll have to either wait or fix the library yourself.
  • You will have problems when moving to a different compiler. For example, you are using Visual C++ and want to switch to Intel C++. There will surely be a couple of libraries where something goes wrong.
  • You will have problems moving to a different platform. Not necessarily even a totally different platform. Say you decide to port a Win32 application to Win64. You will have the same problems. Most likely, several libraries won't be ready for this, and you will wonder what to do with them. It is especially unpleasant when the library lies dormant and is no longer being developed.
  • Sooner or later, if you use lots of C libraries, where types are not placed in namespaces, you'll start getting name clashes. These cause compilation errors or hidden errors. For example, the wrong enum constant may be used instead of the one you intended.
  • If your project uses a lot of libraries, adding another one won't seem harmful. You can draw an analogy with the broken windows theory. As a result, the growth of the project turns into uncontrolled chaos.
  • There are probably many more downsides to adding new libraries that I'm not aware of. But in any case, additional libraries make the project harder to maintain. Problems can show up where they were least expected.
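
To make the point about hidden name-clash errors more concrete, here is a minimal sketch. Both "libraries" are hypothetical and invented purely for illustration (the names img_format, exp_format, img_format_name and wrong_format_selected do not come from any real library). It assumes, as is common in C APIs, that the function takes a plain int, so the compiler has no reason to complain when a constant from the other library is passed.

```c
#include <string.h>

/* --- hypothetical header of an "imaging" library --- */
enum img_format { IMG_PNG, IMG_JPEG };   /* IMG_PNG == 0, IMG_JPEG == 1 */

/* Like many C APIs, the function takes a plain int, not the enum type. */
const char *img_format_name(int format)
{
    switch (format) {
    case IMG_PNG:  return "png";
    case IMG_JPEG: return "jpeg";
    default:       return "unknown";
    }
}

/* --- hypothetical header of an "export" library --- */
enum exp_format { EXP_JPEG, EXP_PNG };   /* EXP_JPEG == 0, EXP_PNG == 1 */

/* Returns 1 if the mixed-up constant silently selects the wrong format. */
int wrong_format_selected(void)
{
    /* The programmer meant IMG_PNG but typed the other library's constant.
       EXP_PNG has the value 1, which the imaging library reads as JPEG. */
    const char *name = img_format_name(EXP_PNG);
    return strcmp(name, "png") != 0;
}
```

In C all enum constants share one global scope and convert to int freely, so nothing stops this code from compiling. The mistake only shows up at run time, when the image is written in the wrong format.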

Again, I should emphasize: I am not saying that we should stop using third-party libraries altogether. If the program has to work with images in PNG format, we'll take the LibPNG library and not reinvent the wheel.

But even when working with PNG we need to stop and think. Do we need a library? What do we want to do with the images? If the task is just to save an image to a *.png file, you can get by with system functions. For example, in a Windows application you could use WIC. And if you're already using MFC, there is no need to complicate the code, because it has the CImage class (see the discussion on Stack Overflow). Minus one library, great!

Let me give you an example from my own practice. While developing the PVS-Studio analyzer, we needed simple regular expressions in a couple of diagnostics. In general, I am convinced that static analysis isn't the right place for regular expressions. This is an extremely inefficient approach. I even wrote an article on this topic. But sometimes you just need to find something in a string with the help of a regular expression.

It was possible to "squeeze in" one of the existing libraries.

It was clear that all of them would be redundant, but regular expressions were still needed and we had to come up with something.

By sheer coincidence, at that very moment I was reading the book "Beautiful Code" (ISBN 9780596510046). This book is about simple and elegant solutions. And there I came across an extremely simple implementation of regular expressions. Just a few dozen lines. And that's it!

I decided to use that implementation in PVS-Studio. And you know what? Its capabilities are still enough for us; we simply don't need complex regular expressions.

Conclusion: instead of adding a new library, we spent half an hour writing the functionality we needed. We suppressed the desire to use one more library. And it turned out to be a great decision: time has shown that we really didn't need that library.

This case really convinced me that the simpler the solution, the better. By avoiding new libraries where possible, you make your project simpler.

Readers may be interested in what the code for matching regular expressions looked like. I'll reproduce it here from the book.

See how graceful it is. The code was slightly changed when it was integrated into PVS-Studio, but its essence remains the same. So, the code from the book:

// regular expression format
// c        Matches any literal character c
// . (dot)  Matches any single character
// ^        Matches the beginning of the input string
// $        Matches the end of the input string
// *        Matches zero or more occurrences of the previous
//          character

int matchhere(char *regexp, char *text);
int matchstar(int c, char *regexp, char *text);

// match: search for regular expression anywhere in text
int match(char *regexp, char *text)
{
  if (regexp[0] == '^')
    return matchhere(regexp+1, text);
  do { /* must look even if string is empty */
   if (matchhere(regexp, text))
     return 1;
  } while (*text++ != '\0');
  return 0;
}

// matchhere: search for regexp at beginning of text 
int matchhere(char *regexp, char *text)
{
   if (regexp[0] == '\0')
     return 1;
   if (regexp[1] == '*')
     return matchstar(regexp[0], regexp+2, text);

   if (regexp[0] == '$' && regexp[1] == '\0')
     return *text == '\0';
   if (*text!='\0' && (regexp[0]=='.' || regexp[0]==*text))
     return matchhere(regexp+1, text+1);
   return 0;
}

// matchstar: search for c*regexp at beginning of text
int matchstar(int c, char *regexp, char *text)
{
  do {   /* a * matches zero or more instances */
    if (matchhere(regexp, text))
      return 1;
  } while (*text != '\0' && (*text++ == c || c == '.'));
  return 0;
}
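
As a rough sketch of how such a matcher might be called, the listing below repeats the three functions in a self-contained form and adds one small helper. Two things here are my assumptions rather than part of the book's code or of PVS-Studio: the const qualifiers (so that string literals can be passed without casts) and the helper count_matching_lines, which is a hypothetical name introduced only for this illustration.

```c
#include <stddef.h>

int matchhere(const char *regexp, const char *text);
int matchstar(int c, const char *regexp, const char *text);

/* match: search for regexp anywhere in text */
int match(const char *regexp, const char *text)
{
    if (regexp[0] == '^')
        return matchhere(regexp + 1, text);
    do {    /* must look even if string is empty */
        if (matchhere(regexp, text))
            return 1;
    } while (*text++ != '\0');
    return 0;
}

/* matchhere: search for regexp at beginning of text */
int matchhere(const char *regexp, const char *text)
{
    if (regexp[0] == '\0')
        return 1;
    if (regexp[1] == '*')
        return matchstar(regexp[0], regexp + 2, text);
    if (regexp[0] == '$' && regexp[1] == '\0')
        return *text == '\0';
    if (*text != '\0' && (regexp[0] == '.' || regexp[0] == *text))
        return matchhere(regexp + 1, text + 1);
    return 0;
}

/* matchstar: search for c*regexp at beginning of text */
int matchstar(int c, const char *regexp, const char *text)
{
    do {    /* a * matches zero or more instances */
        if (matchhere(regexp, text))
            return 1;
    } while (*text != '\0' && (*text++ == c || c == '.'));
    return 0;
}

/* Hypothetical helper: count how many of the n lines match regexp. */
size_t count_matching_lines(const char *regexp,
                            const char *const *lines, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; ++i)
        if (match(regexp, lines[i]))
            ++count;
    return count;
}
```

For instance, match("^ab*c$", "abbbc") returns 1, while match("^bc", "abc") returns 0 because of the anchor. The whole thing needs nothing beyond the standard headers, which is exactly the point: no new dependency enters the build.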

Recommendation

Don't rush to add new libraries to the project. Add one only when there is no way to manage without it.

Here are some possible workarounds:

  • Check whether your system's API or one of the libraries you already use provides the required functionality. It's worth investigating.
  • If you plan to use only a small piece of a library's functionality, it makes sense to implement it yourself. Adding a library "just in case" is a poor argument. Almost certainly, the library won't be used much in the future. Programmers sometimes want universality that is not actually needed.
  • If several libraries can solve your task, choose the simplest one that meets your requirements. As I said before, drop the idea "it's a cool library, I'll take it just in case".
  • Before adding a new library, sit back and think. Maybe even take a break, get some coffee, and discuss it with your colleagues. Perhaps you'll see that you can solve the problem in a completely different way, without using third-party libraries.

P.S. Not everybody will accept what I say here, for example my recommendation to use WinAPI instead of a portable universal library. Some will object that this "binds" the project to one operating system and makes it very difficult to port the program later. I do not agree. Quite often the idea "and then we'll port it to a different operating system" exists only in the programmer's mind. Managers may not need such a task at all. Another possibility is that the project will kick the bucket under its own complexity and universality before it gains popularity and ever needs porting. Also, don't forget the seventh item in the list of problems above (moving to a different platform).