Bugs Found by LibreOffice in PVS-Studio
- About the analyzer
- Memory leak
- Long analysis
- Analysis complications
- Old path format
- Unlucky serialization
Usually we check various projects with PVS-Studio. This time it was the other way round: we checked PVS-Studio with LibreOffice :-). And then we managed to run the usual check as well.
Our articles about project checks evoke different reactions from readers, from "Aren't you bored of advertising your tool already?" to "Thank you very much! PVS-Studio is really a great tool!" To be fair, I would like to note that no advertising managers ever take part in the project checks; it's only the PVS-Studio developers and a translator who do the job. So our contribution to the open-source community is real and meaningful. Developers do not always show interest in keeping up the conversation, but they do fix the bugs we report to them by email. Using the LibreOffice check as an example (the article about it will be published soon), I'd like to tell you how our checks influence the analyzer itself and what work we have done.
About the analyzer
PVS-Studio is a static analyzer that detects errors in the source code of C/C++ programs. Its usage and integration capabilities are constantly evolving, so open-source projects serve not only as demonstrations but also as impartial tests for our analyzer.
The LibreOffice project turned out to be a good test for the analyzer and made everyone in the PVS-Studio team spend some effort to resolve the problems revealed by the analysis.
Now I'll tell you about the problems we faced when running that check.
Memory leak
LibreOffice is built with MS Visual C++ 2013 in Cygwin. Not so long ago, the PVS-Studio Standalone utility acquired the ability to check any project. Regardless of the specifics of the build system, you can now simply enable the "Compiler Monitoring" option and start the project build. To learn more about this feature, see the article "PVS-Studio Now Supports Any Build System under Windows and Any Compiler. Easy and Right Out of the Box".
In short, the utility can extract from the processes running under Windows all the information necessary for starting the analysis in the same environment. So, during a project build, a few hundred Kbytes of unmanaged memory are allocated to store the launch command line, current folder, environment variables and so on. For processes recognized as a supported compiler, the information was copied into managed memory, while the unmanaged memory was meant to be freed in every case. But, as we discovered, that did not happen for the environment variables. For each process, about 500 Kbytes on average failed to be freed.
It didn't cause any serious trouble with previous projects (at least we didn't notice anything, and users didn't complain either). But when LibreOffice is built through Make, a huge number of processes are launched that have nothing to do with the compiler. During the several hours of the build, more than one hundred thousand processes were launched, so a total of 25 Gbytes piled up. After the fix, the memory used by the monitoring system dropped to 1.8 Gbytes.
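The leak pattern described here can be modeled in a few lines. The Python sketch below is only an illustration, not the PVS-Studio Standalone code: the `NativeHeap` class, the function names and the buffer sizes are all assumptions made up for the example. It tracks "unmanaged" allocations by hand and compares a handler that forgets to release the environment block with one that releases every buffer in a `finally` clause.

```python
class NativeHeap:
    """Toy stand-in for unmanaged memory: tracks bytes still allocated."""

    def __init__(self):
        self._blocks = {}
        self._next_handle = 1

    def alloc(self, size):
        handle = self._next_handle
        self._next_handle += 1
        self._blocks[handle] = size
        return handle

    def free(self, handle):
        del self._blocks[handle]

    @property
    def outstanding(self):
        """Total number of bytes that were allocated and never freed."""
        return sum(self._blocks.values())


# Hypothetical per-process buffer sizes in bytes (not measured values).
CMDLINE_SIZE = 4 * 1024
CWD_SIZE = 1 * 1024
ENV_SIZE = 250 * 1024


def capture_leaky(heap, is_compiler):
    """Buggy variant: the environment block is never released."""
    cmdline = heap.alloc(CMDLINE_SIZE)
    cwd = heap.alloc(CWD_SIZE)
    env = heap.alloc(ENV_SIZE)
    # Only compiler processes get a "managed" copy of the data.
    record = "managed copy" if is_compiler else None
    heap.free(cmdline)
    heap.free(cwd)
    # Bug: heap.free(env) is missing, so ENV_SIZE bytes leak per process.
    return record


def capture_fixed(heap, is_compiler):
    """Fixed variant: every buffer is released, whatever the process is."""
    handles = [heap.alloc(size) for size in (CMDLINE_SIZE, CWD_SIZE, ENV_SIZE)]
    try:
        return "managed copy" if is_compiler else None
    finally:
        for handle in handles:
            heap.free(handle)


def simulate(capture, process_count, compiler_every=10):
    """Run `capture` for many processes and return the bytes left allocated."""
    heap = NativeHeap()
    for i in range(process_count):
        capture(heap, is_compiler=(i % compiler_every == 0))
    return heap.outstanding
```

With these made-up sizes, `simulate(capture_leaky, 1000)` leaves one environment block per process allocated, while `simulate(capture_fixed, 1000)` returns 0. A small per-process leak stays invisible in a build with a few hundred processes and becomes a problem only when the count reaches the hundreds of thousands.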
Long analysis
The whole build process, including library compilation, involved 12245 source files. Unfortunately, analyzing such a huge number of files took about 15 hours, so we made some optimizations in the analyzer kernel that allowed us to re-analyze the project in as little as 9 hours. That is twice the project build time, but this speed is still quite acceptable.
Analysis complications
If the analyzer can't figure out some construct in the source code, it generates the V001 message for that file and skips the fragment, which very rarely affects the analysis results. Nevertheless, we studied all the V001 messages for this project and fixed their causes.
Old path format
When checking the project, we discovered that the system paths had been defined in the old 8.3 short-name format, for example "C:/PROGRA~2/MICROS~4.0/VC/include". This format is fully supported by the analyzer kernel and plugin, but the message filtering mechanism failed for the system files, so we had to make some fixes.
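To see how a short-name path can slip past a path-based filter, consider a sketch in Python. It is not the analyzer's filtering code: the function names, the expansion table and the long directory names in it are assumptions chosen for illustration (on a real Windows system the expansion would come from the OS, for example through the Win32 `GetLongPathName` function). The idea shown is simply that both sides of the comparison must be brought to the same form before a prefix check.

```python
import ntpath

# Hypothetical short-name -> long-name table, for illustration only.
# The real long names on a given machine may differ.
SHORT_TO_LONG = {
    "progra~2": "Program Files (x86)",
    "micros~4.0": "Microsoft Visual Studio 12.0",
}


def normalize(path):
    """Bring a Windows path to one canonical form for comparison."""
    path = ntpath.normpath(path)  # unify separators: '/' becomes '\\'
    drive, rest = ntpath.splitdrive(path)
    parts = [SHORT_TO_LONG.get(part.lower(), part) for part in rest.split("\\")]
    return (drive + "\\".join(parts)).lower()


def is_system_file_naive(path, system_dirs):
    """Raw prefix check: misses paths written in a different format."""
    return any(path.startswith(directory) for directory in system_dirs)


def is_system_file(path, system_dirs):
    """Prefix check on normalized paths, matching whole directory names."""
    normalized = normalize(path)
    return any(
        normalized.startswith(normalize(directory) + "\\")
        for directory in system_dirs
    )


SYSTEM_DIRS = [r"C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include"]
HEADER = "C:/PROGRA~2/MICROS~4.0/VC/include/vector"
```

Here `is_system_file_naive(HEADER, SYSTEM_DIRS)` is false even though the header lives in the listed directory, because the two strings differ in separators and name format. `is_system_file(HEADER, SYSTEM_DIRS)` recognizes it once both paths are normalized.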
Unlucky serialization
This issue is not entirely about the PVS-Studio products themselves. The PVS-Studio Standalone utility, in which LibreOffice was checked, recently gained a better file navigation mechanism: it now supports navigation through included headers and searching for types and variables in dependent files. All the dependencies are collected during the check and saved in the same folder as the *.plog file. Unfortunately, the standard class System.Runtime.Serialization.Formatters.Binary.BinaryFormatter could not serialize structures that large: an internal exception was thrown. So now we use the Protocol Buffers library, which copes with this task very well.
The check of the LibreOffice project resulted in an article aimed at improving one more open-source project, as well as in useful fixes to the PVS-Studio products. The article about the bugs found in LibreOffice will be published soon. And we want to say thank you to the LibreOffice project, which has helped us make our analyzer better!