To get a trial key
fill out the form below
Team License (a basic version)
Enterprise License (extended version)
* By clicking this button you agree to our Privacy Policy statement

Request our prices
New License
License Renewal
--Select currency--
* By clicking this button you agree to our Privacy Policy statement

Free PVS-Studio license for Microsoft MVP specialists
* By clicking this button you agree to our Privacy Policy statement

To get the licence for your open-source project, please fill out this form
* By clicking this button you agree to our Privacy Policy statement

I am interested to try it on the platforms:
* By clicking this button you agree to our Privacy Policy statement

Message submitted.

Your message has been sent. We will email you at

If you haven't received our response, please do the following:
check your Spam/Junk folder and click the "Not Spam" button for our message.
This way, you won't miss messages from our team in the future.

Archeology for Entertainment, or Checki…

Archeology for Entertainment, or Checking Microsoft Word 1.1a with PVS-Studio

Apr 02 2014

The Microsoft company has recently made a present to all programmers eager to dig into some interesting stuff: they revealed the source codes of MS-DOS v 1.1, v 2.0 and Word for Windows 1.1a. The MS-DOS operating system is written in assembler, so the analyzer cannot be applied to it. But Word is written in C. Word 1.1a's source codes are almost 25 years old, but we still managed to analyze it. There's no practical use of it, of course. Just for fun.

Analysis based on pre-generated preprocessed files is no longer supported. Please consider using the Compiler Monitoring mode instead. This mode allows a generation and analysis of preprocessed files automatically during regular build process.


Where to find the source files

Perhaps many will like not this article itself, but the fact that one can download the source codes of MS-DOS v 1.1, v 2.0 and Word for Windows 1.1a. Those interested in digging the source files on their own should check the original source.

The press release: Computer History Museum Makes Historic MS-DOS and Word for Windows Source Code Available to the Public.

Checking Word 1.1a


Figure 1. Word for Windows 1.1a (click on the picture to enlarge).

Word for Windows 1.1a was released in 1990. Its source code was made public available on March 25, 2014. Word has always been a flagship product of Microsoft, and I, as well as many other programmers, was very eager to peek into the inside of the software product that so much contributed to Microsoft's commercial success.

I decided to check Word 1.1a's code with our tool PVS-Studio. It is a static analyzer for C/C++ code. That task was not that easy to fulfill, of course, as the analyzer is designed to work with projects developed at least in Visual Studio 2005. And now I had C source codes more than 20 years old. We can fairly call them a find from the prehistoric times. At least, the C language standard didn't exist then yet and every compiler had to be by itself. Fortunately, Word 1.1a's source codes appeared to be free of any specific nuances and abuse of non-standard compiler extensions.

Before you can perform code analysis, you need to get preprocessed files (*.i). Once you have generated them, you can use the PVS-Studio Standalone tool to run the analysis and examine the diagnostic messages. Of course, the analyzer is not designed to check 16-bit programs, but the results I got were quite enough to satisfy my curiosity. After all, a meticulous analysis of a 24 year old project just wouldn't make any sense.

So the basic obstacle was in obtaining the preprocessed files for the source codes. I asked my co-worker to find some solution, and he approached the task with much creativity: he chose to use GCC 4.8.1 to get the preprocessed files. I guess no one has ever mocked at Word 1.1's source codes in such a cruel way. How could it have occurred to him at all to use GCC? That ingenious imagination of his!

What's most interesting, it all turned out pretty fine. He wrote a small utility to run preprocessing by GCC 4.8.1 of each file from the folder it was stored in. As it displayed error messages regarding troubles with locating and including header files, we added -I switches into the launch parameters to specify the paths to the required files. A couple of header files which we failed to find were created empty. All the rest troubles with #include expanding were related to including resources, so we commented them out. The WIN macro was defined for preprocessing as the code contained branches both for WIN and MAC.

After that, PVS-Studio Standalone and I came into play. I noted down a few suspicious code fragments I want to show you. But let's first speak a bit more about the project itself.

A few words about Word 1.1a's code

The most complex functions

The following functions showed the highest cyclomatic complexity:

  • CursUpDown - 219;
  • FIdle - 192;
  • CmdDrCurs1 - 142.

#ifdef WIN23

While looking through the source codes, I came across "#ifdef WIN23" and couldn't help smiling. I even noted that fragment down. I thought it was a typo and the correct code was #ifdef WIN32.

When I saw WIN23 for the second time, I grew somewhat doubtful. And just then it struck me that I was viewing source files as old as 24 years by the moment. WIN23 stood for Windows 2.3.

Stern times

In some code fragment, I stumbled upon the following interesting line.

Assert((1 > 0) == 1);

It seems incredible that this condition can ever be false. But since there is such a check, there must be the reason for it. There was no language standard at that time. As far as I get it, it was a good style to check that the compiler's work met programmers' expectations.

Well, if we agree to treat K&R as a standard, the ((1 > 0) == 1) condition is always true, of course. But K&R was just a de facto standard. So it's just a check of the compiler's adequacy.

Analysis results

Now let's discuss the suspicious fragments I have found in the code. I guess it's the main reason why you are reading this article. So here we go.

Infinite loop

void GetNameElk(elk, stOut)
ELK elk;
unsigned char *stOut;
  unsigned char *stElk = &rgchElkNames[mpelkichName[elk]];
  unsigned cch = stElk[0] + 1;

  while (--cch >= 0)
    *stOut++ = *stElk++;

PVS-Studio's diagnostic message: V547 Expression '-- cch >= 0' is always true. Unsigned type value is always >= 0. mergeelx.c 1188

The "while (--cch >= 0)" loop will never terminate. The 'cch' variable is unsigned, which means that it will always be >= 0, however long you may decrease it.

A typo leading to an array overrun

uns rgwSpare0 [5];

  printUns ("rgwSpare0[0]   = ", Fib.rgwSpare0[5], 0, 0, fTrue);
  printUns ("rgwSpare0[1]   = ", Fib.rgwSpare0[1], 1, 1, fTrue);
  printUns ("rgwSpare0[2]   = ", Fib.rgwSpare0[2], 0, 0, fTrue);
  printUns ("rgwSpare0[3]   = ", Fib.rgwSpare0[3], 1, 1, fTrue);
  printUns ("rgwSpare0[4]   = ", Fib.rgwSpare0[4], 2, 2, fTrue);

PVS-Studio's diagnostic message: V557 Array overrun is possible. The '5' index is pointing beyond array bound. dnatfile.c 444

It turned out that the first line for some reason contains the text Fib.rgwSpare0[5]. That's incorrect: there are just 5 items in the array, therefore the largest index should be 4. The value '5' is just a typo. A zero index should have most likely been used in the first string:

printUns ("rgwSpare0[0]   = ", Fib.rgwSpare0[0], 0, 0, fTrue);

Uninitialized variable

FPrintSummaryInfo(doc, cpFirst, cpLim)
int doc;
CP cpFirst, cpLim;
  int fRet = fFalse;
  int pgnFirst = vpgnFirst;
  int pgnLast = vpgnLast;
  int sectFirst = vsectFirst;
  int sectLast = sectLast;

PVS-Studio's diagnostic message: V573 Uninitialized variable 'sectLast' was used. The variable was used to initialize itself. print2.c 599

The 'sectLast' variable is assigned to itself:

int sectLast = sectLast;

I suspect that it should have been initialized to the 'vsectLast' variable instead:

int sectLast = vsectLast;

I found one more error of that kind - must be a consequence of using the Copy-Paste method:

V573 Uninitialized variable 'sectLast' was used. The variable was used to initialize itself. print2.c 719

Undefined behavior

  static int  iBitmap = 0;
  iBitmap = ++iBitmap % MAXBITMAP;

PVS-Studio's diagnostic message: V567 Undefined behavior. The 'iBitmap' variable is modified while being used twice between sequence points. ddedit.c 107

I don't know how people used to treat such code 20 years ago, but in our times it is treated as hooliganism as it leads to undefined behavior.

Other fragments with similar issues:

  • V567 Undefined behavior. The 'iIcon' variable is modified while being used twice between sequence points. ddedit.c 132
  • V567 Undefined behavior. The 'iCursor' variable is modified while being used twice between sequence points. ddedit.c 150

Unsuccessful call of the printf() function

  int     cb;
  int     err;
  printf("\n - %d strings were read, "
         "%d were expected (decimal numbers) -\n");

PVS-Studio's diagnostic message: V576 Incorrect format. A different number of actual arguments is expected while calling 'printf' function. Expected: 3. Present: 1. dini.c 498

The printf() function is a variadic function. Passing or not passing arguments to it are both legal. In this case, the programmer forgot about the arguments, and it resulted in printing garbage all the time.

Uninitialized pointers

One of the auxiliary utilities included into the package of Word source files contains a very strange piece of code.

main(argc, argv)
int argc;
char * argv [];
  FILE * pfl;
  for (argi = 1; argi < argc; ++argi)
    if (FWild(argv[argi]))
      FEnumWild(argv[argi], FEWild, 0);
      FEWild(argv[argi], 0);


PVS-Studio's diagnostic message: V614 Uninitialized pointer 'pfl' used. Consider checking the first actual argument of the 'fclose' function. eldes.c 87

The 'pfl' variable is initialized neither before the loop nor inside it, while the fclose(pfl) function is called multiple times. It all, however, may have worked pretty well. The function would return an error status and the program would go on running.

And here's another dangerous function which will most likely cause a program crash.

FPathSpawn( rgsz )
char *rgsz[];
{ /* puts the correct path at the beginning of rgsz[0]
     and calls FSpawnRgsz */
  char *rgsz0;

  strcpy(rgsz0, szToolsDir);
  strcat(rgsz0, "\\");
  strcat(rgsz0, rgsz[0]);
  return FSpawnRgsz(rgsz0, rgsz);

PVS-Studio's diagnostic message: V614 Uninitialized pointer 'rgsz0' used. Consider checking the first actual argument of the 'strcpy' function. makeopus.c 961

The ' rgsz0' pointer is not initialized to anything. However, it doesn't prevent copying a string into it.

Typo in a condition

#define wkHdr    0x4000
#define wkFtn    0x2000
#define wkAtn    0x0008
#define wkSDoc    (wkAtn+wkFtn+wkHdr)

CMD CmdGoto (pcmb)
CMB * pcmb;
  int wk = PwwdWw(wwCur)->wk;
    if (wk | wkSDoc)
      NewCurWw((*hmwdCur)->wwUpper, fTrue);

PVS-Studio's diagnostic message: V617 Consider inspecting the condition. The '(0x0008 + 0x2000 + 0x4000)' argument of the '|' bitwise operation contains a non-zero value. dlgmisc.c 409

The (wk | wkSDoc) condition is always true. The programmer must have actually meant to write the following code instead:

if (wk & wkSDoc)

That is, the | and & operators are swapped by mistake.

And finally a lengthy but simple sample

int TmcCharacterLooks(pcmb)
CMB * pcmb;
  if (qps < 0)
    pcab->wCharQpsSpacing = -qps;
    pcab->iCharIS = 2;
  else  if (qps > 0)
    pcab->iCharIS = 1;
    pcab->iCharIS = 0;
  if (hps < 0)
    pcab->wCharHpsPos = -hps;
    pcab->iCharPos = 2;
  else  if (hps > 0)
    pcab->iCharPos = 1;
    pcab->iCharPos = 1;

PVS-Studio's diagnostic message: V523 The 'then' statement is equivalent to the 'else' statement. dlglook1.c 873

When working with the 'qps' variable, the following values are written into 'pcab->iCharIS': 2, 1, 0.

The 'hps' variable is handled in a similar way, but in this case, some suspicious values are saved into the variable 'pcab->iCharPos': 2, 1, 1.

It must be a typo: a zero was most likely meant to be used at the very end.


I have found very few strange fragments. There are two reasons for that. Firstly, I found the code to be skillfully and clearly written. Secondly, the analysis had to be incomplete, while teaching the analyzer the specifics of the old C language wouldn't be of any use.

I hope you have enjoyed a few minutes of interesting reading. Thank you for attention. And welcome to try the PVS-Studio analyzer on your code.

Popular related articles
PVS-Studio for Java

Date: Jan 17 2019

Author: Andrey Karpov

In the seventh version of the PVS-Studio static analyzer, we added support of the Java language. It's time for a brief story of how we've started making support of the Java language, how far we've co…
PVS-Studio ROI

Date: Jan 30 2019

Author: Andrey Karpov

Occasionally, we're asked a question, what monetary value the company will receive from using PVS-Studio. We decided to draw up a response in the form of an article and provide tables, which will sho…
Free PVS-Studio for those who develops open source projects

Date: Dec 22 2018

Author: Andrey Karpov

On the New 2019 year's eve, a PVS-Studio team decided to make a nice gift for all contributors of open-source projects hosted on GitHub, GitLab or Bitbucket. They are given free usage of PVS-Studio s…
The way static analyzers fight against false positives, and why they do it

Date: Mar 20 2017

Author: Andrey Karpov

In my previous article I wrote that I don't like the approach of evaluating the efficiency of static analyzers with the help of synthetic tests. In that article, I give the example of a code fragment…
The Ultimate Question of Programming, Refactoring, and Everything

Date: Apr 14 2016

Author: Andrey Karpov

Yes, you've guessed correctly - the answer is "42". In this article you will find 42 recommendations about coding in C++ that can help a programmer avoid a lot of errors, save time and effort. The au…
The Last Line Effect

Date: May 31 2014

Author: Andrey Karpov

I have studied many errors caused by the use of the Copy-Paste method, and can assure you that programmers most often tend to make mistakes in the last fragment of a homogeneous code block. I have ne…
Appreciate Static Code Analysis!

Date: Oct 16 2017

Author: Andrey Karpov

I am really astonished by the capabilities of static code analysis even though I am one of the developers of PVS-Studio analyzer myself. The tool surprised me the other day as it turned out to be sma…
Characteristics of PVS-Studio Analyzer by the Example of EFL Core Libraries, 10-15% of False Positives

Date: Jul 31 2017

Author: Andrey Karpov

After I wrote quite a big article about the analysis of the Tizen OS code, I received a large number of questions concerning the percentage of false positives and the density of errors (how many erro…
How PVS-Studio Proved to Be More Attentive Than Three and a Half Programmers

Date: Oct 22 2018

Author: Andrey Karpov

Just like other static analyzers, PVS-Studio often produces false positives. What you are about to read is a short story where I'll tell you how PVS-Studio proved, just one more time, to be more atte…
The Evil within the Comparison Functions

Date: May 19 2017

Author: Andrey Karpov

Perhaps, readers remember my article titled "Last line effect". It describes a pattern I've once noticed: in most cases programmers make an error in the last line of similar text blocks. Now I want t…

Comments (0)

Next comments
This website uses cookies and other technology to provide you a more personalized experience. By continuing the view of our web-pages you accept the terms of using these files. If you don't want your personal data to be processed, please, leave this site.
Learn More →