>
>
>
One Doesn't Simply Edit Subtitles

Sviatoslav Razmyslov
Articles: 90

One Doesn't Simply Edit Subtitles

How many people use subtitles worldwide? Probably, a lot. In the Internet you can find subtitles for almost any film in many languages for educational purposes or just because of love to the original sound. All this is created in special programs. As in most programs, Subtitle Edit has not been without surprises in the form of bugs.

Introduction

Subtitle Edit is a free editor with a huge list of abilities. This is a great project written in C# with open source code. The program is very popular and is issued in the first lines of the results of search engines, the project website lists numerous awards. In a repository on GitHub, you can see that the project is actively developing, has a lot of stars and forks. Generally speaking, it's a good project for taking part in its development. Originally, I was just looking for a library for subtitles parsing, because most subtitle formats are not text, but I'll get back to my project later.

There are 310 open Issues at the project page on GitHub. Perhaps, the work with the analysis results will allow to fix something. PVS-Studio, a static analyzer, which was used to analyze the code, issued 460 warnings (total for all levels of certainty). Almost everything can and should be corrected. This is related to the fact that there are almost no recommendatory diagnostics in the analyzer. Found results generally indicate the real problems in code. In this article, I'll provide examples of code, but I'll choose only those errors that can strongly impact the work.

I will send a Pull Request for more or less understandable code fragments with corrections. But it's much better for the author of the project to get acquainted with all the results of the analysis by reviewing the project by himself.

Ignoring the styles

Here is the way how the form fragment for the style specification for the subtitles looks like:

And here's the analyzer warning for the code that is associated with this form:

V3003 CWE-570 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 300, 302. SubStationAlphaStyles.cs 300

public static void AddStyle(ListView lv, SsaStyle ssaStyle,
  Subtitle subtitle, bool isSubstationAlpha)
{
  ....
  if (ssaStyle.Bold || ssaStyle.Italic)
    subItem.Font = new Font(...., FontStyle.Bold |
                                  FontStyle.Italic);
  else if (ssaStyle.Bold)
    subItem.Font = new Font(...., FontStyle.Bold);
  else if (ssaStyle.Italic)
    subItem.Font = new Font(...., FontStyle.Italic);
  else if (ssaStyle.Italic)
    subItem.Font = new Font(...., FontStyle.Regular);
  ....
}

The analyzer issued just 4 warnings on this code fragment. It is not surprising, because there was an error in almost every line. Moreover, the option with ssaStyle.Underline is not considered here.

It's better to rewrite the code as follows and do it very carefully:

....
if (ssaStyle.Bold)
  fontStyles |= FontStyle.Bold;
....
subItem.Font = new Font(...., fontStyles);
....

The last paragraph of the text is not removing

V3022 CWE-570 Expression '_networkSession != null && _networkSession.LastSubtitle != null && i < _networkSession.LastSubtitle.Paragraphs.Count' is always false. Main.cs 7242

private void DeleteSelectedLines()
{
  ....
  if (_networkSession != null)                // <=
  {
    _networkSession.TimerStop();
    NetworkGetSendUpdates(indices, 0, null);
  }
  else
  {
    indices.Reverse();
    foreach (int i in indices)
    {
      _subtitle.Paragraphs.RemoveAt(i);
      if (_networkSession != null &&          // <=
          _networkSession.LastSubtitle != null &&
          i < _networkSession.LastSubtitle.Paragraphs.Count)
        _networkSession.LastSubtitle.Paragraphs.RemoveAt(i);
    }
  ....
  }
  ....
}

The variable _networkSession was already verified in the first condition, therefore, in the else branch it'll definitely be null. Such a combination of checks has led to a false condition and unreachable code.

Loss of functionality due to typos

V3003 CWE-570 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 113, 115. SsaStyle.cs 113

public string ToRawSsa(string styleFormat)
{
  var sb = new StringBuilder();
  sb.Append("Style: ");
  var format = ....;
  for (int i = 0; i < format.Length; i++)
  {
    string f = format[i].Trim();
    if (f == "name")
      sb.Append(Name);
    ....
    else if (f == "shadow")    // <=
      sb.Append(OutlineWidth); // <=
    else if (f == "shadow")    // <=
      sb.Append(ShadowWidth);  // <=
    ....
  }
  ....
}

Typos in the conditions lead to the appearance of unreachable code branches. Very often such code is a consequence of Copy-Paste programming. In the above example, the second repeated condition will never be executed. And this is the most simple and compact example that I chose from the article. Many similar examples have been found to describe the problem in a separate section.

Here's the entire list of Copy-Paste code that requires fixing:

  • V3003 CWE-570 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 268, 270. ExportCustomTextFormat.cs 268
  • V3003 CWE-570 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 278, 280. ExportCustomTextFormat.cs 278
  • V3003 CWE-570 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 220, 252. SetSyncPoint.cs 220
  • V3003 CWE-570 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 712, 743. ExportPngXml.cs 712
  • V3003 CWE-570 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 712, 743. ExportPngXml.cs 712
  • V3003 CWE-570 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 162, 178. LambdaCap.cs 162
  • V3003 CWE-570 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 166, 182. LambdaCap.cs 166
  • V3003 CWE-570 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 170, 186. LambdaCap.cs 170
  • V3003 CWE-570 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 174, 190. LambdaCap.cs 174
  • V3003 CWE-570 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 398, 406. Ebu.cs 398
  • V3021 CWE-561 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. This means that the second 'if' statement is senseless FinalCutProTest2Xml.cs 22
  • V3021 CWE-561 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. This means that the second 'if' statement is senseless FinalCutProTextXml.cs 21
  • V3021 CWE-561 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. This means that the second 'if' statement is senseless FinalCutProXml.cs 22

Something's wrong with a picture size of 720 x 480

V3022 CWE-570 Expression 'param.Bitmap.Width == 720 && param.Bitmap.Width == 480' is always false. Probably the '||' operator should be used here. ExportPngXml.cs 1808

private static string FormatFabTime(TimeCode time,
                                    MakeBitmapParameter param)
{
  if (param.Bitmap.Width == 720 && param.Bitmap.Width == 480)
    return $"....";

  // drop frame
  if (Math.Abs(param.... - 24 * (999 / 1000)) < 0.01 ||
      Math.Abs(param.... - 29 * (999 / 1000)) < 0.01 ||
      Math.Abs(param.... - 59 * (999 / 1000)) < 0.01)
      return $"....";

  return $"....";
}

Confusion with Width and Height is a classic example of a typo. But in this function there is another suspicious thing. All reductions of the strings that I replaced with four dots, are the same strings: {time.Hours:00};{time.Minutes:00};{time.Seconds:00};{SubtitleFormat.MillisecondsToFramesMaxFrameRate(time.Milliseconds):00}. I.e. two conditions do not affect the result of the function, the function always returns the same thing.

Downloading of "matroska" is always successful

V3009 CWE-393 It's odd that this method always returns one and the same value of 'true'. Main.cs 10153

private bool LoadTextSTFromMatroska(
  MatroskaTrackInfo matroskaSubtitleInfo,
  MatroskaFile matroska,
  bool batchMode)
{
  ....
  _fileDateTime = new DateTime();
  _converted = true;
  if (batchMode)
      return true;

  SubtitleListview1.Fill(_subtitle, _subtitleAlternate);
  if (_subtitle.Paragraphs.Count > 0)
      SubtitleListview1.SelectIndexAndEnsureVisible(0);

  ShowSource();
  return true;
}

A function is found that always returns the true value. Perhaps, it's an error. The value of this function is checked in four places of the program. Also near there are similar functions in code, for example, LoadDvbFromMatroska(), and it returns different values.

Useless or incorrect code

V3022 CWE-571 Expression 'listBoxVobFiles.Items.Count > 0' is always true. DvdSubRip.cs 533

private void DvdSubRip_Shown(object sender, EventArgs e)
{
  if (string.IsNullOrEmpty(_initialFileName))
    return;

  if (_initialFileName.EndsWith(".ifo", ....))
  {
    OpenIfoFile(_initialFileName);
  }
  else if (_initialFileName.EndsWith(".vob", ....))
  {
    listBoxVobFiles.Items.Add(_initialFileName);
    buttonStartRipping.Enabled = listBoxVobFiles.Items.Count > 0;
  }
  _initialFileName = null;
}

An element is added in the listBoxVobFiles list and then checked if the list is empty. It is obvious that there will be at least one element. And there are more than thirty checks in the project that are always true or false.

Just a fun example

V3005 The 'positionInfo' variable is assigned to itself. WebVTT.cs 79

internal static string GetPositionInfoFromAssTag(Paragraph p)
{
  ....
  if (!string.IsNullOrEmpty(line))
  {
    if (positionInfo == null)
      positionInfo = " line:" + line;
    else
      positionInfo = positionInfo += " line:" + line;
  }
  ....
}

Choosing between the options of recording "A = A + n" and "A += n", the author of this code chose a compromise variant "A = A += n" :D

Conclusion

To understand how to fix the analyzer warning, one needs to have the understanding of the subtitle formats and features of their processing. So, if there are those who wish to support the project and provide the author of the project on GitHub with Pull Requests with corrections, here is the link to download the PVS-Studio HTML report with warnings of High/Medium levels.