Webinar: Parsing C++ - 10.10
How many people use subtitles worldwide? Probably, a lot. In the Internet you can find subtitles for almost any film in many languages for educational purposes or just because of love to the original sound. All this is created in special programs. As in most programs, Subtitle Edit has not been without surprises in the form of bugs.
Subtitle Edit is a free editor with a huge list of abilities. This is a great project written in C# with open source code. The program is very popular and is issued in the first lines of the results of search engines, the project website lists numerous awards. In a repository on GitHub, you can see that the project is actively developing, has a lot of stars and forks. Generally speaking, it's a good project for taking part in its development. Originally, I was just looking for a library for subtitles parsing, because most subtitle formats are not text, but I'll get back to my project later.
There are 310 open Issues at the project page on GitHub. Perhaps, the work with the analysis results will allow to fix something. PVS-Studio, a static analyzer, which was used to analyze the code, issued 460 warnings (total for all levels of certainty). Almost everything can and should be corrected. This is related to the fact that there are almost no recommendatory diagnostics in the analyzer. Found results generally indicate the real problems in code. In this article, I'll provide examples of code, but I'll choose only those errors that can strongly impact the work.
I will send a Pull Request for more or less understandable code fragments with corrections. But it's much better for the author of the project to get acquainted with all the results of the analysis by reviewing the project by himself.
Here is the way how the form fragment for the style specification for the subtitles looks like:
And here's the analyzer warning for the code that is associated with this form:
V3003 CWE-570 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 300, 302. SubStationAlphaStyles.cs 300
public static void AddStyle(ListView lv, SsaStyle ssaStyle,
Subtitle subtitle, bool isSubstationAlpha)
{
....
if (ssaStyle.Bold || ssaStyle.Italic)
subItem.Font = new Font(...., FontStyle.Bold |
FontStyle.Italic);
else if (ssaStyle.Bold)
subItem.Font = new Font(...., FontStyle.Bold);
else if (ssaStyle.Italic)
subItem.Font = new Font(...., FontStyle.Italic);
else if (ssaStyle.Italic)
subItem.Font = new Font(...., FontStyle.Regular);
....
}
The analyzer issued just 4 warnings on this code fragment. It is not surprising, because there was an error in almost every line. Moreover, the option with ssaStyle.Underline is not considered here.
It's better to rewrite the code as follows and do it very carefully:
....
if (ssaStyle.Bold)
fontStyles |= FontStyle.Bold;
....
subItem.Font = new Font(...., fontStyles);
....
V3022 CWE-570 Expression '_networkSession != null && _networkSession.LastSubtitle != null && i < _networkSession.LastSubtitle.Paragraphs.Count' is always false. Main.cs 7242
private void DeleteSelectedLines()
{
....
if (_networkSession != null) // <=
{
_networkSession.TimerStop();
NetworkGetSendUpdates(indices, 0, null);
}
else
{
indices.Reverse();
foreach (int i in indices)
{
_subtitle.Paragraphs.RemoveAt(i);
if (_networkSession != null && // <=
_networkSession.LastSubtitle != null &&
i < _networkSession.LastSubtitle.Paragraphs.Count)
_networkSession.LastSubtitle.Paragraphs.RemoveAt(i);
}
....
}
....
}
The variable _networkSession was already verified in the first condition, therefore, in the else branch it'll definitely be null. Such a combination of checks has led to a false condition and unreachable code.
V3003 CWE-570 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 113, 115. SsaStyle.cs 113
public string ToRawSsa(string styleFormat)
{
var sb = new StringBuilder();
sb.Append("Style: ");
var format = ....;
for (int i = 0; i < format.Length; i++)
{
string f = format[i].Trim();
if (f == "name")
sb.Append(Name);
....
else if (f == "shadow") // <=
sb.Append(OutlineWidth); // <=
else if (f == "shadow") // <=
sb.Append(ShadowWidth); // <=
....
}
....
}
Typos in the conditions lead to the appearance of unreachable code branches. Very often such code is a consequence of Copy-Paste programming. In the above example, the second repeated condition will never be executed. And this is the most simple and compact example that I chose from the article. Many similar examples have been found to describe the problem in a separate section.
Here's the entire list of Copy-Paste code that requires fixing:
V3022 CWE-570 Expression 'param.Bitmap.Width == 720 && param.Bitmap.Width == 480' is always false. Probably the '||' operator should be used here. ExportPngXml.cs 1808
private static string FormatFabTime(TimeCode time,
MakeBitmapParameter param)
{
if (param.Bitmap.Width == 720 && param.Bitmap.Width == 480)
return $"....";
// drop frame
if (Math.Abs(param.... - 24 * (999 / 1000)) < 0.01 ||
Math.Abs(param.... - 29 * (999 / 1000)) < 0.01 ||
Math.Abs(param.... - 59 * (999 / 1000)) < 0.01)
return $"....";
return $"....";
}
Confusion with Width and Height is a classic example of a typo. But in this function there is another suspicious thing. All reductions of the strings that I replaced with four dots, are the same strings: {time.Hours:00};{time.Minutes:00};{time.Seconds:00};{SubtitleFormat.MillisecondsToFramesMaxFrameRate(time.Milliseconds):00}. I.e. two conditions do not affect the result of the function, the function always returns the same thing.
V3009 CWE-393 It's odd that this method always returns one and the same value of 'true'. Main.cs 10153
private bool LoadTextSTFromMatroska(
MatroskaTrackInfo matroskaSubtitleInfo,
MatroskaFile matroska,
bool batchMode)
{
....
_fileDateTime = new DateTime();
_converted = true;
if (batchMode)
return true;
SubtitleListview1.Fill(_subtitle, _subtitleAlternate);
if (_subtitle.Paragraphs.Count > 0)
SubtitleListview1.SelectIndexAndEnsureVisible(0);
ShowSource();
return true;
}
A function is found that always returns the true value. Perhaps, it's an error. The value of this function is checked in four places of the program. Also near there are similar functions in code, for example, LoadDvbFromMatroska(), and it returns different values.
V3022 CWE-571 Expression 'listBoxVobFiles.Items.Count > 0' is always true. DvdSubRip.cs 533
private void DvdSubRip_Shown(object sender, EventArgs e)
{
if (string.IsNullOrEmpty(_initialFileName))
return;
if (_initialFileName.EndsWith(".ifo", ....))
{
OpenIfoFile(_initialFileName);
}
else if (_initialFileName.EndsWith(".vob", ....))
{
listBoxVobFiles.Items.Add(_initialFileName);
buttonStartRipping.Enabled = listBoxVobFiles.Items.Count > 0;
}
_initialFileName = null;
}
An element is added in the listBoxVobFiles list and then checked if the list is empty. It is obvious that there will be at least one element. And there are more than thirty checks in the project that are always true or false.
V3005 The 'positionInfo' variable is assigned to itself. WebVTT.cs 79
internal static string GetPositionInfoFromAssTag(Paragraph p)
{
....
if (!string.IsNullOrEmpty(line))
{
if (positionInfo == null)
positionInfo = " line:" + line;
else
positionInfo = positionInfo += " line:" + line;
}
....
}
Choosing between the options of recording "A = A + n" and "A += n", the author of this code chose a compromise variant "A = A += n" :D
To understand how to fix the analyzer warning, one needs to have the understanding of the subtitle formats and features of their processing. So, if there are those who wish to support the project and provide the author of the project on GitHub with Pull Requests with corrections, here is the link to download the PVS-Studio HTML report with warnings of High/Medium levels.
0