The idea for this article came to me after a discussion in a Telegram chat. Someone posted a program for changing a file's MD5 hash. Another participant checked the program with Virustotal and got 2 suspicious verdicts (and 68 clean ones). On that basis, he accused the program of having malicious functionality (even of stealing account passwords), and accused everyone who installed it of missing some brain cells. We tried to reason with him and explain that false positives may occur here, but failed. The conversation stopped being constructive and ended.
We published and translated this article with the copyright holder's permission. The author is Stariy. The article was originally published on Habr.
Figure 1. Virustotal
However, as a participant in that conversation, I could not stop thinking about the issue. On the one hand, if an antivirus flags something, we have no reason to dismiss it and should check it out. On the other hand, the engines that flagged the file are not among the most popular ones, so perhaps there is nothing to worry about. But the most important question is this: if there had been 0 detections, would we be just as sure about the program's safety? What should we do in that case? I also wondered how one changes the MD5 hash: by adding extra bytes (the most obvious way), or by doing something smarter?
So, I decided to check it and describe my thoughts and actions in this article. Perhaps someone will find it useful. I don't claim to be an expert; we'll just poke around.
So, I have the MD5_Hash_Changer.exe file, and I suspect that something may be going on inside it. First, let's inspect it with PEiD:
Figure 2. PEiD
The C#/.NET entry indicates that the program is written in C#. In such cases, one can often work with the code without a disassembler. So I download JetBrains dotPeek, a free tool that recovers C# code from an exe file (assuming, of course, that the program really is written in C#). Then I open the inspected file in dotPeek:
Figure 3. Inspecting the program in dotPeek
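As a side note, PEiD's verdict can be cross-checked in a few lines of code. The sketch below is my own illustration, not part of the inspected program: it uses PEReader from System.Reflection.PortableExecutable (built into modern .NET, a NuGet package on .NET Framework) to ask whether a file carries .NET metadata. The ManagedCheck class and its method names are hypothetical.

```csharp
using System;
using System.IO;
using System.Reflection.PortableExecutable;

public static class ManagedCheck
{
    // Returns true when the image contains CLI (.NET) metadata,
    // which is what makes decompilers such as dotPeek applicable.
    public static bool IsManagedAssembly(Stream peImage)
    {
        try
        {
            // LeaveOpen: the caller owns the stream, so we do not dispose it here.
            using (var reader = new PEReader(peImage, PEStreamOptions.LeaveOpen))
            {
                return reader.HasMetadata;
            }
        }
        catch (BadImageFormatException)
        {
            // The data is not a readable PE/COFF image at all.
            return false;
        }
    }

    // Convenience overload for a file on disk.
    public static bool IsManagedAssembly(string path)
    {
        using (FileStream stream = File.OpenRead(path))
        {
            return IsManagedAssembly(stream);
        }
    }
}
```

Calling ManagedCheck.IsManagedAssembly("MD5_Hash_Changer.exe") should agree with what PEiD shows. A result of false would mean a decompiler like dotPeek won't help and a disassembler is needed.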
First, let's look at the Metadata section and inspect the strings the program uses, which may contain interesting names, paths, IP addresses, and so on.
Figure 4. String resources in dotPeek
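The same kind of string triage can be roughly approximated without dotPeek. Below is a minimal sketch, assuming that the string literals of a .NET program are stored as UTF-16 text inside the file. It pulls out runs of printable characters and keeps those that look like URLs, IPv4 addresses, or Windows paths. The StringScan class, the regular expression, and the minimum length of 4 are my own choices for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

public static class StringScan
{
    // Illustrative patterns only: URL scheme, dotted IPv4, drive-letter path.
    private static readonly Regex Interesting = new Regex(
        @"https?://|\b\d{1,3}(\.\d{1,3}){3}\b|[A-Za-z]:\\",
        RegexOptions.IgnoreCase);

    // Extracts runs of printable ASCII characters encoded as UTF-16LE.
    public static List<string> Utf16Strings(byte[] data, int minLength = 4)
    {
        var found = new List<string>();
        var current = new StringBuilder();
        int i = 0;
        while (i + 1 < data.Length)
        {
            bool printable = data[i] >= 0x20 && data[i] < 0x7F && data[i + 1] == 0;
            if (printable)
            {
                current.Append((char)data[i]);
                i += 2;                      // one UTF-16 code unit consumed
            }
            else
            {
                if (current.Length >= minLength) found.Add(current.ToString());
                current.Clear();
                i += 1;                      // resynchronize at the next byte
            }
        }
        if (current.Length >= minLength) found.Add(current.ToString());
        return found;
    }

    // Keeps only the strings that match one of the patterns above.
    public static List<string> InterestingStrings(byte[] data)
    {
        var result = new List<string>();
        foreach (string s in Utf16Strings(data))
            if (Interesting.IsMatch(s)) result.Add(s);
        return result;
    }
}
```

It would be used as StringScan.InterestingStrings(File.ReadAllBytes("MD5_Hash_Changer.exe")). An empty result proves nothing, since strings can be built at run time or obfuscated, but a hit tells you where to look first.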
If necessary, I can immediately see where exactly an interesting string is used, if there is one. In my case, there was nothing suspicious, and I moved on to the section with code. As it turned out, the source code of the program contains two classes — Program and MainForm. The Program class is quite standard and contains only the code that launches the main application window:
using System;
using System.Windows.Forms;

namespace MD5_Hash_Changer {
  internal static class Program {
    [STAThread]
    private static void Main() {
      Application.EnableVisualStyles();
      Application.SetCompatibleTextRenderingDefault(false);
      Application.Run((Form) new MainForm());
    }
  }
}
The MainForm class is much larger — let's inspect it in more detail:
Figure 5. Application code
Apparently, when the form starts, the InitializeComponent() function is called as well. However, there's nothing interesting in it: the usual interface setup, fonts, button names, and other routine work. I went through the entire code and found no hints of network activity or attempts to access files the program has no business touching. Everything is very transparent and naive. Since I found no malicious code, I decided to at least look at the algorithm to understand how the program changes files.
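Reading the code by hand works for a program this small. For a larger one, a quick automated pass can help: list the types the assembly references and see whether anything from networking or registry namespaces shows up. The sketch below is an illustration under my own assumptions. The ReferenceScan class and the watched prefixes are hypothetical and far from exhaustive, and reflection or native calls can hide dependencies from such a listing.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection.Metadata;
using System.Reflection.PortableExecutable;

public static class ReferenceScan
{
    // Namespaces picked for illustration; a tool like this program has no
    // obvious need for networking, registry access, or code generation.
    private static readonly string[] WatchedPrefixes =
    {
        "System.Net", "Microsoft.Win32", "System.Reflection.Emit"
    };

    // True if ns equals a watched namespace or is nested inside one.
    public static bool IsWatched(string ns)
    {
        foreach (string prefix in WatchedPrefixes)
        {
            if (ns == prefix || ns.StartsWith(prefix + ".", StringComparison.Ordinal))
                return true;
        }
        return false;
    }

    // Lists the full names of all types the assembly references.
    public static SortedSet<string> ReferencedTypes(string assemblyPath)
    {
        var result = new SortedSet<string>(StringComparer.Ordinal);
        using (FileStream stream = File.OpenRead(assemblyPath))
        using (var pe = new PEReader(stream))
        {
            MetadataReader md = pe.GetMetadataReader();
            foreach (TypeReferenceHandle handle in md.TypeReferences)
            {
                TypeReference type = md.GetTypeReference(handle);
                result.Add(md.GetString(type.Namespace) + "." + md.GetString(type.Name));
            }
        }
        return result;
    }

    // Referenced types that fall into one of the watched namespaces.
    public static List<string> WatchedReferences(string assemblyPath)
    {
        var hits = new List<string>();
        foreach (string fullName in ReferencedTypes(assemblyPath))
        {
            int lastDot = fullName.LastIndexOf('.');
            string ns = lastDot > 0 ? fullName.Substring(0, lastDot) : "";
            if (IsWatched(ns)) hits.Add(fullName);
        }
        return hits;
    }
}
```

For a program like the one inspected here, I would expect ReferenceScan.WatchedReferences to come back empty, which would be consistent with the manual review. That is an expectation, not a result I'm reporting.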
The function below does the actual file modification:
private void changeMD5(string[] fileNames) {
  Random random = new Random();
  Thread.Sleep(1000);
  this.Invoke((Delegate) (() => this.btnStartMD5.Enabled = true));
  for (int i = 0; i < fileNames.Length; ++i) {
    if (!this.running) {
      this.Invoke((Delegate) (() => {
        this.btnStartMD5.Text = "Start Change MD5";
        this.running = false;
      }));
      break;
    }
    int length1 = random.Next(2, 7);
    byte[] buffer = new byte[length1];
    for (int index = 0; index < length1; ++index)
      buffer[index] = (byte) 0;
    long length2 = new FileInfo(fileNames[i]).Length;
    if (length2 == 0L) {
      this.Invoke(
        (Delegate) (() => this.dgvMD5.Rows[i].Cells[3].Value = (object) "Empty")
      );
    }
    else {
      using (FileStream fileStream = new FileStream(fileNames[i],
                                                    FileMode.Append))
        fileStream.Write(buffer, 0, buffer.Length);
      int bufferSize = length2 > 1048576L ? 1048576 : 4096;
      string md5hash = "";
      using (MD5 md5 = MD5.Create()) {
        using (FileStream inputStream = new FileStream(fileNames[i],
                                                       FileMode.Open,
                                                       FileAccess.Read,
                                                       FileShare.Read,
                                                       bufferSize))
          md5hash = BitConverter.ToString(md5.ComputeHash((Stream) inputStream))
                                .Replace("-", "");
      }
      this.Invoke((Delegate) (() => {
        if (this.dgvMD5.Rows[i].Cells[2].Value.ToString() != "")
          this.dgvMD5.Rows[i].Cells[1].Value =
            this.dgvMD5.Rows[i].Cells[2].Value;
        this.labelItem.Text = (i + 1).ToString();
        this.progressBarStatus.Value = i + 1;
        this.dgvMD5.Rows[i].Cells[2].Value = (object) md5hash;
        this.dgvMD5.Rows[i].Cells[3].Value = (object) "OK";
      }));
    }
  }
  this.Invoke((Delegate) (() => {
    this.btnStartMD5.Text = "Start Change MD5";
    this.running = false;
  }));
}
As input, the function receives a list of files to process and iterates over them in a loop. For each file, it creates a buffer of random length (from 2 to 6 bytes, since the upper bound of random.Next(2, 7) is exclusive) and fills it with zeros:
int length1 = random.Next(2, 7);
byte[] buffer = new byte[length1];
for (int index = 0; index < length1; ++index)
  buffer[index] = (byte) 0;
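One detail is easy to misread here: Random.Next(minValue, maxValue) excludes maxValue, so random.Next(2, 7) yields lengths from 2 to 6. The small sketch below samples the call many times and reports the smallest and largest values seen. The PaddingRange class and the fixed seed are my own additions for illustration.

```csharp
using System;

public static class PaddingRange
{
    // Draws `samples` lengths the same way the decompiled code does
    // and returns the smallest and largest values observed.
    public static (int Min, int Max) Observe(int samples, int seed = 12345)
    {
        var random = new Random(seed);
        int min = int.MaxValue;
        int max = int.MinValue;
        for (int n = 0; n < samples; ++n)
        {
            int length = random.Next(2, 7); // upper bound 7 is exclusive
            min = Math.Min(min, length);
            max = Math.Max(max, length);
        }
        return (min, max);
    }
}
```

Whatever the sample count, Max never exceeds 6. Also, new byte[n] in C# is already zero-initialized, so the explicit zeroing loop in the decompiled code is redundant.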
Then this buffer is written to the end of the file:
using (FileStream fileStream = new FileStream(fileNames[i],
                                              FileMode.Append))
  fileStream.Write(buffer, 0, buffer.Length);
Then the MD5 hash is calculated again, but this time for the modified file:
using (FileStream inputStream = new FileStream(fileNames[i],
                                               FileMode.Open,
                                               FileAccess.Read,
                                               FileShare.Read,
                                               bufferSize))
  md5hash = BitConverter.ToString(md5.ComputeHash((Stream) inputStream))
                        .Replace("-", "");
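To see why a few trailing zero bytes are enough, here is a minimal in-memory sketch. It hashes a byte array, appends zeros, and hashes it again, formatting the result the same way as the decompiled code. The HashDemo class and its method names are hypothetical; only MD5.Create, ComputeHash, and BitConverter.ToString come from the original listing.

```csharp
using System;
using System.Security.Cryptography;

public static class HashDemo
{
    // MD5 of the data as an uppercase hex string without dashes.
    public static string Md5Hex(byte[] data)
    {
        using (MD5 md5 = MD5.Create())
        {
            return BitConverter.ToString(md5.ComputeHash(data)).Replace("-", "");
        }
    }

    // Returns a copy of the data followed by `count` zero bytes.
    public static byte[] AppendZeros(byte[] data, int count)
    {
        var result = new byte[data.Length + count]; // tail is already zero
        Array.Copy(data, result, data.Length);
        return result;
    }
}
```

For example, comparing HashDemo.Md5Hex(data) with HashDemo.Md5Hex(HashDemo.AppendZeros(data, 3)) gives two different strings for any input I would expect to meet in practice, even though the visible content is unchanged. Any change to the bytes, including padding, produces a different digest.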
That's it. Nothing else interesting happens here. As you can see, the program is trivial: it does change files, but only by padding them with zero bytes. It's up to you to decide whether this program may be useful for you.
And finally, let's check what's happening with these files. Let's take the second picture from this article (named Figure 1) and run the program on it. Then let's compare the file before the processing with the file after it.
Figure 6. Comparing files
First, the processed file's size increased by 6 bytes. Second, the screenshot shows that 6 zero bytes appeared at the end of the file. This matches the algorithm I expected to see after studying the source code.
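The comparison from the screenshot can also be automated. The sketch below checks that the processed file is exactly the original followed by zero bytes and nothing else. The PaddingCheck class and the file names in the usage note are hypothetical.

```csharp
using System;
using System.IO;

public static class PaddingCheck
{
    // Returns the number of appended zero bytes if `modified` is exactly
    // `original` followed by zeros; returns -1 for any other difference.
    public static int AppendedZeroCount(byte[] original, byte[] modified)
    {
        if (modified.Length < original.Length) return -1;

        // The original content must survive untouched as a prefix.
        for (int i = 0; i < original.Length; ++i)
            if (original[i] != modified[i]) return -1;

        // Everything after the prefix must be zero.
        for (int i = original.Length; i < modified.Length; ++i)
            if (modified[i] != 0) return -1;

        return modified.Length - original.Length;
    }

    // Convenience overload for two files on disk.
    public static int AppendedZeroCount(string originalPath, string modifiedPath)
    {
        return AppendedZeroCount(File.ReadAllBytes(originalPath),
                                 File.ReadAllBytes(modifiedPath));
    }
}
```

With a saved copy of the picture, PaddingCheck.AppendedZeroCount("figure1_before.png", "figure1_after.png") should return a value between 2 and 6 if the program did only what the decompiled code says. A result of -1 would mean the file was changed in some other way and deserves a closer look.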
In the end, I should note that the check I described can't make us 100% sure that the code contains no malicious functionality. There are ways to implement such functionality in an exe file at a lower level. That's why I urge you to analyze network traffic after launching the program in a sandbox and to inspect the executed code thoroughly, though this may require specific skills and expertise. The approach shown here, however, is within reach of even an inexperienced user who is far from reverse engineering.