One way or another, vibe coding is becoming—and in some cases has already become—part of the software development process. The PVS-Studio team sees this as a new challenge for static code analyzers: finding errors in code caused by the use of AI assistants and similar tools.
There are two schools of thought.
The first one includes directors, HR, and "efficiency managers" who are already planning layoffs in the IT department. They genuinely believe that ChatGPT will soon write all code and developers will no longer be needed. At best, they'll keep a couple of "prompt engineers" and that's it. Why pay a developer when you can simply ask the AI to "make an online store"?
The second school of thought is more difficult to comprehend, as senior developers are laughing so hard, they can't articulate it properly.
(c) Some Internet meme
People have mixed opinions on vibe coding and its variations. I agree that using generated code when the programmer doesn't fully understand it can compromise an application's safety and security. So, static analyzers have even more work on their hands :)
AI-generated code may contain new and unfamiliar types of errors. Thus, to study them and learn how to identify them, it is first necessary to gather a collection.
If you ask AI to generate a sorting function, it will do so without any errors. However, project development involves more than just writing separate functions that solve specific, simple tasks. As the project grows, error density increases due to how complex interactions between different entities become.
My colleague, who has been experimenting with AI, recently sent me a bug report:
"I have a funny story about AI making a mistake. I asked it to implement comparison methods for my Warning
class."
public class Warning
{
    public string ErrorCode { get; set; }
    public string Message { get; set; }
    public uint Level { get; set; }
    public List<Position> Trace { get; set; }
}

public class Position
{
    public int Line { get; set; }
    public string FilePath { get; set; }
}
Here is a fragment of the AI's implementation:
public bool Equals(Warning other)
{
    if (other is null) return false;
    if (ReferenceEquals(this, other)) return true;

    return ErrorCode == other.ErrorCode &&
           Message == other.Message &&
           Level == other.Level &&
           (Trace == other.Trace ||
            (Trace != null && other.Trace != null &&
             Trace.SequenceEqual(other.Trace)));
}
public override int GetHashCode()
{
    unchecked
    {
        int hash = 17;
        hash = hash * 23 + (ErrorCode?.GetHashCode() ?? 0);
        hash = hash * 23 + (Message?.GetHashCode() ?? 0);
        hash = hash * 23 + Level.GetHashCode();
        hash = hash * 23 + (Trace?.GetHashCode() ?? 0);
        return hash;
    }
}
There's an error in this line of the GetHashCode method:

hash = hash * 23 + (Trace?.GetHashCode() ?? 0);

Trace is a List<Position>, a reference type that doesn't override GetHashCode, so its hash code is computed from the object reference and differs for every list instance. Two warnings that Equals considers identical can therefore return different hash codes, which violates the Equals/GetHashCode contract: in hash-based collections, such objects never match, even when all their values are the same. As a result, standard operations that rely on hashing, such as filtering and removing duplicates, didn't work correctly.
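As a sketch of one possible fix (not the colleague's actual solution), GetHashCode can combine the hash codes of the list elements instead of hashing the list reference, mirroring the SequenceEqual call in Equals. Note that this only helps if Position itself overrides Equals and GetHashCode consistently; otherwise the trace elements are still compared and hashed by reference.

public override int GetHashCode()
{
    unchecked
    {
        int hash = 17;
        hash = hash * 23 + (ErrorCode?.GetHashCode() ?? 0);
        hash = hash * 23 + (Message?.GetHashCode() ?? 0);
        hash = hash * 23 + Level.GetHashCode();

        if (Trace != null)
        {
            // Hash the contents of the list, not the list object itself.
            foreach (var position in Trace)
                hash = hash * 23 + (position?.GetHashCode() ?? 0);
        }

        return hash;
    }
}

With this version, two warnings that Equals considers identical also return the same hash code, so operations like Distinct or lookups in a HashSet<Warning> behave as expected.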
This error isn't particularly unusual; a human could easily have written it. And that's hardly a surprise: new types of errors may appear, but the classic ones aren't going anywhere.
Currently, PVS-Studio doesn't have diagnostic rules for such cases. However, we have already created a TODO for implementing them. This is just the beginning, of course, so please send us any other similar cases you have encountered.
I don't think AI-based tools will completely replace classic static analyzers. On the contrary, the demand for them may grow, as they provide a deterministic approach to software quality assurance.
The idea of feeding code generated by one AI system to another for review is appealing. However, there are two major issues.
Firstly, generating code fragments is one thing; sending your project's code to a third party for review is an entirely different matter. Many companies are extremely sensitive about their code leaking. This is where security policies, on-premises development, and so on come from. One option is to use locally deployed AI systems, but that significantly increases development costs.
Secondly, not everyone can tolerate total irresponsibility.
A developer isn't to blame if one AI system generates code with a vulnerability and another one fails to detect it. Vibe coding allows for the possibility that a programmer may not fully understand the principles behind their code, especially when it comes to its security.
Developers of AI systems aren't responsible either: neither for generating vulnerable code nor for failing to detect it. Who knows how the programmer phrased the prompt? And with technology of this scale, what else would you expect...
Nobody's to blame, but the vulnerabilities are there. They'll be fixed, but what's next? What steps should we take to ensure secure software development?
The general answer is to build a secure development process, and one of its key components is static code analysis, specifically of the classic kind.
Classic analyzers do not guarantee the detection of all potential vulnerabilities. However, they operate deterministically!
If the analyzer fails to detect an error or issues a false positive, it does so in a way that is understandable to humans. Diagnostic rules are described in the documentation. Otherwise, you can ask the analyzer developers to explain how things work. You can even suggest enhancements for the tool if needed.
When it comes to AI, nothing is certain, and there's no accountability. One more question is: what data are the systems trained on? Could someone be secretly feeding vulnerable code to the AI as an example?
Does PVS-Studio itself use AI? Not at the moment, but we're considering experimenting in this area. However, it will just be one more technology for detecting bugs. It will be empirical in nature, which will be clearly stated in the documentation for such detectors.
By the way, PVS-Studio already provides empirical detectors. However, they aren't AI-based; they simply use statistics.
We have no intention of using AI to rewrite anything, though. On the contrary, we recognize the value of classic algorithms in the field of vibe coding :) Additional defect detection mechanisms will simply appear.
Experimenting with filtering false positives also makes sense. People often immediately recognize false positives, but it's difficult to code the system to eliminate such cases. I think AI algorithms would work well here.
We invite you to share, in the comments, examples of bugs that arose during vibe coding, while using AI assistants, and so on. You can also send them via the feedback form.
Bugs that appeared during further manual tweaking of such code are equally interesting. Cases where the code works, but could easily be broken by changes, are also intriguing. Static analyzers highlight not only errors but code smells as well.
Thank you all in advance.
This isn't the first time I've compiled a collection of errors. Sometimes, people respond with comments like, "Ugh, they're developing a commercial project at my expense."
Okay, if you cherish these bugs, keep them to yourself :) I believe sharing is mutually beneficial, as programmers will eventually have an additional way to check the quality of their code.
It's a kind of mutual benefit when our team checks open-source projects. We get content to publish. Project developers get a chance to fix a bunch of bugs and start using free PVS-Studio licenses.
The text was written entirely by a human, Andrey Karpov, without the use of AI.