New “Slopsquatting” Threat Emerges from AI-Generated Code Hallucinations

AI code tools often hallucinate fake packages, creating a new threat called slopsquatting that attackers can exploit in public code repositories, a new study finds.

A new study by researchers from the University of Texas at San Antonio, the University of Oklahoma, and Virginia Tech has shown that AI tools designed to write computer code frequently make up software package names, a problem called “package hallucinations.”

These hallucinations produce convincing-sounding but non-existent package names that can mislead developers into believing they are real, potentially prompting them to search for the fictitious packages on public code repositories.

This could allow attackers to upload malicious packages with those same hallucinated names to popular code repositories, where unsuspecting developers will assume they’re legitimate and incorporate them into their projects.

This new attack vector, dubbed slopsquatting, is similar to traditional typosquatting, except that instead of relying on subtle misspellings, it exploits AI-generated hallucinations to trick developers.
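A practical first line of defence for developers is to verify that an AI-suggested dependency actually exists on the registry before installing it. Below is a minimal sketch, assuming Python and the `requests` library; the package name checked is a placeholder invented for this example, not one taken from the study.

```python
import requests

def exists_on_pypi(package_name: str) -> bool:
    """Return True if a package with this name is published on PyPI.

    PyPI's public JSON endpoint returns 404 for names that have never
    been registered.
    """
    resp = requests.get(f"https://pypi.org/pypi/{package_name}/json", timeout=10)
    return resp.status_code == 200

# "flask-quickauth" is a hypothetical placeholder for the kind of
# convincing-sounding name an LLM might hallucinate.
suggested = "flask-quickauth"
if not exists_on_pypi(suggested):
    print(f"'{suggested}' is not on PyPI - treat the suggestion as a hallucination.")
```

Note that existence alone is not proof of legitimacy: in a slopsquatting scenario the attacker has already registered the hallucinated name, so provenance signals such as maintainers, release history, and download counts still need checking.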

The researchers systematically examined package hallucinations in code-generating Large Language Models (LLMs), the type of artificial intelligence that generates human-like text and code, covering both commercial and open-source models, and found that a significant share of the suggested packages are fictitious.

The researchers analysed 16 widely used code-generating LLMs against two prompt datasets to gauge the scope of the package hallucination problem, generating 576,000 code samples in Python and JavaScript. According to the research, shared exclusively with Hackread.com, “package hallucinations were found to be a pervasive phenomenon across all 16 models tested.”

The issue was prevalent across both commercial and open-source models, though commercial LLMs like GPT-4 hallucinated less often. “GPT series models were found to be 4 times less likely to generate hallucinated packages compared to open-source models,” the researchers noted (PDF).

Another observation was that the way LLMs are configured can influence the rate of hallucinations. Specifically, lower temperature settings in LLMs reduce hallucination rates, while higher temperatures dramatically increase them.
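Temperature is the sampling parameter that controls how random a model’s output is. As a minimal sketch of where it is set, assuming the OpenAI Python SDK (the model name and prompt are purely illustrative):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model choice
    messages=[
        {"role": "user", "content": "Write a Python script that parses RSS feeds."}
    ],
    # Lower values make sampling more deterministic; the study links
    # higher temperatures to markedly more hallucinated package names.
    temperature=0.2,
)
print(response.choices[0].message.content)
```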

What’s even more concerning is that LLMs tend to repeat the same invented package names because “58% of the time, a hallucinated package is repeated more than once in 10 iterations,” the research indicates. This means the problem isn’t just random errors but a consistent behaviour, making it easier for hackers to exploit.
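That consistency can be measured by re-running the same prompt and counting which names recur, roughly as the study’s 10-iteration test does. The sketch below is a simplified illustration; the `suggest_packages` callable stands in for whatever function queries the model and extracts package names from its answer.

```python
from collections import Counter
from typing import Callable, Iterable

def measure_repetition(
    suggest_packages: Callable[[str], Iterable[str]],
    prompt: str,
    iterations: int = 10,
) -> Counter:
    """Count how often each suggested package name recurs across repeated runs."""
    counts = Counter()
    for _ in range(iterations):
        counts.update(suggest_packages(prompt))
    return counts

# Names that recur across runs but do not exist on the registry are the
# consistently hallucinated ones, and the easiest for an attacker to squat on.
```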

Furthermore, it was discovered that LLMs are more likely to hallucinate when prompted with recent topics or packages and generally struggle to identify their own hallucinations.

Screenshot showing the exploitation of a package hallucination (Credit: arxiv)

The researchers describe package hallucinations as a novel form of package confusion attack and argue that code-generating LLMs should take a more “conservative” approach to suggesting packages, sticking to a smaller set of well-known and reliable ones.

These findings highlight the importance of addressing package hallucinations to enhance the reliability and security of AI-assisted software development. Researchers have developed several strategies to reduce package hallucinations in code-generating LLMs.

These include Retrieval Augmented Generation (RAG), self-refinement, and fine-tuning. The team also underscores its commitment to open science by making its source code, datasets, and generated code publicly available, withholding only the master list of hallucinated package names and detailed test results.
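The paper’s mitigation code is not reproduced here, but the grounding idea behind RAG and the “conservative” suggestion approach can be sketched as constraining the model to a vetted dependency list. The allowlist, prompt wording, and model name below are assumptions made for illustration only.

```python
from openai import OpenAI

# Hypothetical allowlist; in practice this would come from an internal
# registry, a curated dependency policy, or retrieved package metadata.
APPROVED_PACKAGES = ["requests", "flask", "sqlalchemy", "pydantic", "celery"]

client = OpenAI()

def grounded_completion(task: str) -> str:
    """Ask the model for code while restricting it to approved dependencies."""
    system_msg = (
        "You are a coding assistant. Only import or recommend packages from "
        f"this approved list: {', '.join(APPROVED_PACKAGES)}. If the task cannot "
        "be done with these packages, say so instead of inventing a new one."
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[
            {"role": "system", "content": system_msg},
            {"role": "user", "content": task},
        ],
        temperature=0.2,
    )
    return response.choices[0].message.content
```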

Casey Ellis, Founder of Bugcrowd, commented on the rise of AI-assisted development, noting that while it boosts speed, it often lacks the matching rise in quality and security. He warned that over-trusting LLM outputs and rushing development can lead to issues like slopsquatting, where speed trumps caution. “Developers aim to make things work, not necessarily to prevent what shouldn’t happen,” Ellis said, adding that this misalignment, amplified by AI, naturally leads to these types of vulnerabilities.

Deeba is a veteran cybersecurity reporter at Hackread.com with over a decade of experience covering cybercrime, vulnerabilities, and security events. Her expertise and in-depth analysis make her a key contributor to the platform’s trusted coverage.