The ShellTorch attack chain exposes the PyTorch Model Server to remote code execution.
Researchers at Oligo Security have disclosed a series of critical vulnerabilities in the PyTorch Model Server, also known as TorchServe.
Dubbed ShellTorch by the researchers, these vulnerabilities are troubling for the artificial intelligence (AI) and machine learning (ML) community because they open the door to remote code execution and potential server takeovers.
PyTorch is a machine learning framework, based on the Torch library, and is well known for its versatile applications encompassing computer vision, natural language processing, and more. Initially created by Meta AI, this influential framework has now found its home at the Linux Foundation. PyTorch is a foundational component in the ever-evolving domain of AI and machine learning technologies.
Massive Impact on High-Profile Organizations
Oligo Security’s research has identified thousands of vulnerable instances of TorchServe publicly exposed on the internet, with some belonging to the world’s largest and most prominent organizations.
This discovery leaves these organizations susceptible to unauthorized access and the insertion of malicious AI models, posing a significant threat to millions of services and their end-users.
PyTorch’s Dominance Draws Attackers
PyTorch, a powerhouse in machine learning research and widely adopted in the AI industry, has attracted the attention of threat actors. Oligo Security’s research reveals that these new critical vulnerabilities allow remote code execution without any authentication, putting PyTorch-based systems at immediate risk.
The TorchServe Ecosystem
TorchServe, a popular model-serving framework for PyTorch, enjoys broad usage across the AI landscape. Maintained by Meta and Amazon, this open-source library sees over 30,000 PyPI downloads monthly and more than a million Docker Hub pulls.
Its commercial users include industry giants like Walmart, Amazon, OpenAI, Tesla, Azure, Google Cloud, Intel, and many more. Furthermore, TorchServe serves as the foundation for projects like Kubeflow, MLflow, and AWS Neuron, and is offered as a managed service by leading cloud providers.
Revealing the Vulnerabilities
Oligo Security’s findings highlight vulnerabilities affecting all TorchServe versions prior to 0.8.2. These vulnerabilities, when exploited in sequence, result in remote code execution, granting attackers full control over victims’ servers and networks and enabling the exfiltration of sensitive data.
The Anatomy of a ShellTorch Attack
To comprehend the gravity of the situation, it’s crucial to understand how these vulnerabilities combine to create the ShellTorch attack:
Vulnerability #1 – Abusing the Management Console: Unauthenticated Management Interface API Misconfiguration
TorchServe exposes a management API that, because of a misconfiguration in the default configuration, is reachable from external hosts without authentication. The setting looks innocuous, but it gives remote attackers their first foothold.
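As a rough illustration of what "reachable from external hosts" means in practice, the sketch below checks whether a bind address resolves to a loopback interface. It assumes the address is written as a URL such as http://127.0.0.1:8081; the function name `is_loopback_only` is hypothetical and is not part of TorchServe.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_only(address: str) -> bool:
    """Return True if the bind address can only be reached from the local host.

    Assumption: `address` is a URL-style bind address such as
    "http://127.0.0.1:8081". Hostnames other than "localhost" are treated
    as not provably local.
    """
    host = urlsplit(address).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, so we cannot prove it is local.
        return False


if __name__ == "__main__":
    for candidate in ("http://127.0.0.1:8081", "http://0.0.0.0:8081"):
        status = "local only" if is_loopback_only(candidate) else "externally reachable"
        print(f"{candidate}: {status}")
```

An address of 0.0.0.0 binds to every network interface, which is why it fails the check here.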
Vulnerability #2 – Malicious Model Injection: Remote Server-Side Request Forgery (SSRF) that Leads to Remote Code Execution – CVE-2023-43654
TorchServe’s default configuration accepts model URLs from any domain, leading to an SSRF vulnerability. Attackers can exploit this to make the server fetch a malicious model from a location they control, resulting in arbitrary code execution.
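The difference between a permissive and a strict URL allowlist can be sketched as follows. This is an illustration of the idea, not TorchServe's own code: it assumes the allowlist is a set of regular expressions that must match the whole URL, and the patterns, the host models.example.com, and the function name `is_url_allowed` are made up for the example.

```python
import re
from typing import Iterable


def is_url_allowed(url: str, allowed_patterns: Iterable[str]) -> bool:
    """Return True if the whole URL matches at least one allowlist pattern."""
    return any(
        re.fullmatch(pattern, url, flags=re.IGNORECASE) is not None
        for pattern in allowed_patterns
    )


# Assumed shape of a wide-open allowlist: any file or HTTP(S) URL passes.
PERMISSIVE = ["file://.*", "http(s)?://.*"]

# Hypothetical strict allowlist: only one trusted model host passes.
STRICT = [r"https://models\.example\.com/.*"]

if __name__ == "__main__":
    attacker_url = "http://attacker.example/evil.mar"
    print("permissive:", is_url_allowed(attacker_url, PERMISSIVE))
    print("strict:", is_url_allowed(attacker_url, STRICT))
```

Anchoring the pattern on the full host name followed by a slash matters: a look-alike host such as models.example.com.evil.net does not match the strict pattern.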
Vulnerability #3 – Exploiting an Insecure Use of Open Source Library: Java Deserialization Remote Code Execution – CVE-2022-1471
A misuse of the SnakeYAML library in TorchServe opens a door for attackers to trigger an unsafe deserialization attack, enabling code execution on the target machine.
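CVE-2022-1471 concerns SnakeYAML constructing arbitrary Java types named in the input; the usual defense is to restrict which types the parser may instantiate. The sketch below shows the same idea by analogy in Python, using the standard library's pickle module rather than SnakeYAML or Java. The allowlist contents and the `Gadget` class are hypothetical stand-ins.

```python
import io
import pickle
from collections import OrderedDict

# Hypothetical allowlist: the only (module, name) pairs the loader may resolve.
ALLOWED_TYPES = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any type outside ALLOWED_TYPES."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_TYPES:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked type: {module}.{name}")


def restricted_loads(data: bytes):
    """Deserialize bytes, constructing only allowlisted types."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


class Gadget:
    """Stand-in for attacker input: asks the loader to call a function on load."""

    def __reduce__(self):
        return (print, ("code ran during deserialization",))


if __name__ == "__main__":
    # Benign data made of allowlisted types round-trips normally.
    restored = restricted_loads(pickle.dumps(OrderedDict(a=1)))
    print(restored)

    # The gadget is rejected before its callable is ever resolved.
    try:
        restricted_loads(pickle.dumps(Gadget()))
    except pickle.UnpicklingError as err:
        print("rejected:", err)
```

With an unrestricted loader, the gadget's callable would run as a side effect of parsing, which is the essence of an unsafe deserialization attack.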
The Result: Total Takeover
These vulnerabilities collectively empower attackers to execute code remotely with high privileges, bypassing authentication. Once inside, attackers can compromise TorchServe servers globally, potentially affecting tens of thousands of IP addresses.
Security Risks in AI: Impacts and Implications
The integration of open-source tools into AI production environments creates a delicate balance between innovation and vulnerability. Oligo’s findings echo concerns raised in the recent OWASP Top 10 for LLM Applications, touching on supply chain vulnerabilities, model theft, and model injection.
Updates by Amazon and Meta
According to Oligo Security’s report, on October 2, 2023, both Amazon and Meta took swift action in response to the ShellTorch vulnerabilities. Amazon proactively issued a security advisory for its users, highlighting the critical nature of the threat.
Simultaneously, Meta acted by promptly addressing the default management API misconfiguration, implementing measures to mitigate this vulnerability within the PyTorch ecosystem.
“It shocked our researchers to discover that – with no authentication whatsoever – we could remotely execute code with high privileges, using new critical vulnerabilities in PyTorch open-source model servers (TorchServe). These vulnerabilities make it possible to compromise servers worldwide. As a result, some of the world’s largest companies might be at immediate risk.”
Oligo Security
Mitigation: Protecting Against ShellTorch Attacks
To safeguard TorchServe systems from ShellTorch attacks, three key steps are essential:
- Update to Version 0.8.2 or Above: This update only adds a warning about the SSRF vulnerability, so it is a critical first step in mitigating the risk rather than a complete fix.
- Configure the Management Console: Adjust the configuration to ensure that the management console is accessible only from trusted sources, preventing remote access by attackers.
- Control Model Fetching: Restrict TorchServe to fetching models from trusted domains only, preventing malicious model injections.
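The three steps above can be sketched as a small helper that checks a version string and renders the two relevant settings for TorchServe's config.properties file. This is a minimal sketch under stated assumptions: it uses the `management_address` and `allowed_urls` keys with the management API on port 8081, the host models.example.com is a placeholder, the function names are hypothetical, and version strings are assumed to be plain dotted numbers.

```python
from typing import Iterable, Tuple

FIRST_PATCHED = (0, 8, 2)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn "0.8.2" into (0, 8, 2). Assumes plain dotted numeric versions."""
    return tuple(int(part) for part in version.split("."))


def is_patched(version: str) -> bool:
    """Step 1: True if the version is 0.8.2 or above."""
    return parse_version(version) >= FIRST_PATCHED


def host_pattern(host: str) -> str:
    """Build a regex matching HTTPS URLs on exactly this host.

    Dots are written as [.] instead of a backslash escape, because a
    backslash is itself an escape character in Java-style properties files.
    """
    return "https://" + host.replace(".", "[.]") + "/.*"


def render_config(trusted_hosts: Iterable[str]) -> str:
    """Steps 2 and 3: local-only management API and a model URL allowlist."""
    lines = [
        "management_address=http://127.0.0.1:8081",
        "allowed_urls=" + ",".join(host_pattern(h) for h in trusted_hosts),
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print("patched:", is_patched("0.8.1"))
    print(render_config(["models.example.com"]), end="")
```

Binding the management API to the loopback address assumes administration happens from the same host; deployments that need remote management would put it behind an authenticated proxy or a private network instead.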
In conclusion, the discovery of these vulnerabilities highlights cybersecurity threats in the AI and machine learning sector. As the industry continues to grow at a rapid pace, organizations must remain vigilant and proactive in addressing potential security risks in their AI infrastructure.