Hackers Use Hidden Website Instructions in New Attacks on AI Assistants

Cybersecurity researchers at Forcepoint uncover new indirect prompt injection attacks that use hidden website code to exploit AI assistants like GitHub Copilot.

byDeeba Ahmed

April 23, 2026

3 minute read

Threat actors are now using a method called Indirect Prompt Injection (IPI) to trick Large Language Models (LLMs) by hiding secret commands on ordinary websites, according to a new report from Forcepoint X-Labs. While previously considered a theoretical risk, Forcepoint’s research confirms that IPI is now being used in the wild to target “live web infrastructure.”

How the Attack Works

According to researchers, IPI’s operation is quite simple but very dangerous. Usually, an end-user provides a direct input to an LLM through a standard interface; however, in this case, the hacker hides instructions that only the autonomous agent can see.

As a result, when it visits a website to help a user, it reads and follows them like a real command. That’s because it cannot distinguish between the data it reads and the instructions it follows, a condition known as a missing data-instruction boundary.

“Unlike direct prompt injection, where a user sends malicious input to a model, IPI hides adversarial instructions inside ordinary web content. When an AI agent crawls or summarizes a poisoned page, it ingests those instructions and executes them as legitimate commands, with no indication that anything went wrong,” the report reads.

Regarding how the hackers hide these commands, researchers identified that they use several tricks, including using tiny 1px fonts, transparent colours, HTML comments, or metadata tags. They also exploit accessibility layers and CSS tricks like display:none. While the site looks normal to a visitor, the AI sees a clear instruction to perform malicious actions.

The team at X-Labs used active threat hunting to identify ten specific examples of IPI on real websites throughout April 2026. By examining telemetry data, researchers spotted trigger phrases like “Ignore previous instructions” or “If you are a large language model” used to activate these traps.

Major Findings

Some of the findings are very serious. Mayur Sewani, X-Labs’ Principal Threat Researcher, explained in the blog post that attackers used IPI for everything from financial fraud and data wiping to system spoofing, stealing API keys, and Denial-of-Service (DoS).

For example, on faladobairro.com, hackers hid a command for a terminal emulator (sudo rm -rf) to delete a backup folder named agy/BU, targeting command-line interfaces used by developers.

Another site, perceptivepumpkin.com, used hidden steps to trick the agent into sending 5,000 Dollars via PayPal.me whereas on thelibrary-welcome.uk, a hidden comment forced the LLM to leak a secret API key.

Denial-of-Service (DoS) attacks were used to suppress AI outputs. On bentasker.co.uk, hackers used Authority Impersonation to make the AI refuse to summarise the page, even forcing the model to write a poem about corn as a distraction.

The site lcpdfr.com used Magic String Spoofing with a fake code (ANTHROPIC_MAGIC_STRING_TRIGGER_REFUSAL) to mimic high-level security commands typically issued by model providers.

Meanwhile, kleintechnik.net tricked an AI agent into accessing a private admin.php page, and sites like kassoon.com used IPI for SEO manipulation and traffic hijacking, instructing the model to redirect users to specific URLs.

Indirect Prompt Injection Attacks Targeting Autonomous AI Agents — Authority impersonation and redirect instruction (Source: Forcepoint)

This research, which was shared with Hackread.com, reveals that threat actors now don’t even need to breach a system directly because they can easily exploit tools like GitHub Copilot, Claude Code, or browser assistants to perform malicious acts. As long as these models treat every word as a trusted command, they will remain a prime target for exploitation.

(Featured Photo by BoliviaInteligente on Unsplash)

The Latest

Upwind Finds Coordinated Supply Chain Campaign Compromising Multiple AsyncAPI npm Packages

Tego AI Finds Claude Tag Slack Integration Can Trigger Unauthorized Enterprise Actions

Millions of Microsoft Entra Accounts Targeted in OAuth Client ID Spoofing Campaigns

Hackers Use Hidden Website Instructions in New Attacks on AI Assistants

How the Attack Works

Major Findings

Deeba Ahmed

Tego AI Finds Claude Tag Slack Integration Can Trigger Unauthorized Enterprise Actions

Greenhat Announces Successful Delegation at Web Summit Vancouver 2026

Torq and Criminal IP Partner to Deliver Decision-Ready Threat Intelligence for Autonomous SOC Operations

FastNetMon Launches Netomics, a Self-Hosted Routing Intelligence Platform

OpenMatter Network Joins HOL Initiative to Help Define Standards for Verifiable AI Collaboration and Security

Hackers Use Hidden Website Instructions in New Attacks on AI Assistants

How the Attack Works

Major Findings

Related Posts