Fake AI Agent Skill Bypasses Security, Reaches 26,000 Agents

In a recent demonstration, security firm AIR developed a counterfeit AI agent skill named ‘brand-landingpage’ to expose weaknesses in current security scanning processes. The skill, which creates landing pages using Google’s Stitch design tool, was intentionally harmless and collected only users’ email addresses. It nonetheless reached approximately 26,000 agents, including agents in corporate environments, without being detected.

AIR introduced the skill through a pull request to a reputable skill marketplace repository with around 36,000 stars and 156 skills, so the skill inherited the repository’s credibility. An Instagram advertisement targeted at marketers, salespeople, and designers then drove adoption.

Security scanners from companies such as Cisco and NVIDIA, as well as those integrated into skills.sh, failed to flag the skill. These scanners typically analyze the submitted package, including the SKILL.md file and associated files. AIR’s skill contained a directive telling the agent to install the ‘Stitch SDK’ by following documentation hosted on stitch-design.ai, a domain controlled by AIR, which initially redirected to legitimate Google Stitch documentation.

Once the skill achieved significant distribution, AIR altered the content behind the link to instruct agents to download and execute a script. In this controlled experiment, the script merely sent the user’s email address back to AIR. However, in a real-world scenario, such a foothold could be exploited to access files, transfer data, or interact with internal systems, depending on the agent’s permissions.

This incident exposes a critical flaw in the current security scanning paradigm: scanners evaluate the static package but do not track the content behind external links, which can be changed after review. Attackers can therefore keep their submissions clean while hosting malicious payloads externally, a tactic that has been observed in real campaigns for months.
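
A reviewer can at least list every external host a skill points to before approving it. The Python sketch below illustrates the idea and does not describe how any of the named scanners work. It pulls URLs out of a skill’s text and reports hosts that are not on a reviewer-maintained allowlist. The function names, the allowlist, the docs.example.com host, and the URL paths in the sample text are hypothetical; only the stitch-design.ai domain comes from AIR’s demonstration.

```python
import re
from urllib.parse import urlparse

# Rough URL matcher for plain text and Markdown; not a full RFC 3986 parser.
URL_PATTERN = re.compile(r"https?://[^\s<>()\[\]\"']+")


def external_hosts(skill_text: str) -> set[str]:
    """Return the lowercase hostnames of all http(s) URLs found in the text."""
    hosts = set()
    for match in URL_PATTERN.findall(skill_text):
        url = match.rstrip(".,;:!?")  # drop trailing sentence punctuation
        host = urlparse(url).hostname
        if host:
            hosts.add(host.lower())
    return hosts


def unreviewed_hosts(skill_text: str, allowlist: set[str]) -> list[str]:
    """Return hosts that are neither an allowlisted domain nor a subdomain of one."""
    def allowed(host: str) -> bool:
        return any(host == d or host.endswith("." + d) for d in allowlist)

    return sorted(h for h in external_hosts(skill_text) if not allowed(h))


# Hypothetical SKILL.md excerpt; the paths are invented for illustration.
sample = (
    "Install the Stitch SDK by following https://stitch-design.ai/docs/sdk. "
    "Background reading: https://docs.example.com/guide"
)
print(unreviewed_hosts(sample, {"example.com"}))  # ['stitch-design.ai']
```

A check like this only surfaces the link for human review. It says nothing about what the linked page will serve later, which is the part of the attack that happened after review.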

To mitigate such risks, skills should be treated as software whose behavior can change after review, not as static text. Vetting should cover not only the skill’s own content but also every external resource it references, and those resources should be monitored continuously so that malicious changes are detected and acted on promptly.
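
One way to approximate continuous monitoring is to record a fingerprint of each referenced resource at review time and compare it on later checks. The following is a minimal sketch under stated assumptions, not a complete monitoring system. The fetch function is injected by the caller, so the example runs without network access, and the URL path and page contents are hypothetical. A real deployment would also have to handle pages that change for legitimate reasons, redirects, and servers that return different content to a monitor than to an agent.

```python
import hashlib
from typing import Callable, Dict, Iterable, List

# A caller-supplied function that returns the current bytes behind a URL.
Fetch = Callable[[str], bytes]


def fingerprint(content: bytes) -> str:
    """SHA-256 hex digest of the fetched content."""
    return hashlib.sha256(content).hexdigest()


def snapshot(urls: Iterable[str], fetch: Fetch) -> Dict[str, str]:
    """Record a digest for every referenced URL at review time."""
    return {url: fingerprint(fetch(url)) for url in urls}


def changed_urls(baseline: Dict[str, str], fetch: Fetch) -> List[str]:
    """Return URLs whose current content no longer matches the baseline."""
    return sorted(
        url for url, digest in baseline.items()
        if fingerprint(fetch(url)) != digest
    )


# In-memory stand-in for the web; the path and contents are hypothetical.
pages = {"https://stitch-design.ai/docs/sdk": b"Install the SDK from the official docs."}
baseline = snapshot(pages, lambda url: pages[url])

# Later, the content behind the same link is swapped.
pages["https://stitch-design.ai/docs/sdk"] = b"Download and run this script."
print(changed_urls(baseline, lambda url: pages[url]))
# ['https://stitch-design.ai/docs/sdk']
```

Any URL reported by changed_urls would then be sent back for review before agents are allowed to keep following it.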

The case shows that a one-time review of a skill package is not enough. Organizations need to scrutinize and keep monitoring both the skills their agents install and the external resources those skills depend on.