AI Coding Agents Vulnerable to ‘Comment and Control’ Prompt Injection Attacks

Exploiting AI Coding Agents: The Rise of ‘Comment and Control’ Prompt Injection Attacks

In the rapidly evolving landscape of software development, the integration of artificial intelligence (AI) into coding workflows has introduced both efficiencies and vulnerabilities. A recent class of security flaws, termed Comment and Control, has emerged, exploiting AI coding agents through malicious inputs embedded in GitHub pull request titles, issue bodies, and comments. This technique enables attackers to hijack AI agents and exfiltrate sensitive information, such as API keys and access tokens, directly from Continuous Integration/Continuous Deployment (CI/CD) environments.

Understanding ‘Comment and Control’ Attacks

The Comment and Control attack methodology is a sophisticated form of prompt injection that manipulates AI agents by embedding harmful instructions within GitHub’s collaborative features. Unlike traditional prompt injections that require user interaction, this approach is proactive. By simply opening a pull request (PR) or submitting an issue, the attack can be initiated without any direct engagement from the victim. This is due to GitHub Actions workflows that automatically trigger on events like `pull_request`, `issues`, and `issue_comment`, allowing the AI agent to process the malicious content autonomously.

Case Studies of Vulnerable AI Agents

1. Claude Code Security Review

Anthropic’s Claude Code Security Review action was found to be susceptible to this form of attack. The PR title is directly incorporated into the agent’s prompt without proper sanitization. Since the Claude Command Line Interface (CLI) is executed without restrictions on tools, it inherits all environment variables, including sensitive ones like `ANTHROPIC_API_KEY` and `GITHUB_TOKEN`. An attacker can craft a PR title that escapes the prompt context and instructs Claude to execute commands such as `whoami`, `ps auxeww`, or `env`, resulting in a comprehensive credential dump posted as a JSON security finding in the PR comments. This vulnerability was rated as Critical with a CVSS score of 9.4 by Anthropic and was reported via HackerOne (#3387969), earning a $100 bounty.

2. Gemini CLI Action

Google’s Gemini CLI Action processes issue titles, bodies, and comments within the agent’s prompt context under an Additional Content section. By injecting a counterfeit Trusted Content Section immediately after, an attacker can override Gemini’s safety instructions, causing the AI to publicly post the `GEMINI_API_KEY` as an issue comment, accessible to anyone. This vulnerability was reported to Google’s Vulnerability Rewards Program (#1609699) by researcher Neil Fendley and collaborators from Johns Hopkins University, resulting in a $1,337 bounty.

3. GitHub Copilot Agent

The GitHub Copilot Agent presented a more complex challenge due to its three-layer defense system comprising environment variable filtering, secret scanning, and network firewall protections. However, attackers managed to bypass these defenses:

– Environment Variable Filtering Bypass: The `UU()` function filters sensitive variables only from the child bash subprocess. By running `ps auxeww` from that subprocess, attackers can read `/proc/[pid]/environ` of the unfiltered parent Node.js process and Model Context Protocol (MCP) server, recovering credentials such as `GITHUB_TOKEN` and `GITHUB_COPILOT_API_TOKEN`.

– Secret Scanning Bypass: GitHub’s scanner detects token prefixes like `ghs_` or `ghu_`. By base64-encoding the output before committing, attackers can evade pattern matching.

– Firewall Bypass: The encoded credential file is exfiltrated via a standard `git push` to a PR, a whitelisted operation, making it indistinguishable from normal Copilot workflow activity.

To further conceal the attack, the payload is hidden within an HTML comment in the issue body, rendering it invisible in GitHub’s rendered Markdown view but fully parsed by the AI agent.

Implications and Recommendations

The emergence of Comment and Control attacks underscores the critical need for robust input validation and sanitization in AI-integrated development environments. Organizations utilizing AI coding agents should implement the following measures:

– Input Sanitization: Ensure that all user-generated content, such as PR titles and issue comments, is thoroughly sanitized before being processed by AI agents.

– Restrict AI Capabilities: Limit the tools and commands that AI agents can execute, preventing unauthorized actions.

– Environment Variable Management: Avoid exposing sensitive environment variables to AI agents unless absolutely necessary.

– Monitor AI Outputs: Treat all AI-generated outputs as untrusted and subject them to rigorous validation before execution.

– Update and Patch: Regularly update AI tools and apply patches to address known vulnerabilities promptly.

By adopting these practices, developers and organizations can mitigate the risks associated with prompt injection attacks and safeguard their CI/CD pipelines from potential exploitation.