Emerging Threat: ‘Man-in-the-Prompt’ Attacks Compromise AI Tools

A recent cybersecurity analysis has unveiled a critical vulnerability affecting widely-used AI platforms such as ChatGPT, Google Gemini, and other generative AI tools. This newly identified threat, termed the “Man-in-the-Prompt” attack, enables malicious browser extensions to exploit the Document Object Model (DOM) to inject prompts, exfiltrate sensitive data, and manipulate AI responses without necessitating special permissions.

Understanding the Vulnerability

The core of this vulnerability lies in the integration of generative AI tools with web browsers through DOM manipulation. When users engage with large language model (LLM)-based assistants, the prompt input fields become accessible to any browser extension possessing basic scripting capabilities. This architectural flaw allows attackers to perform prompt injection attacks by modifying user inputs or embedding hidden instructions directly into the AI interface. Consequently, a “man-in-the-prompt” scenario emerges, where adversaries can read and write AI prompts undetected.

Scope of the Threat

The implications of this vulnerability are extensive, affecting billions of users across major platforms. Notably, ChatGPT, with its 5 billion monthly visits, and Google Gemini, boasting 400 million users, are particularly susceptible. Research indicates that 99% of enterprise users have at least one browser extension installed, with 53% maintaining more than ten extensions. This widespread use of extensions amplifies the risk, as even those requiring no special permissions can access commercial LLMs, including ChatGPT, Gemini, Copilot, Claude, and Deepseek.

Demonstrated Exploits

Two significant proof-of-concept attacks underscore the severity of this vulnerability:

1. ChatGPT Exploit: A compromised browser extension, operating through a command-and-control server, was used to open background tabs, inject prompts into ChatGPT, exfiltrate the AI’s responses to external logs, and delete chat history to conceal its activities. This sophisticated attack chain operates entirely within the user’s session boundaries, making detection exceedingly challenging.

2. Google Gemini Exploit: The attack targeted Google Gemini’s integration with Google Workspace, which provides access to emails, documents, contacts, and shared folders. The vulnerability allowed extensions to inject queries even when the Gemini sidebar was closed, enabling attackers to extract confidential corporate data at scale.

These demonstrations highlight the potential for significant data breaches and underscore the need for immediate attention to this security flaw.

Mitigation Strategies

Internal LLMs are particularly vulnerable due to their access to proprietary organizational data, including intellectual property, legal documents, financial forecasts, and regulated records. Unlike public models, internal copilots often lack robust security measures against adversarial input, operating under the assumption of trusted usage within corporate networks. This false sense of security poses significant risks, including intellectual property leakage, regulatory violations under GDPR and HIPAA, and erosion of organizational trust in AI tools.

To effectively mitigate this threat, organizations must transition from application-level controls to browser behavior inspection. Key strategies include:

– Monitoring DOM Interactions: Implementing systems to detect and analyze interactions within the DOM of AI tools can help identify and prevent unauthorized prompt injections.

– Behavioral Risk Assessment for Extensions: Moving beyond static permission analysis to assess the behavior of browser extensions can aid in identifying potentially malicious activities.

– Real-Time Browser-Layer Protection: Establishing mechanisms to prevent prompt tampering through real-time monitoring and intervention at the browser level can enhance security.

Traditional URL-based blocking is insufficient for internal tools hosted on whitelisted domains, emphasizing the need for comprehensive browser extension sandboxing and dynamic risk assessment capabilities.

Conclusion

The “Man-in-the-Prompt” attack represents a significant and evolving threat to the security of generative AI tools. As organizations increasingly rely on these platforms for various applications, understanding and mitigating such vulnerabilities is paramount. By adopting proactive security measures and fostering a culture of vigilance, enterprises can safeguard their data and maintain trust in AI technologies.