Exploiting ChatGPT’s Summarization Feature: The ChatGPhish Attack
A newly identified vulnerability in ChatGPT, termed ChatGPhish, enables attackers to transform any web page into a phishing platform by exploiting the AI’s page summarization feature. This technique allows malicious actors to embed deceptive links, counterfeit security alerts, and QR codes directly within the trusted ChatGPT interface, posing significant security risks to users.
Understanding the ChatGPhish Attack
Researchers at Permiso have unveiled the ChatGPhish attack, which builds upon previous trust-transfer exploits observed in AI systems like Microsoft Copilot. In these instances, attackers manipulated AI-generated summaries through Cross Prompt Injection Attacks (XPIA). ChatGPhish extends this concept by targeting the browser environment, where users frequently request ChatGPT to summarize content from various web pages, including GitHub repositories, documentation sites, blogs, and SaaS dashboards.
By appending a concise instruction payload to any publicly accessible web page, an unauthenticated attacker can influence how ChatGPT structures and presents its summarization output. This manipulation is possible because ChatGPT’s response renderer trusts Markdown links and image URLs from third-party content, leading to several attack vectors:
1. User Interface Redress/Phishing: Attacker-controlled Markdown links appear as live, clickable elements within the ChatGPT interface without origin labeling, making it challenging for users to distinguish between legitimate and malicious URLs.
2. Spoofed System Alerts: The renderer displays attacker-supplied text styled as authentic account security notifications, leveraging the visual trust associated with ChatGPT’s interface.
3. QR Code Exploitation: Auto-rendered QR code images fetched from attacker-controlled servers bypass desktop URL defenses, as the malicious destination becomes apparent only after scanning the code on a secondary device.
4. Passive Tracking Beacons: Markdown images embedded via URL shorteners are auto-fetched upon rendering, leaking the victim’s IP address, User-Agent, Referer header, and precise timing information to attacker-controlled infrastructure.
The Implications of ChatGPhish
The ChatGPhish attack is particularly concerning due to its ability to inject malicious content that appears indistinguishable from genuine ChatGPT responses. As highlighted in the OWASP LLM01:2025 guidelines, prompt injection poses a significant risk because Large Language Models (LLMs) struggle to differentiate between legitimate instructions and attacker-supplied content embedded in retrieved data. Once processed, this malicious content surfaces within the ChatGPT response window, styled identically to authentic assistant output, complete with formatted alerts, clickable links, and inline images.
Traditional web security measures, such as the browser’s same-origin policy, offer no protection against this attack. The AI assistant operates within the user’s authenticated context, rendering conventional web security boundaries ineffective.
Discovery and Response
Permiso submitted the initial vulnerability report to OpenAI via Bugcrowd on April 29, 2026, citing Untrusted Markdown Rendering Leads to XSS, Phishing, and Data Exfiltration. OpenAI initially responded that the report could not be reproduced. A revised submission on May 1, 2026, with expanded proof-of-concept steps, was subsequently classified as a duplicate.
Recommendations for Users
Given the potential risks associated with the ChatGPhish vulnerability, users are advised to exercise caution when using ChatGPT’s summarization feature, especially with content from untrusted sources. Until OpenAI addresses this issue, users should remain vigilant for unexpected links, alerts, or QR codes within ChatGPT responses and avoid interacting with them unless their authenticity can be verified.