GitHub Copilot Chat Vulnerability Exposes Sensitive Data from Private Repositories

A recent investigation by Legit Security has uncovered a significant vulnerability within GitHub’s Copilot Chat AI assistant, leading to the unintended exposure of sensitive information from private repositories. This flaw not only compromised confidential data but also allowed unauthorized manipulation of Copilot’s responses.

Understanding the Vulnerability

GitHub Copilot Chat is designed to assist developers by providing code explanations and suggestions. It includes a feature that enables users to hide content within rendered Markdown using HTML comments. While these hidden comments trigger standard pull request notifications to repository owners without displaying the concealed content, they inadvertently inject prompts into other users’ contexts. This oversight permitted attackers to influence Copilot into suggesting malicious code to unsuspecting users.

Exploitation Techniques

Omer Mayraz of Legit Security identified that by combining a Content Security Policy (CSP) bypass with remote prompt injection, it was possible to extract sensitive data, such as AWS keys and undisclosed vulnerabilities, from private repositories. Furthermore, attackers could craft prompts instructing Copilot to access users’ private repositories, encode their contents, and append them to a URL. When a user clicked on this URL, their data would be exfiltrated back to the attacker.

Challenges in Data Exfiltration

GitHub’s stringent CSP is designed to prevent data leakage by blocking the fetching of images and other content from non-GitHub domains. This security measure complicates attempts to exfiltrate data by injecting HTML `` tags into a victim’s chat. To circumvent this, GitHub employs Camo, an open-source project that generates anonymous URL proxies for external images included in README or Markdown files. Camo rewrites external URLs to proxy URLs and fetches the original content only if the URL is signed by GitHub, thereby preventing unauthorized data exfiltration.

Bypassing GitHub’s Protections

To exploit this system, Mayraz created a dictionary mapping all letters and symbols to their corresponding Camo URLs, embedding this dictionary into the injected prompt. He also set up a web server to respond with a 1×1 transparent pixel to each request. By constructing the prompt to trigger the vulnerability, he demonstrated how sensitive content from repositories could be leaked.

Proof of Concept and GitHub’s Response

Mayraz published proof-of-concept videos showcasing the exfiltration of zero-day vulnerabilities and AWS keys from private repositories. Upon being notified of the issue on August 14, GitHub addressed the vulnerability by disallowing the use of Camo for leaking sensitive user information.

Implications for Developers

This incident underscores the critical importance of robust security measures in AI-assisted development tools. Developers are advised to remain vigilant, regularly review their repositories for unauthorized changes, and stay informed about potential vulnerabilities in the tools they use.