New Google Gemini Vulnerability Exploited via Prompt Injections from Messaging Apps
Researchers have disclosed a significant security vulnerability in Google’s Gemini voice assistant that exposes users to indirect prompt injection (IPI) attacks. The flaw lets malicious actors hijack the AI assistant by embedding harmful commands in notifications from widely used messaging platforms such as WhatsApp, Slack, Signal, SMS, Instagram, and Messenger.
The research, led by Or Yair, Security Research Team Lead at SafeBreach, builds on earlier findings that showed Gemini could be exploited through manipulated Google Calendar invitations. The new attack vector significantly broadens the attack surface, because any application capable of triggering device notifications can serve as a conduit for malicious payloads.
Mechanism of the Exploit
The core of this vulnerability lies in Gemini’s Android Utilities agent, specifically the tool responsible for reading incoming notifications. Because this tool processes untrusted data from third-party applications, it is susceptible to manipulation. Attackers can craft messages containing embedded commands that, once processed by Gemini, are executed without the user’s knowledge or consent.
For instance, an attacker might send a message with a hidden command that, when read by Gemini, prompts the assistant to perform unauthorized actions. This could range from sending messages on behalf of the user to accessing sensitive information stored on the device.
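To make the mechanism concrete, here is a minimal sketch of how attacker-controlled notification text can end up in the same context as the user's request. It does not reflect Gemini's actual implementation. The `Notification` class, the `build_context` function, and the placeholder payload are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Notification:
    app: str
    sender: str
    text: str  # attacker-controlled for any incoming message


def build_context(user_request: str, notifications: list[Notification]) -> str:
    """Naively concatenate notification text into the assistant's context.

    Hypothetical helper: nothing here separates trusted instructions
    (the user's request) from untrusted data (message bodies).
    """
    lines = [f"User request: {user_request}", "Notifications:"]
    for n in notifications:
        lines.append(f"- [{n.app}] {n.sender}: {n.text}")
    return "\n".join(lines)


inbox = [
    Notification("WhatsApp", "Alice", "Lunch at 1?"),
    # Placeholder payload; a real attack would embed an instruction here.
    Notification("SMS", "Unknown", "Hi! <hypothetical injected instruction>"),
]

context = build_context("Read my notifications", inbox)
print(context)
```

In this sketch the injected text reaches the model verbatim alongside the legitimate request, which is the condition an indirect prompt injection relies on.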
Bypassing Existing Defenses
In response to earlier vulnerabilities, Google implemented measures to block chained tool invocations and delayed tool execution. However, researchers at SafeBreach have developed a novel technique called Fake Context Alignment to circumvent these defenses. This method deceives both the user and Gemini’s security mechanisms by presenting a legitimate authorization scenario while executing malicious commands in the background.
Two specific techniques were demonstrated:
1. Obfuscated Fake Context Alignment: This approach involves appending a malicious authorization question in a foreign language immediately followed by a harmless question in the user’s language. The user’s affirmative response to the benign question inadvertently authorizes the hidden command.
2. Muted Fake Context Alignment: In this method, the malicious question is embedded as clickable link text that Gemini’s text-to-speech engine skips over. The user hears only the benign prompt and unknowingly authorizes the execution of the hidden command by responding affirmatively.
By combining these techniques, researchers were able to bypass Google’s latest security measures with high reliability and minimal user awareness.
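The muted variant depends on a gap between what is displayed and what is spoken. The sketch below models that gap under two assumptions that the research summary does not confirm: that links use a Markdown-style `[text](url)` syntax, and that the speech pass drops link text entirely. The `spoken_text` function and the sample response are hypothetical.

```python
import re

# Assumed Markdown-style link syntax: [link text](url)
LINK = re.compile(r"\[([^\]]*)\]\([^)]*\)")


def spoken_text(response: str) -> str:
    """Model a text-to-speech pass that skips links entirely (assumption)."""
    without_links = LINK.sub("", response)
    # Collapse the whitespace left behind by the removed links.
    return " ".join(without_links.split())


response = (
    "[Should I run the hidden action?](https://example.com) "
    "Would you like me to read the next message?"
)

heard = spoken_text(response)
print("Full response:", response)
print("User hears:   ", heard)
```

Here the full response contains two questions, but the listener hears only the benign one, so a spoken "yes" can be interpreted as answering both.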
Potential Exploits and Implications
The implications of this vulnerability are far-reaching, particularly with the proliferation of smart home devices. Attackers could exploit this flaw to remotely control connected appliances such as windows, boilers, and lighting systems via Google Home. More alarmingly, they could initiate covert video streaming by forcing applications like Zoom to launch and stream the device’s camera feed without the user’s consent.
Additionally, this vulnerability opens the door for large-scale social engineering attacks. By fabricating messages from trusted contacts, attackers can deceive users into divulging sensitive information or performing actions that compromise their security.
Mitigation and Recommendations
In light of these findings, users should exercise caution when interacting with notifications from messaging apps, even those that appear to come from trusted sources. Regularly updating applications and the device’s operating system can help mitigate potential risks. Users should also watch for unusual behavior from their AI assistants and report suspicious activity to the relevant authorities.
Google is expected to address this vulnerability in upcoming updates. In the meantime, users are advised to disable Gemini’s notification reading capabilities or limit its access to sensitive applications to reduce the risk of exploitation.