Gaslight macOS Malware Uses Prompt Injection to Evade AI Analysis

A newly identified Rust-based macOS malware, dubbed Gaslight, combines information stealing with techniques for evading detection, particularly by AI-assisted analysis tools. Its most unusual trait is the use of prompt injection to deceive those tools.

Gaslight’s command-and-control (C2) channel runs over the Telegram bot API, which gives operators an interactive shell for issuing commands and receiving their output. The malware supports several commands, including:

  • help: Displays command assistance.
  • id: Identifies the implant to the operator.
  • shell: Executes shell commands.
  • kill: Terminates processes by PID.
  • upload: Exfiltrates files via Telegram’s attachment mechanism.
  • stop: Halts the implant’s execution.

Gaslight achieves persistence by installing a LaunchAgent whose .plist file carries the label “com.apple.system.services.activity”, a name that resembles a legitimate Apple service. Because LaunchAgents are loaded when the user logs in, the malware comes back after every reboot.
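
For defenders, the persistence label is a straightforward indicator to hunt for. The following is a minimal Python sketch, not a vetted detection tool: the function name find_launch_agents is hypothetical, and it assumes the label is stored under the standard Label key of the property list.

```python
import plistlib
from pathlib import Path

# Label reported for Gaslight's LaunchAgent.
SUSPECT_LABEL = "com.apple.system.services.activity"


def find_launch_agents(directory, label=SUSPECT_LABEL):
    """Return the .plist files in `directory` whose Label equals `label`."""
    hits = []
    for path in sorted(Path(directory).glob("*.plist")):
        try:
            with path.open("rb") as fh:
                data = plistlib.load(fh)
        except Exception:
            # Unreadable or malformed plist: skip it rather than abort the scan.
            continue
        if isinstance(data, dict) and data.get("Label") == label:
            hits.append(path)
    return hits
```

On a Mac, calling find_launch_agents(Path.home() / "Library" / "LaunchAgents") would check the per-user LaunchAgents folder. A match on the label alone is only a lead, since a different build could use a different name.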

The malware also includes a Base64-encoded Python script designed to collect extensive system information. This script gathers data such as terminal command histories, installed applications, running processes, system profiles, Keychain databases, and browser data from Chrome, Brave, Firefox, and Safari. The collected information is compressed into a ZIP archive and uploaded via Telegram.

To run the Python script, Gaslight uses a separate Base64-encoded bash installer that drops a Python interpreter from the “astral-sh/python-build-standalone” project. The installer contains emojis and extensive comments, which suggests it was generated with a large language model (LLM).
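
Because both the stealer and its installer are stored as Base64 text, an analyst can often surface them from the strings extracted from a sample. The Python sketch below illustrates the idea under stated assumptions: the 40-character threshold, the marker list, and the function name find_embedded_scripts are all illustrative choices, not details taken from Gaslight.

```python
import base64
import binascii
import re

# Runs of 40+ Base64 characters with optional padding (threshold is arbitrary).
B64_RUN = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")

# Illustrative hints that decoded text is a script; adjust for your own triage.
SCRIPT_MARKERS = ("#!/bin/bash", "#!/usr/bin/env", "import ", "python-build-standalone")


def find_embedded_scripts(text):
    """Decode Base64-looking runs in `text` and return those that look like scripts."""
    results = []
    for match in B64_RUN.finditer(text):
        try:
            decoded = base64.b64decode(match.group(0), validate=True)
            snippet = decoded.decode("utf-8")
        except (binascii.Error, ValueError):
            # Not valid Base64, or not UTF-8 text (UnicodeDecodeError is a ValueError).
            continue
        if any(marker in snippet for marker in SCRIPT_MARKERS):
            results.append(snippet)
    return results
```

This only catches blobs that are stored as one contiguous, correctly padded run. Split or further obfuscated payloads would need a different approach.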

One of Gaslight’s most distinctive features is its use of prompt injection to evade AI-based detection systems. It embeds a Markdown-fenced block containing 38 fabricated system messages designed to mislead security agents into aborting or refusing analysis. The messages include fake system errors, warnings about injection vulnerabilities, and static-analysis flags. The target is not the analyst’s tooling but the way an AI-assisted triage pipeline interprets what it reads in the sample.
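
One way a triage pipeline can blunt this kind of attack is to treat fenced blocks found in a sample as untrusted data and keep them out of the text handed to the model. The Python sketch below shows that idea only; the function name quarantine_fenced_blocks and the placeholder wording are hypothetical, and it is not a complete defense against prompt injection.

```python
import re

TICKS = "`" * 3  # a Markdown code-fence marker, built here to keep this listing readable

# A fence line, the block body, then a closing fence line.
FENCE = re.compile(
    r"^" + TICKS + r"[^\n]*\n(.*?)^" + TICKS + r"[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def quarantine_fenced_blocks(text):
    """Replace fenced blocks with placeholders; return (sanitized_text, blocks)."""
    blocks = []

    def _replace(match):
        blocks.append(match.group(1))
        return f"[untrusted fenced block {len(blocks)} withheld]"

    sanitized = FENCE.sub(_replace, text)
    return sanitized, blocks
```

The withheld blocks are returned separately so a human analyst, or a model given strict instructions to treat them as quoted data, can still review them. Injection text that is not fenced would pass through untouched, so this complements rather than replaces other safeguards.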

Gaslight also avoids hard-coding critical details such as the bot token and chat ID in the sample. These values are supplied at runtime, and the malware redacts its own Telegram bot token in its runtime output, so anyone who captures logs or crash artifacts cannot recover it.
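
In practice this means a missing token should not be read as a missing Telegram channel. The Python sketch below summarizes Telegram indicators in captured output. It is a rough illustration: the token pattern reflects the commonly seen "digits, colon, long alphanumeric string" shape and is an assumption, and the function name summarize_telegram_indicators is hypothetical.

```python
import re

TELEGRAM_HOST = "api.telegram.org"

# Assumed token shape: a numeric bot ID, a colon, then a long URL-safe string.
TOKEN_SHAPE = re.compile(r"\d{6,12}:[A-Za-z0-9_-]{30,}")


def summarize_telegram_indicators(text):
    """Report whether `text` references the Bot API host and how many token-like strings it holds."""
    return {
        "host_referenced": TELEGRAM_HOST in text,
        # Report a count rather than the values so the summary never repeats a secret.
        "token_like_strings": len(TOKEN_SHAPE.findall(text)),
    }
```

Output that references the host but contains no token-like string matches the redaction behavior described above, and is still worth flagging as possible Telegram-based C2.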

Gaslight shows how threat actors, particularly those aligned with North Korea, are building malware that both steals information and uses prompt injection to actively evade detection. Security teams that rely on AI-assisted analysis will need to adapt their detection and triage methods to account for samples that target the analysis tooling itself.