Google Cloud Vertex AI Vulnerability Exposes Models to Hijacking

A recently disclosed vulnerability in Google Cloud’s Vertex AI platform could have allowed attackers to hijack machine learning model uploads and execute malicious code within victim environments. The issue, identified by Unit 42 researchers, affects specific versions of the Vertex AI Python SDK and arises from predictable cloud storage bucket naming combined with inadequate ownership validation.

Vertex AI is a comprehensive platform for building and deploying machine learning models. When developers upload models using the SDK, artifacts are temporarily staged in a Google Cloud Storage (GCS) bucket before deployment. The vulnerability occurs when users do not specify a staging bucket, prompting the SDK to generate one using a predictable naming pattern. Critically, the SDK verifies only the existence of the bucket, not its ownership, creating an opportunity for exploitation.
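The difference between checking that a bucket exists and checking who owns it can be illustrated with a small in-memory simulation. This is a minimal sketch, not the SDK's code: the `Bucket` class, the `namespace` dictionary, both `resolve_*` functions, and above all the naming pattern in `default_staging_name` are hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bucket:
    name: str
    owner_project: str


def default_staging_name(project: str, region: str) -> str:
    # Hypothetical pattern for illustration only; not the SDK's real format.
    return f"{project}-staging-{region}"


def resolve_existence_only(namespace: dict, project: str, region: str) -> Bucket:
    """Flawed logic: reuse any bucket with the expected name, whoever owns it."""
    name = default_staging_name(project, region)
    if name not in namespace:
        namespace[name] = Bucket(name, owner_project=project)
    return namespace[name]


def resolve_with_owner_check(namespace: dict, project: str, region: str) -> Bucket:
    """Safer logic: refuse a pre-existing bucket owned by another project."""
    bucket = resolve_existence_only(namespace, project, region)
    if bucket.owner_project != project:
        raise PermissionError(f"{bucket.name} belongs to {bucket.owner_project}")
    return bucket


# Bucket names share one global namespace, so an attacker can claim the name first.
namespace = {}
squatted = default_staging_name("victim-proj", "us-central1")
namespace[squatted] = Bucket(squatted, owner_project="attacker-proj")

staged = resolve_existence_only(namespace, "victim-proj", "us-central1")
print(staged.owner_project)  # attacker-proj
```

With the existence-only check, the victim's upload silently targets the attacker's bucket. Calling `resolve_with_owner_check` with the same arguments raises `PermissionError` instead.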

This flaw enables a technique known as “bucket squatting,” where an attacker pre-creates the expected bucket name within their own project. Consequently, the victim’s model artifacts are inadvertently uploaded to the attacker’s infrastructure. The exploitation method, termed “Pickle in the Middle” by researchers, leverages Python’s pickle deserialization to achieve code execution.

The attack unfolds in several stages:

  • The attacker predicts the victim’s default bucket name and creates it in their own project with permissive access controls.
  • When the victim uploads a model, the SDK unknowingly sends artifacts to the attacker’s bucket.
  • A malicious cloud function detects the upload and replaces the model file within milliseconds.
  • The poisoned model is later deployed by Vertex AI infrastructure.
  • During model loading, pickle deserialization executes attacker-controlled code.

The swap must land within a narrow race window of approximately 2.5 seconds, before Google’s service agent consumes the uploaded model.
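The final stage of the attack works because unpickling can call arbitrary callables, not because of anything specific to Vertex AI. The harmless sketch below shows the mechanism using only the standard library, along with the "restricting globals" pattern described in the Python `pickle` documentation. The `Payload` class, `NoGlobalsUnpickler`, and `safe_loads` are illustrative names, and the payload only reads the local hostname.

```python
import io
import pickle
import platform


class Payload:
    """Stand-in for a tampered model object (deliberately harmless)."""

    def __reduce__(self):
        # pickle stores "call this callable with these arguments"
        # and replays that call when the data is loaded.
        return (platform.node, ())


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses every global lookup, so no callable can be resolved."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def safe_loads(data: bytes):
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


blob = pickle.dumps(Payload())

# Loading runs platform.node() on the loader's machine; the result is
# whatever the callable returned, not a Payload instance.
result = pickle.loads(blob)
print(type(result).__name__)  # str

try:
    safe_loads(blob)
except pickle.UnpicklingError as exc:
    print(exc)  # blocked global: platform.node
```

Blocking all globals is too strict for real model files, which need some classes to load. In practice an allow-list of expected classes, or a serialization format that does not execute code, would be the more workable choice.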

Successful exploitation enables full remote code execution inside Vertex AI serving environments. In proof-of-concept testing, the researchers were able to:

  • Extract service account tokens from the metadata server.
  • Access other models stored in the same tenant environment.
  • Enumerate BigQuery datasets and permissions.
  • Gather internal infrastructure details from cloud logs.

Notably, the compromised credentials carried broad cloud-platform scope, significantly increasing the potential impact of the attack.

According to Unit 42 researchers at Palo Alto Networks, the vulnerability stems from the SDK’s staging logic in the `gcs_utils.py` module, where bucket names are generated predictably and validated only for existence, without verifying ownership. This design flaw allowed cross-project resource abuse, effectively breaking isolation boundaries between different cloud projects.

In response to the disclosure, Google has released patches to address the issue. Users are strongly advised to update their Vertex AI Python SDK to the latest version and to specify custom, securely configured staging buckets to mitigate potential risks.
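One way to enforce the second recommendation is a small pre-flight check that refuses to proceed unless a staging bucket has been named explicitly. The sketch below is an assumption-laden illustration: `explicit_staging_config` is a hypothetical helper, and the project and bucket names are placeholders. The resulting keyword arguments are meant for `aiplatform.init`, which accepts `project`, `location`, and `staging_bucket`.

```python
def explicit_staging_config(project: str, location: str, staging_bucket: str) -> dict:
    """Build init arguments, rejecting a missing or malformed staging bucket URI."""
    prefix = "gs://"
    if not staging_bucket or not staging_bucket.startswith(prefix):
        raise ValueError("staging_bucket must be an explicit gs:// URI")
    # The bucket name is the first path component after the scheme.
    bucket_name = staging_bucket[len(prefix):].split("/", 1)[0]
    if not bucket_name:
        raise ValueError("staging_bucket is missing a bucket name")
    return {
        "project": project,
        "location": location,
        "staging_bucket": staging_bucket,
    }


# Placeholder values; substitute a bucket that your own project owns.
config = explicit_staging_config(
    "example-project", "us-central1", "gs://example-project-model-staging"
)
print(config["staging_bucket"])  # gs://example-project-model-staging

# With the google-cloud-aiplatform package installed, the config could be used as:
#   from google.cloud import aiplatform
#   aiplatform.init(**config)
```

The check only guarantees that the SDK never falls back to a generated default name. Whether the named bucket is locked down is a separate matter for IAM configuration.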

This incident underscores the importance of rigorous security practices in cloud-based AI development. Organizations should implement strict access controls, regularly audit cloud resources, and ensure that all components of their AI pipelines are configured securely to prevent similar vulnerabilities.