AWS Execution Role Vulnerabilities Expose Privilege Escalation Risks in EC2, SageMaker

Recent analyses have shed light on a persistent privilege escalation technique in Amazon Web Services (AWS): an attacker with limited permissions can execute code under a higher-privileged execution role on EC2 instances and SageMaker notebook instances.

Understanding the Privilege Escalation Technique

First documented by security researcher Daniel Grzelak in 2016, the technique exploits boot-time configuration that remains modifiable after launch to inject a malicious payload, sidestepping Identity and Access Management (IAM) controls such as `iam:PassRole`. Despite advances in cloud security, the pattern persists across AWS services, which points to an ongoing risk in cloud compute environments.

Exploitation in EC2 Instances

Attackers with permissions such as `ec2:StartInstances`, `ec2:StopInstances`, and `ec2:ModifyInstanceAttribute` can target existing EC2 instances attached to powerful instance profiles. After stopping the instance, they modify its `userData` attribute. Because ordinary user data scripts run only on first boot, the payload uses a `#cloud-boothook` directive, which cloud-init executes on every boot. Restarting the instance then runs the injected code, such as credential exfiltration, in the context of the instance’s role, granting access to its full permissions.

This technique remains viable today, because AWS still allows `userData` to be modified after launch on a stopped instance. CloudTrail logs reveal the attack through sequences like `StopInstances` → `ModifyInstanceAttribute` → `StartInstances` from unexpected principals.
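
The call sequence above can be searched for mechanically. The following is a minimal sketch, not a production detector: it assumes CloudTrail records have already been flattened into dicts with the hypothetical keys `time`, `principal`, `event_name`, and `resource_id` (raw CloudTrail records are nested differently), and it only correlates calls made by the same principal against the same instance within a time window.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# The EC2 call chain described above, in the order an attacker must issue it.
EC2_CHAIN = ("StopInstances", "ModifyInstanceAttribute", "StartInstances")


def find_chains(events, chain=EC2_CHAIN, window=timedelta(hours=1)):
    """Return (principal, resource_id, start_time) for each completed chain.

    `events` is a list of dicts with the hypothetical, pre-flattened keys
    "time" (datetime), "principal", "event_name" and "resource_id".
    """
    # (principal, resource) -> (index of next expected step, time the chain began)
    progress = defaultdict(lambda: (0, None))
    hits = []
    for ev in sorted(events, key=lambda e: e["time"]):
        key = (ev["principal"], ev["resource_id"])
        step, started = progress[key]
        # Forget a partial chain once it is older than the window.
        if started is not None and ev["time"] - started > window:
            step, started = 0, None
        if ev["event_name"] == chain[step]:
            if step == 0:
                started = ev["time"]
            step += 1
            if step == len(chain):
                hits.append((key[0], key[1], started))
                step, started = 0, None
        elif ev["event_name"] == chain[0]:
            # A fresh stop call restarts the chain from its first step.
            step, started = 1, ev["time"]
        progress[key] = (step, started)
    return hits
```

Passing a different `chain` tuple, for example the SageMaker stop, update, and start calls, reuses the same logic. An attacker who splits the calls across several principals would evade this sketch, so keying on the resource alone may be a better fit in some environments.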

Privilege Escalation in SageMaker Notebook Instances

Amazon SageMaker notebook instances, which run managed Jupyter environments, offer a parallel vector through lifecycle configurations: shell scripts executed when a notebook instance is created or started. The permissions `sagemaker:StopNotebookInstance`, `sagemaker:UpdateNotebookInstance` (to set the `LifecycleConfigName` parameter, `--lifecycle-config-name` in the AWS CLI), and `sagemaker:StartNotebookInstance` enable the escalation: stop the notebook, attach a malicious lifecycle configuration containing base64-encoded credential-stealing code, then restart it. Creating a new configuration rather than attaching an existing one additionally requires `sagemaker:CreateNotebookInstanceLifecycleConfig`.

Grzelak provided proof-of-concept bash code demonstrating the full chain, from config creation to exfiltration via a callback endpoint. SageMaker’s breadth, spanning notebook instances, domains, and Studio, amplifies exposure, because execution roles often carry broad data science permissions such as S3 access or model deployment.
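
On the defensive side, because lifecycle scripts are stored base64-encoded, reviewing them means decoding them first. The sketch below is an illustration only and is not Grzelak's proof of concept. It assumes the input dict mirrors the shape returned by SageMaker's `DescribeNotebookInstanceLifecycleConfig` call (`OnCreate` and `OnStart` lists of `{"Content": ...}` entries), and the `SUSPICIOUS` patterns are hypothetical heuristics that would need tuning.

```python
import base64
import re

# Illustrative indicators only (assumptions, not an authoritative list).
SUSPICIOUS = {
    "metadata endpoint": re.compile(r"169\.254\.169\.254"),
    "outbound transfer": re.compile(r"\b(curl|wget|nc)\b"),
    "credential access": re.compile(
        r"AWS_(ACCESS_KEY_ID|SECRET_ACCESS_KEY|SESSION_TOKEN)|aws\s+sts\b"
    ),
}


def review_lifecycle_config(config):
    """Decode each hook script and return {hook: [indicator names found]}.

    `config` is assumed to look like {"OnCreate": [{"Content": <base64>}],
    "OnStart": [{"Content": <base64>}]}.
    """
    findings = {}
    for hook in ("OnCreate", "OnStart"):
        for entry in config.get(hook) or []:
            script = base64.b64decode(entry["Content"]).decode("utf-8", errors="replace")
            matched = [name for name, rx in SUSPICIOUS.items() if rx.search(script)]
            if matched:
                findings.setdefault(hook, []).extend(matched)
    return findings
```

Many legitimate lifecycle scripts also call `curl` or `wget`, so a match is a prompt for human review of the decoded script, not proof of compromise.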

Root Cause and Broader Implications

The root cause is that `iam:PassRole` is checked only when a role is assigned to a resource (typically at creation), not when the code or configuration that runs under that role is later changed. Role assignment is therefore decoupled from runtime code changes. Similar patterns affect Lambda (via `UpdateFunctionCode`), CloudFormation change sets, and potentially SageMaker Studio. Attackers can systematically hunt for AWS API actions that modify resources with an attached execution role, enabling widespread exploitation.
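
Defenders can run the same hunt against their own IAM policies. The sketch below flags identity policy statements that allow one of these code-changing actions on all resources. It is a deliberately simplified check: the `RISKY_ACTIONS` list is an assumption drawn from the services named in this article, and the function ignores `Deny` statements, `NotAction`, conditions, and resource-scoped grants.

```python
from fnmatch import fnmatchcase

# Actions that change what runs under an already-attached role.
# The list is illustrative and incomplete; the two lifecycle-config
# actions are included as closely related assumptions.
RISKY_ACTIONS = [
    "ec2:ModifyInstanceAttribute",
    "sagemaker:UpdateNotebookInstance",
    "sagemaker:CreateNotebookInstanceLifecycleConfig",
    "sagemaker:UpdateNotebookInstanceLifecycleConfig",
    "lambda:UpdateFunctionCode",
]


def _as_list(value):
    """IAM allows a single value or a list; normalise to a list."""
    return value if isinstance(value, list) else [value]


def risky_grants(policy):
    """Return the risky actions that the policy allows on Resource "*"."""
    found = set()
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow" or "Action" not in stmt:
            continue
        if "*" not in _as_list(stmt.get("Resource", [])):
            continue
        for pattern in _as_list(stmt["Action"]):
            for action in RISKY_ACTIONS:
                # IAM action matching is case-insensitive and supports * and ?.
                if fnmatchcase(action.lower(), pattern.lower()):
                    found.add(action)
    return sorted(found)
```

A principal flagged here is only a candidate for review: whether escalation is actually possible depends on which roles are attached to the resources that principal can reach.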

Detection and Mitigation Strategies

Detection relies on monitoring CloudTrail for `Stop` → `Update` → `Start` sequences on compute resources, particularly from identities that do not normally perform operational tasks. Prevention involves least-privilege scoping of configuration-modifying actions, Service Control Policies (SCPs) that deny overly broad passing of execution roles, and approval workflows for restarts.
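
One way to enforce least-privilege scoping of configuration-modifying actions is a deny policy with a narrow exception. The sketch below builds such an SCP as a Python dict and prints it as JSON. It is a starting point under stated assumptions: the `platform-admin` role name is hypothetical, and the action list should be adapted to the services in use.

```python
import json

# Hypothetical role allowed to change boot-time configuration; replace with your own.
APPROVED_ROLE_ARN = "arn:aws:iam::*:role/platform-admin"


def build_boot_config_scp(approved_arn=APPROVED_ROLE_ARN):
    """Build an SCP that denies boot-time config changes to all but one role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyBootConfigChanges",
                "Effect": "Deny",
                "Action": [
                    "ec2:ModifyInstanceAttribute",
                    "sagemaker:UpdateNotebookInstance",
                    "sagemaker:CreateNotebookInstanceLifecycleConfig",
                    "sagemaker:UpdateNotebookInstanceLifecycleConfig",
                ],
                "Resource": "*",
                # The deny applies unless the caller is the approved role.
                "Condition": {"ArnNotLike": {"aws:PrincipalArn": approved_arn}},
            }
        ],
    }


print(json.dumps(build_boot_config_scp(), indent=2))
```

Note that `ec2:ModifyInstanceAttribute` covers more than user data (instance type changes, for example), so denying it outright is broad and may disrupt routine operations. Test any such policy in a non-production organizational unit first.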

AWS classifies these as configuration issues under the shared responsibility model, urging teams to audit execution role assumptions rigorously.

Conclusion

As cloud services continue to evolve, so do the tactics employed by malicious actors. Understanding and mitigating privilege escalation techniques within AWS services like EC2 and SageMaker is crucial for maintaining a secure cloud environment. Organizations must remain vigilant, regularly auditing their configurations and permissions to prevent unauthorized access and potential data breaches.