Critical Vulnerability in NVIDIA Container Toolkit Threatens AI Cloud Services

A critical security flaw has been identified in the NVIDIA Container Toolkit, posing a substantial risk to managed AI cloud services. The vulnerability, tracked as CVE-2025-23266 and nicknamed NVIDIAScape by the cloud security firm Wiz, carries a CVSS score of 9.0 out of 10.

The NVIDIA Container Toolkit is a suite of libraries and utilities that facilitate the creation and execution of GPU-accelerated Docker containers. It is widely utilized in AI and machine learning applications to leverage NVIDIA GPUs for enhanced computational performance. The NVIDIA GPU Operator complements this toolkit by automating the deployment of these containers on GPU nodes within Kubernetes clusters.

According to NVIDIA’s advisory, the vulnerability resides in certain hooks used during container initialization, allowing attackers to execute arbitrary code with elevated permissions. Exploitation of this flaw could lead to privilege escalation, data tampering, information disclosure, and denial-of-service attacks.

The issue affects all versions of the NVIDIA Container Toolkit up to and including 1.17.7, as well as the NVIDIA GPU Operator up to and including 25.3.0. NVIDIA has addressed this vulnerability in versions 1.17.8 and 25.3.1, respectively.
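As a rough illustration of the version boundaries above, the following Python sketch compares a version string against the fixed releases named in the advisory. The component keys, the helper names, and the idea of passing in a version string by hand are assumptions made for this example; it is not an NVIDIA tool, and it ignores pre-release suffixes.

```python
import re

# Fixed releases as stated in the article; the dictionary keys are
# illustrative labels chosen for this sketch, not official identifiers.
FIXED_VERSIONS = {
    "container-toolkit": (1, 17, 8),
    "gpu-operator": (25, 3, 1),
}


def parse_version(text):
    """Extract the first x.y.z triple, e.g. 'v1.17.7-1' -> (1, 17, 7).

    Suffixes such as '-rc.1' are ignored, so a release candidate of a
    fixed version would be treated as fixed. That is a simplification.
    """
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no x.y.z version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(component, version_text):
    """Return True if the version is older than the fixed release."""
    return parse_version(version_text) < FIXED_VERSIONS[component]


if __name__ == "__main__":
    print(is_affected("container-toolkit", "1.17.7"))
    print(is_affected("gpu-operator", "v25.3.1"))
```

Tuple comparison keeps the check short: Python compares the major, minor, and patch numbers in order, which matches how these version strings are ranked.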

Wiz’s analysis indicates that approximately 37% of cloud environments are susceptible to this vulnerability. An attacker could exploit this flaw to access, steal, or manipulate sensitive data and proprietary models of other customers sharing the same hardware. The exploit can be executed with a simple three-line Dockerfile, making it alarmingly easy to implement.

The root cause of the vulnerability lies in how the toolkit handles the Open Container Initiative (OCI) createContainer hook. According to Wiz, this hook inherits environment variables from the container image, so by setting the LD_PRELOAD environment variable in a Dockerfile, an attacker can cause the nvidia-ctk hook to load a malicious library. Because the createContainer hook runs with its working directory set to the container’s root filesystem, the malicious library can be loaded directly from the container image, completing the exploit chain.
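Because the attack described above depends on an image setting LD_PRELOAD, one defensive idea is to flag Dockerfiles that do so before they are built or admitted. The Python sketch below is a heuristic for illustration only: the function name and regular expression are assumptions made for this example, it does not handle line continuations, quoted variable names, or values inherited from a base image, and it is no substitute for applying the patched releases.

```python
import re

# Matches ENV instructions that set LD_PRELOAD, in both the
# "ENV KEY=value" and the legacy "ENV KEY value" forms. The instruction
# name is matched case-insensitively; the variable name is not, since
# environment variable names are case-sensitive.
_ENV_LD_PRELOAD = re.compile(r"^\s*(?i:ENV)\s+(?:.*\s)?LD_PRELOAD(?:=|\s)")


def find_ld_preload(dockerfile_text):
    """Return (line_number, stripped_line) pairs for suspicious ENV lines."""
    hits = []
    for number, line in enumerate(dockerfile_text.splitlines(), start=1):
        if _ENV_LD_PRELOAD.match(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical Dockerfile content used only to exercise the check.
    sample = "FROM busybox\nENV LD_PRELOAD=/lib/example.so\n"
    for number, line in find_ld_preload(sample):
        print(f"line {number}: {line}")
```

A check like this could run in a build pipeline as an early warning, with the understanding that a determined attacker can set the variable in ways a line-based scan will miss.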

This discovery follows Wiz’s earlier disclosure of a bypass (CVE-2025-23359) for another NVIDIA Container Toolkit vulnerability (CVE-2024-0132) that could have been exploited to achieve complete host takeover.

Wiz emphasizes that while AI security discussions often focus on advanced, AI-based attacks, traditional infrastructure vulnerabilities in the expanding AI technology stack remain an immediate threat that security teams should prioritize. The firm also stresses that containers should not be relied upon as the sole security barrier. For applications in multi-tenant environments especially, it is advisable to assume the presence of vulnerabilities and add a stronger isolation layer, such as virtualization.