NVIDIA’s Incomplete Patch for Critical Flaw Exposes AI Model Data to Theft

In September 2024, NVIDIA released a security update to address a critical vulnerability, designated as CVE-2024-0132, in its Container Toolkit. This flaw, with a CVSS v3.1 severity score of 9.0, allows attackers to escape container isolation, granting unauthorized access to the host file system and sensitive data. Despite the patch, subsequent analyses revealed that the fix was incomplete, leaving systems vulnerable under certain configurations.

Trend Research’s October 2024 analysis found that versions 1.17.3 and earlier of the NVIDIA Container Toolkit remain exploitable in their default configuration. Version 1.17.4 is also exploitable, but only if the `allow-cuda-compat-libs-from-container` feature is enabled. The underlying issue is a time-of-check time-of-use (TOCTOU) flaw: a resource is validated at one moment and used at a later one, and an attacker who changes it in between can bypass container restrictions, potentially compromising the entire host.
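To make the TOCTOU pattern concrete, the following minimal Python sketch shows the general class of bug, not the toolkit's actual code path or a working exploit. The function names, the file layout, and the `swap` callback (which stands in for an attacker winning the race) are all hypothetical. It assumes a Unix-like system where `os.O_NOFOLLOW` is available.

```python
import os
import tempfile


def check_then_use(root, path, swap=None):
    """Vulnerable pattern: validate a path, then open it by name later."""
    real = os.path.realpath(path)
    if not real.startswith(os.path.realpath(root) + os.sep):  # time of check
        raise PermissionError("path escapes root")
    if swap is not None:
        swap()  # hypothetical hook: the attacker changes the path here
    with open(path) as f:  # time of use: the name is resolved a second time
        return f.read()


def open_without_following(path):
    """Safer pattern: refuse to follow a symlink at the final path component."""
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    with os.fdopen(fd) as f:
        return f.read()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "container_root")
        os.mkdir(root)
        secret = os.path.join(tmp, "host_secret.txt")
        target = os.path.join(root, "data.txt")
        with open(secret, "w") as f:
            f.write("host-only")
        with open(target, "w") as f:
            f.write("harmless")

        def swap():
            # Replace the validated file with a symlink pointing outside root.
            os.remove(target)
            os.symlink(secret, target)

        leaked = check_then_use(root, target, swap)
        try:
            open_without_following(target)
            blocked = False
        except OSError:
            blocked = True
        return leaked, blocked
```

In this simulation the check passes on the harmless file, yet the later open reads the file outside the root, while the `O_NOFOLLOW` variant refuses the swapped symlink. Real mitigations typically go further, for example by validating an already-open file descriptor rather than a path name.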

The incomplete patch poses significant risks, especially for AI-driven industries where proprietary models and data are invaluable. An attacker exploiting this vulnerability could craft malicious container images, deploy them on target systems, and leverage race conditions to access the host file system. This access could be used to execute arbitrary commands with root privileges, leading to full system control.

Compounding the issue, researchers discovered a performance flaw in Docker on Linux that could facilitate denial-of-service (DoS) attacks. The flaw arises when a container is created with multiple mounts that use `bind-propagation=shared`: the corresponding entries stay in the Linux mount table even after the container terminates. As containers are repeatedly created and destroyed, the mount table grows without bound, which exhausts file descriptors, prevents new containers from being created, and drives up CPU usage. In severe cases, users can no longer reach the host over SSH, which effectively locks them out.
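One way to watch for this kind of growth is to count entries in the mount table, which Linux exposes at `/proc/self/mountinfo`. The sketch below is an illustrative monitor, not a tool from the researchers or from Docker. The function names, the sample data, and the default threshold of 1000 are assumptions chosen for the example.

```python
def count_mounts(mountinfo_text):
    """Count mount entries; each non-empty mountinfo line describes one mount."""
    return sum(1 for line in mountinfo_text.splitlines() if line.strip())


def count_shared_mounts(mountinfo_text):
    """Count mounts whose optional fields carry a 'shared:N' peer-group tag."""
    shared = 0
    for line in mountinfo_text.splitlines():
        fields = line.split()
        try:
            # Optional fields sit between field 6 and the '-' separator.
            separator = fields.index("-", 6)
        except ValueError:
            continue  # blank or malformed line
        if any(f.startswith("shared:") for f in fields[6:separator]):
            shared += 1
    return shared


def mount_table_alert(mountinfo_text, threshold=1000):
    """Return True when the table is larger than an (assumed) threshold."""
    return count_mounts(mountinfo_text) > threshold


def read_mountinfo(path="/proc/self/mountinfo"):
    """Read the live mount table on a Linux host."""
    with open(path) as f:
        return f.read()


# Hypothetical sample in mountinfo format, used here instead of a live host.
SAMPLE = """\
22 28 0:21 / /sys rw,nosuid shared:7 - sysfs sysfs rw
28 1 8:1 / / rw,relatime shared:1 - ext4 /dev/sda1 rw
95 28 0:45 / /mnt/private rw,relatime - tmpfs tmpfs rw
"""
```

On a live system, `mount_table_alert(read_mountinfo())` could be run periodically. A count that keeps climbing after containers have exited would be consistent with the behavior described above, although a sensible threshold depends on the host.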

The Docker security team acknowledged that the issue might stem from Docker’s runtime or the Linux kernel’s mount handling. They also emphasized that access to the Docker API is equivalent to root-level privileges on the host. Both Moby and NVIDIA independently reported similar findings, urging immediate attention.

Organizations using NVIDIA’s Container Toolkit or Docker on Linux, particularly those running AI workloads, are at heightened risk. Industries such as healthcare, finance, and autonomous systems, which rely heavily on machine learning, must be especially vigilant. Toolkit versions 1.17.3 and earlier are vulnerable in their default configuration, while version 1.17.4 is exploitable only when the `allow-cuda-compat-libs-from-container` feature is enabled. Docker users also face the DoS threat, which can affect individual systems and, in shared environments, potentially entire clusters.
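The version rules above can be expressed as a small triage helper. This is a sketch under stated assumptions rather than an official check: the function names and result strings are hypothetical, the caller is assumed to already know the installed Toolkit version and whether `allow-cuda-compat-libs-from-container` is enabled, and versions later than 1.17.4 are reported as not covered because the article does not discuss them.

```python
def parse_version(text):
    """Turn '1.17.4', 'v1.17.4' or '1.17.4-1' into a tuple of ints."""
    core = text.strip().lstrip("v").split("-")[0]
    return tuple(int(part) for part in core.split("."))


def assess_exposure(version, compat_libs_enabled):
    """Classify exposure using only the rules stated in the article."""
    v = parse_version(version)
    if v <= (1, 17, 3):
        # Affected regardless of the feature flag.
        return "vulnerable in default configuration"
    if v == (1, 17, 4):
        if compat_libs_enabled:
            return "vulnerable because the feature is enabled"
        return "not exposed according to the article"
    # Later releases are outside what the article describes.
    return "not covered by the article"
```

For example, `assess_exposure("1.17.4", compat_libs_enabled=True)` flags the host, while the same version with the feature disabled does not. Vendor advisories remain the authoritative source for which releases are fixed.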