A Guide to Container Privilege Escalation Vulnerabilities
One of the many promises of containers is isolation.
Through careful use of Linux namespaces and cgroups, containers create a sandbox where you can run processes independently of each other. This, combined with a delightful developer experience, saw containers gain popularity amongst engineers and security folks alike.
But is that really enough?
And can you say for certain that your workloads are safe should one of them be compromised?
Well, in this post, we’ll take every attacker's dream and every sysadmin’s nightmare, explain how and why it happens, and show you ways to prevent it before you end up on the front page of Hacker News. For more Docker container security vulnerabilities, see Top 9 Docker Container Security Vulnerabilities.
Container Isolation and Its Security Gaps
A key detail often overlooked with containers is that they all share the same host kernel. This means each container is only as secure as the next. If the host or a single container is compromised, it could spell disaster for your infrastructure.
This doesn’t mean you should throw out containers altogether. No. The point here is to understand the fine print.
Isolation starts to break down when attackers find a way to escalate privileges. Escalating privileges is one of the most thrilling yet challenging moves in offensive security. It turns a low-privileged foothold into a full system compromise.
Container privilege escalation vulnerabilities make bypassing security boundaries less challenging and far more rewarding because a single container is all it takes to gain access to a host.
This isn't just theory; the infamous 2019 runC vulnerability (CVE-2019-5736) enabled attackers to overwrite the host runC binary and seize host root access.
These attacks are far from flashy, which is why questions about how they actually work keep coming up.
More recently, vulnerabilities like CVE-2024-21626 and subtle layer-based attacks have shown us that even non-root containers can become dangerous when you misuse Linux capabilities, mounts, or filesystem layering.
That’s why, building on our previous guide to Docker vulnerabilities, we’ll explore how these attacks work, the recent CVEs that highlight their growing risks, and how to prevent them.
Recent Container Privilege Escalation Vulnerabilities
While not all container vulnerabilities lead to privilege escalation, the most dangerous ones often do, because they bypass container isolation and hand the attacker elevated privileges.
The following are three recent container privilege escalation vulnerabilities.
CVE-2024-21626: runC Working Directory Breakout
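CVE-2024-21626 is a high-severity container escape vulnerability that affects the widely used container runtime, runC. The vulnerability arises from an internal file descriptor leak: runC could leave a file descriptor that points at a host directory open when it started a container process. An attacker who controls the container's working directory, for example through a malicious image or a runc exec call, could point it at that leaked descriptor (a path of the form /proc/self/fd/<n>) so that the new process starts with its working directory in the host filesystem.
Core mitigations such as the no_new_privs flag, which many container engines rely on for sandboxing, do not address this class of bug, because the process never needs to gain new privileges to reach the host filesystem.
The flaw affects runC versions up to and including v1.1.11 and was fixed in v1.1.12.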
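As a rough illustration of checking for the v1.1.12 fix, the sketch below compares a version string against that release. The helper name runc_is_patched is hypothetical, and the check assumes a sort that supports -V (as GNU coreutils does). Distributions sometimes backport fixes without changing the upstream version number, so treat a "not patched" result as a prompt to read your vendor's advisory rather than as proof of exposure.

```shell
# Hypothetical helper: succeeds if the given runC version is 1.1.12 or newer.
runc_is_patched() {
  # sort -V orders version strings; if the fixed version sorts first
  # (or both are equal), the supplied version is at least 1.1.12.
  fixed="1.1.12"
  lowest="$(printf '%s\n%s\n' "$fixed" "$1" | sort -V | head -n1)"
  [ "$lowest" = "$fixed" ]
}

# Report on the locally installed runC, if there is one.
if command -v runc >/dev/null 2>&1; then
  # The first line of `runc --version` looks like: "runc version 1.1.12"
  installed="$(runc --version | awk '/^runc version/ {print $3}')"
  if runc_is_patched "$installed"; then
    echo "runC $installed is at or above 1.1.12"
  else
    echo "runC $installed is older than 1.1.12; check your vendor advisory"
  fi
fi
```

The same function can be pointed at versions collected from other hosts, for example runc_is_patched 1.1.9.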
CVE-2023-2640 and CVE-2023-32629 in the Ubuntu Kernel OverlayFS Module
OverlayFS is one of the fundamental building blocks of containers and Kubernetes. It layers two directories (lowerdir and upperdir) on a single Linux host and presents them as a single directory, providing better performance for layer-related Docker commands such as docker build and docker commit.
CVE-2023-2640 and CVE-2023-32629 enable a malicious actor to escalate privileges from a container that runs as a non-root user and has a volume mount. This is possible because volume mounts are treated as separate disks and can be used to create an OverlayFS mount that exists outside the container layer. Attackers can then bypass standard container restrictions and set extended file attributes on files in the mounted volume; when those files are copied to the upper layer, the attributes carry over with elevated capabilities (such as CAP_SETUID or CAP_SYS_ADMIN) intact.
CVE-2022-0492
On March 4th, 2022, security researchers uncovered a critical flaw in the Linux kernel's cgroup_release_agent_write function, part of the cgroups v1 release_agent feature, that could allow a container escape and give an attacker control over the entire node on which the container runs.
Cloud providers and Linux distros moved quickly. Patches were issued, and advisories were published. But beneath the surface, complexity remained: unlike other software, the Linux kernel lacks a standardized versioning scheme across distributions.
As a result, applying these patches and upgrading base images took months, during which environments remained susceptible to attack.
Preventing Privilege Escalations in Docker and Docker Compose
The best way for you to prevent privilege escalation attacks from within a Docker container is to configure your container's applications to run as unprivileged users and then make sure the containers don’t have privileged access.
In practice, to prevent privilege escalation in Docker, you need to do the following:
- Lock privilege escalation
- Set a proper non-root user
- Drop Linux kernel capabilities
You can apply these settings in one of two ways: with flags at runtime or in your Docker Compose configuration.
At runtime, use the following flags:
# Replace your-user with your non-root user.
# Add any other options before the image name.
docker run \
  --read-only \
  --security-opt=no-new-privileges \
  --network your-isolated-network \
  --cap-drop ALL \
  --cap-add CHOWN \
  --pids-limit 99 \
  --user=your-user \
  your-app:v1.0.1
With Docker Compose, the configuration will be:
services:
  webapp:
    image: your-app:v1.0.1
    read_only: true
    user: your-user        # your non-root user
    pids_limit: 99
    security_opt:
      - no-new-privileges:true
    networks:
      - your-isolated-network
    cap_drop:
      - ALL
    cap_add:
      - CHOWN
    # OTHER OPTIONS GO HERE
    # ...
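Whichever option you choose, it can help to confirm from inside the running container that the settings took effect. The sketch below reads the relevant fields from a /proc status file; the helper name show_hardening is hypothetical, and it assumes the image ships grep.

```shell
# Hypothetical helper: print the hardening-related fields of a /proc status file.
show_hardening() {
  # $1 = status file to read; defaults to the current process.
  grep -E '^(NoNewPrivs|Seccomp|CapEff):' "${1:-/proc/self/status}"
}
```

Run inside the container, show_hardening /proc/1/status inspects the main process. NoNewPrivs: 1 indicates that the no-new-privileges option is active, Seccomp: 2 indicates that a seccomp filter is applied, and CapEff shows the effective capability set as a hexadecimal bitmask.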
Running Containers that Require root User Access
There are cases (running system utilities, legacy services, etc.) where a container might require root access to run. In these cases, you can remap this user to a less-privileged user on the Docker host.
Remapping a container UID ensures that even though the container thinks it’s running as root, it actually operates as an unprivileged user from the host’s perspective, which reduces the risk of container breakout and host compromise.
You can refer to the Docker documentation on user namespace remapping for detailed steps and best practices.
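As a minimal sketch of what that setup involves, the Docker daemon reads a userns-remap setting from its daemon.json file, and the value default asks Docker to create and use a dedicated remapping user. The helper below only writes an example file to a path you choose; the function name write_userns_remap_example is hypothetical, and it deliberately does not touch /etc/docker/daemon.json.

```shell
# Hypothetical helper: write an example daemon.json that enables user namespace remapping.
write_userns_remap_example() {
  # $1 = path to write the example to (for review, not the live config).
  cat > "$1" <<'EOF'
{
  "userns-remap": "default"
}
EOF
}
```

For example, write_userns_remap_example ./daemon.json.example produces a file you can review and merge into your existing daemon configuration. The daemon has to be restarted for the setting to apply, and existing images and containers are not visible under the remapped namespace, so test this on a non-production host first.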
Preventing Privilege Escalation in Kubernetes
In Kubernetes, the best time to prevent privilege escalation is at container creation, using the spec.containers.securityContext configuration. Once a container has been created with overly permissive privileges, its security context cannot be updated at runtime; you have to delete and recreate the Pod.
The following securityContext secures the app container from privilege escalation, even on Linux kernels that remain unpatched against CVE-2022-0492, mentioned above.
containers:
  - name: app
    image: myapp:bullseye-20230912
    securityContext:
      runAsUser: 1000                  # Use a non-root user ID
      runAsGroup: 1000                 # Use a non-root group ID
      runAsNonRoot: true               # Explicitly enforce non-root execution
      allowPrivilegeEscalation: false  # Prevent the process from gaining more privileges
      readOnlyRootFilesystem: true     # Prevent writes to the root filesystem
      capabilities:
        drop:
          - ALL                        # Drop all capabilities to minimize attack surface
        add:
          - NET_BIND_SERVICE           # Add back only the capabilities the application needs
      seccompProfile:
        type: RuntimeDefault           # Use the container runtime's default seccomp profile
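To check what a running Pod actually ended up with, you can ask the API server for each container's securityContext. The sketch below wraps a kubectl jsonpath query; the helper name show_security_context and the Pod and namespace names in the usage note are placeholders, not part of the original manifest.

```shell
# Hypothetical helper: print each container's name and securityContext for a Pod.
show_security_context() {
  # $1 = Pod name, $2 = namespace (defaults to "default").
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.securityContext}{"\n"}{end}'
}
```

For example, show_security_context my-pod my-namespace prints one line per container. A container with no securityContext shows an empty second column, which is a quick way to spot workloads that were deployed without any hardening.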
Another great way to prevent privilege escalation in Kubernetes is to prevent Pods without a security context from being admitted to the cluster at all. Kubernetes admission controllers intercept requests to the API server (e.g., the creation of a Deployment) and validate that certain conditions are met before the request is processed.
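One built-in option, not covered in detail in this post, is Kubernetes' Pod Security Admission controller, which enforces the Pod Security Standards per namespace through a label. The sketch below is an assumption-laden illustration: the helper name enforce_restricted is hypothetical, and it presumes a cluster version where Pod Security Admission is available and enabled.

```shell
# Hypothetical helper: enforce the "restricted" Pod Security Standard on one namespace.
enforce_restricted() {
  # $1 = namespace; any remaining arguments (e.g. --dry-run=server) go to kubectl.
  namespace="$1"
  shift
  kubectl label --overwrite namespace "$namespace" \
    pod-security.kubernetes.io/enforce=restricted "$@"
}
```

Running enforce_restricted my-namespace --dry-run=server first lets the API server report which existing Pods would violate the policy without changing anything. The restricted standard requires settings such as allowPrivilegeEscalation: false, runAsNonRoot: true, and dropping all capabilities.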
For more advanced use cases, you can write an admission controller from scratch; however, open source solutions like Kyverno allow you to write cluster-wide policies that can patch Pods that do not have a security context set. This typically looks like the following:
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-securitycontext
  annotations:
    policies.kyverno.io/title: Add Default securityContext
    policies.kyverno.io/category: Sample
    policies.kyverno.io/subject: Pod
    policies.kyverno.io/minversion: 1.6.0
    policies.kyverno.io/description: >-
      A Pod securityContext entry defines fields such as the user and group which should be used to run the Pod.
      Sometimes choosing default values for users rather than blocking is a better alternative to not impede such Pod definitions. This policy will mutate a Pod to set `runAsNonRoot`, `runAsUser`, `runAsGroup`, and `fsGroup` fields within the Pod securityContext if they are not already set.
spec:
  rules:
  - name: add-default-securitycontext
    match:
      any:
      - resources:
          kinds:
          - Pod
    mutate:
      patchStrategicMerge:
        spec:
          securityContext:
            +(runAsNonRoot): true
            +(runAsUser): 1000
            +(runAsGroup): 3000
            +(fsGroup): 2000
The manifest above uses Kyverno's ClusterPolicy custom resource to patch Pods via mutate.patchStrategicMerge, adding a Pod-level security context with a non-root user, group, and fsGroup wherever those fields are not already set.
Secure Containers are Built, Not Assumed
While containers provide isolation and a reliable way to build and run applications, the ever-changing nature of container security means you can't rely solely on preventive measures. Instead, you should use a strategy that enables you to respond quickly to incidents as they happen.
That strategy starts with proactive scanning.
With Aikido, you can automatically scan container images to uncover vulnerable packages, outdated runtimes, or risky Dockerfile instructions before they ever reach production. Its open-source dependency scanner extends this protection by checking your libraries for exploitable CVEs and even silently patched vulnerabilities that attackers might use to gain escalated privileges.
These scans are reinforced by IaC and configuration checks that flag insecure Kubernetes or Docker settings, and by CI/CD security gates that enforce least-privilege policies at the pipeline level. And because prevention can never be perfect, Aikido Zen adds runtime protection to detect and respond to abnormal container behavior in real time.