Vendor: containerd Project
Vendor URL: https://containerd.io/
Versions affected: 1.3.x, 1.2.x, 1.4.x, others likely
Systems Affected: Linux
Author: Jeff Dileo
CVE Identifier: CVE-2020-15257
Advisory URL: https://github.com/containerd/containerd/security/advisories/GHSA-36xw-fx78-c5r4
Risk: High (full root container escape for a common container configuration)
containerd is a container runtime underpinning Docker and common Kubernetes
configurations. It handles abstractions related to containerization and
provides APIs to manage container lifecycles. containerd-shim is a binary
spawned by containerd that serves as the parent of a container and which
implements container lifecycle and reconnection logic that it exposes to
containerd through the containerd shim API. This API is exposed over an
abstract namespace Unix domain socket that is accessible from the root network
namespace. Due to this, non-user namespaced containers with host networking
can access this API and cause containerd-shim to perform dangerous actions and
spin up arbitrarily privileged containers, enabling container escapes and
escalation to full root privileges on the host.
- containerd/containerd
  - runtime/v1/shim/client/client.go: WithStart(), newCommand()
  - cmd/containerd-shim/main_unix.go: serve()
  - cmd/containerd-shim/shim_linux.go: newServer()
- containerd/ttrpc (via vendor/github.com/containerd/ttrpc/unixcreds_linux.go)
  - unixcreds_linux.go: UnixSocketRequireSameUser()
An attacker that is able to run or compromise a host network container running
as UID 0 can escape the container, escalate privileges, and compromise the host.
containerd is a core container runtime, which manages runc-based containers,
and is used by Docker (from which it was spun out) and Kubernetes, either
through Docker or directly through the containerd CRI shim. Generally,
containerd exists as a long-running service daemon that exposes gRPC APIs
(e.g. those for containers and tasks) for container lifecycle management operations (e.g. container
execution and supervision, image handling, etc.). To implement its APIs,
containerd does not directly parent the containers that it creates and
oversees on behalf of its clients. Instead, containerd spawns containerd-shim
processes that manage the lifecycle of each container. containerd-shim stays
alive for the course of the container’s life to manage it and invokes
the runc binary directly to spawn and run the container itself.
To serve its own gRPC (actually ttrpc, an embedded gRPC implementation and
wire protocol) APIs (e.g. v1 and v2), containerd-shim listens on an abstract Unix
domain socket. These are Linux-specific Unix domain sockets that use
length-prefixed keys that begin with a null byte and may contain arbitrary
binary sequences. These containerd-shim sockets take different forms across
different containerd versions; however, a common behavior is that they embed a
trailing null byte in the abstract Unix domain socket sun_path key, which
prevents a number of common Unix tools (e.g. socat) from connecting to it.
@/containerd-shim///shim.sock
@/containerd-shim/.sock
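The effect of the trailing null byte can be reproduced with a minimal Python
sketch. It is Linux-only, and the socket name it uses is hypothetical (real
containerd-shim names differ by version); the helper names abstract_key,
can_connect, and demo are introduced here purely for illustration.

```python
import socket
import uuid


def abstract_key(name: str) -> bytes:
    """Build an abstract-namespace sun_path key: leading NUL, name, trailing NUL."""
    return b"\0" + name.encode() + b"\0"


def can_connect(key: bytes) -> bool:
    """Try to connect to an abstract Unix domain socket bound under `key`."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        try:
            client.connect(key)
            return True
        except (ConnectionRefusedError, FileNotFoundError):
            return False


def demo() -> tuple:
    """Bind a listener with a trailing NUL, then connect with and without it."""
    # Hypothetical name, made unique so repeated runs do not collide.
    key = abstract_key("/example-shim/" + uuid.uuid4().hex + ".sock")
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
        server.bind(key)  # a leading NUL selects the abstract namespace on Linux
        server.listen(1)
        exact = can_connect(key)          # full key, trailing NUL included
        stripped = can_connect(key[:-1])  # same name, trailing NUL omitted
    return exact, stripped
```

Because abstract names are matched over their full length, only a client that
supplies the exact key, trailing null byte included, should reach the
listener. Tools that treat the name as a C string cannot express that key.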
While containerd-shim is more than capable of binding and listening on such a
socket itself when passed the --socket CLI flag, it also supports receiving
an arbitrary socket file descriptor from its parent process. containerd uses
this approach and pre-creates and listen(2)s on the abstract Unix domain socket
before the containerd-shim child process is created so that it may be
initialized with a handle to it. containerd-shim then starts its containerd
shim API ttrpc server on the socket. As abstract Unix domain sockets are
otherwise permissionless, containerd-shim uses standard Unix domain socket
features to validate that incoming connections have the same UID and EUID
(effective UID) as the containerd-shim process itself (typically UID:0 and
EUID:0, root).
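A same-user check of this kind can be sketched in Python with the Linux
SO_PEERCRED socket option. This is not the ttrpc implementation; the function
names peer_credentials and require_same_user are hypothetical, and the sketch
only illustrates the general technique of comparing the peer's UID against
the server's own effective UID.

```python
import os
import socket
import struct

# Layout of struct ucred on Linux: pid_t pid, uid_t uid, gid_t gid.
UCRED_FORMAT = "iII"


def peer_credentials(conn: socket.socket) -> tuple:
    """Return (pid, uid, gid) of the peer of a connected Unix domain socket."""
    size = struct.calcsize(UCRED_FORMAT)
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, size)
    return struct.unpack(UCRED_FORMAT, raw)


def require_same_user(conn: socket.socket) -> bool:
    """Accept the peer only if its UID matches this process's effective UID."""
    _pid, uid, _gid = peer_credentials(conn)
    return uid == os.geteuid()
```

Note that such a check identifies only the user, not the process or the
container: any peer presenting a matching UID passes it.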
However, unlike normal Unix domain sockets, which are bound to file paths,
abstract Unix domain sockets are tied to the network namespace of a process.
As a result, containers that use host networking
(e.g. docker run --network host alpine ...) will be able to access these
sockets.
Furthermore, while most containerization platforms run their containers with
a minimal set of Linux capabilities (the constituent privileges of root), they
also do not run the containers in user namespaces, resulting in containers
that run as a privilege-dropped root user. Due to this, such containers run
by default with a host user namespace UID and EUID of 0. This combination
enables such containers to enumerate containerd-shim sockets (e.g. via
netstat -xl or /proc/net/unix) and successfully connect to them.
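The enumeration step can be sketched as a small Python parser over the text
of /proc/net/unix. The function name find_abstract_shim_sockets and the
sample socket names are hypothetical; the sketch assumes the usual column
layout in which the optional eighth field is the socket path and abstract
names are displayed with a leading '@'.

```python
def find_abstract_shim_sockets(proc_net_unix: str) -> list:
    """Return abstract socket names mentioning containerd-shim."""
    results = []
    for line in proc_net_unix.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # unnamed socket: no path column
        path = fields[7]
        if path.startswith("@") and "containerd-shim" in path:
            results.append(path)
    return results


# Illustrative input only; the socket names below are made up.
SAMPLE = """\
Num       RefCount Protocol Flags    Type St Inode Path
0000000000000000: 00000002 00000000 00010000 0001 01 20001 /run/containerd/containerd.sock
0000000000000000: 00000002 00000000 00010000 0001 01 20002 @/containerd-shim/EXAMPLE.sock
0000000000000000: 00000003 00000000 00000000 0001 03 20003
"""
```

On a live system the input would come from reading /proc/net/unix inside the
container, for example find_abstract_shim_sockets(open("/proc/net/unix").read()).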
containerd-shim exposes a number of dangerous APIs that can be used to
escape a container and execute privileged commands. Across the two main
versions of containerd(-shim) in use, 1.2.x and 1.3.x, the following
exploit primitives are exposed to users, among others:
- Arbitrary file reads
- Arbitrary file appends
- Arbitrary file writes
- Arbitrary command execution in the context of containerd-shim (root)
- Creating a container from a runc config.json file
- Starting a created container
As a result, it is trivial for an attacker to compromise the host if they
can reach the containerd shim API.
Abstract namespace Unix domain sockets should not be used to
communicate with containerd-shim. Instead, the connection should be performed
over unnamed Unix domain sockets created with socketpair(2), or Unix domain
sockets bound to a file path, like /run/containerd/containerd.sock and
/run/containerd/containerd.sock.ttrpc. If this is not feasible, stricter
access control checks would need to be performed to validate incoming
shim API clients, and it may be necessary to modify the connection handshake to
provide additional authentication data and/or identification. It should be
noted that it is insufficient to check that the connecting process is not a
child of containerd-shim itself as the process could still connect to the
shim API of a different container’s containerd-shim.
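The socketpair(2) alternative can be illustrated with a minimal Python sketch
in which a parent creates an unnamed socket pair and hands one end to a child
process by file descriptor inheritance. The names CHILD_CODE and run_demo and
the message contents are hypothetical; this is an illustration of the general
approach, not the fix adopted by containerd.

```python
import socket
import subprocess
import sys

# Code run by the child: adopt the inherited descriptor and answer one request.
CHILD_CODE = r"""
import socket, sys
fd = int(sys.argv[1])
with socket.socket(fileno=fd) as conn:
    request = conn.recv(1024)
    conn.sendall(b"child got: " + request)
"""


def run_demo() -> bytes:
    """Talk to a child process over an unnamed socket pair."""
    parent_end, child_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with parent_end:
        child = subprocess.Popen(
            [sys.executable, "-c", CHILD_CODE, str(child_end.fileno())],
            pass_fds=[child_end.fileno()],  # only this descriptor is inherited
        )
        child_end.close()  # the parent keeps only its own end
        parent_end.sendall(b"ping")
        reply = parent_end.recv(1024)
        child.wait()
    return reply
```

Since the pair has no name in any namespace, there is nothing for an
unrelated process to enumerate or connect to; access requires already holding
one of the two descriptors.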
For users running container workloads on vulnerable systems, this issue may
be mitigated by disallowing host networking from any containers that are not
user namespaced, or by ensuring that such containers are run with a non-zero
UID/GID.
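As a rough aid for auditing workloads against this mitigation, the following
Python sketch inspects a parsed runc config.json. The function name
is_exposed is hypothetical, and the logic is a heuristic resting on stated
assumptions: a container with no "network" entry under linux.namespaces
shares the runtime's network namespace, a "user" entry indicates user
namespacing, and a missing process.user.uid is treated as 0.

```python
def is_exposed(config: dict) -> bool:
    """Heuristic: host networking, no user namespace, and UID 0."""
    namespaces = config.get("linux", {}).get("namespaces", [])
    types = {ns.get("type") for ns in namespaces}
    # Caveat: a "network" entry with a "path" joins an existing namespace,
    # which could itself be the host's; this sketch does not resolve paths.
    shares_host_network = "network" not in types
    user_namespaced = "user" in types
    # Assumption: treat an absent UID as root.
    uid = config.get("process", {}).get("user", {}).get("uid", 0)
    return shares_host_network and not user_namespaced and uid == 0


# Illustrative configuration fragment: host networking, root, no user namespace.
HOST_NET_ROOT = {
    "process": {"user": {"uid": 0, "gid": 0}},
    "linux": {"namespaces": [{"type": "pid"}, {"type": "mount"}]},
}
```

A result of True only flags a configuration for closer review; it does not
confirm that a vulnerable containerd-shim is actually reachable.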
Users should update to the newest versions of containerd that include patches
for this issue. Additionally, as any running containers created prior to
updating containerd to a fixed version will remain vulnerable after the update,
users will need to ensure that all containers are fully stopped and then
restarted after the update is completed.
For users who are uncertain about whether CVE-2020-15257 affects them, the
below command can be used to quickly determine if a container created by a
vulnerable version of containerd is still running. If any results are
returned, a vulnerable containerd-shim process is running.
$ cat /proc/net/unix | grep 'containerd-shim' | grep '@'
6/03/20 - NCC Group emailed the security email of the containerd project
([email protected]) asking for a means of secure
communication to disclose vulnerability information
6/03/20 - NCC Group disclosed vulnerability to the containerd project along
with exploit code targeting containerd 1.2.x and 1.3.x
6/04-05/20 - After some initial conversation over email about possible
remediations, communication migrated to GitHub.
6/05/20 - NCC Group discussed the (in)feasibility of relying on
AppArmor/SELinux to remediate this issue.
6/12/20 - NCC Group requests an update.
6/15/20 - Issue is not accepted as a security vulnerability in containerd.
The containerd project indicates that while a fix will be applied, it
will not be backported to in-use branches. A sample patch is shared
with NCC Group.
6/15-16/20 - Further replies and conversation occurred about the aforementioned
patch's implementation and its incompatibility with prior versions
of containerd. NCC Group provided information on an alternate
approach that could work for all versions.
6/19-24/20 - Further development of a patch occurs by a containerd maintainer
who requests and receives permission to make a public pull
request. The implementation follows NCC Group's original
recommendation and would be compatible across containerd versions.
7/10/20 - NCC Group requests an update and an estimate on when the fix will be
merged and applied to older containerd branches.
7/13/20 - A containerd maintainer replies stating that the upcoming 1.4.0
release will forgo having the fix applied, and that instead, it will
be applied as a fix in 1.4.1 and to at least the 1.3.x branch.
9/04/20 - After a lack of updates, NCC Group states an intention to publish a
technical advisory for this issue, and asks if anyone can confirm if
the fix has been applied/backported as the standing pull request was
commented as having been pushed to the future 1.5.x release. NCC
Group also asks for a timeline on when the issue will be fixed and
states that they can wait up to 30 days (10/05/20) or until a fix is
released to publish the advisory since the issue was not accepted as
a vulnerability.
9/10/20 - A containerd maintainer replies stating that the issue is still not
fixed and that the pull request is not likely to be merged soon. They
ask for reconsideration of the backwards-incompatible fix.
9/10/20 - NCC Group replies with concerns about the approach of the
backwards-incompatible fix, including a timing side channel in the
implementation that would enable guessing the authentication secret,
and a bias in the PRNG used to create it.
10/02/20 - A maintainer replies with a potential fix based on verifying that the
PID of the connecting process is on the host mount namespace.
Immediately after this, a containerd security advisor asks if NCC
Group still plans to publish a technical advisory on 10/05/20 and if
they would be open to having a conversation about the issue.
10/02/20 - NCC Group replies raising a concern over a possible race condition
in the underlying mechanism of the potential fix. NCC Group also states
that they can postpone publishing the advisory, and would be happy
to converse about the issue if it would help to have it fixed. Over
email, meeting availability is exchanged.
10/06/20 - NCC Group, a containerd security advisor and two containerd
maintainers discuss the issue in a call and agree on a plan to
remediate the issue as a vulnerability, with patches applied to
supported branches of containerd.
10/06/20-11/04/20 - The containerd project works on implementing the fixes
across several supported protocol versions and backports the
patches to the 1.4.x and 1.3.x branches.
10/16/20 - CVE-2020-15257 is issued for this vulnerability.
11/10-13/20 - NCC Group reviews and tests the patches, and provides feedback
on the changes; no major issues are identified. Subsequent
discussion resolves questions raised in the feedback.
11/13/20 - A follow-up call occurs to discuss disclosure timelines, patch
releases, and embargo dates.
11/13-30/20 - Patches are provided under embargo to vendors and Linux
distributions.
11/19-25/20 - A containerd security maintainer backports the patches to the
end-of-life containerd 1.2.x for Linux distributions using that
version. After discussion and analysis, a backport based on
similar patches provided by Canonical and Google is selected for
merging into the 1.2.x branch.
11/30/20 - containerd publishes a security advisory for this issue,
CVE-2020-15257.
11/30/20 - NCC Group publishes this security advisory following the containerd
publication.
Michael Crosby, Samuel Karp, and Derek McGowan of the containerd project.
NCC Group is a global expert in cyber security and risk mitigation,
working with businesses to protect their brand, value and reputation
against the ever-evolving threat landscape.
With our knowledge, experience and global footprint, we are best
placed to help businesses identify, assess, mitigate and respond to
the risks they face.
We are passionate about making the Internet safer and revolutionizing
the way in which organizations think about cybersecurity.