Vendor: containerd Project
Vendor URL: https://containerd.io/
Versions affected: 1.3.x, 1.2.x, 1.4.x, others likely
Systems Affected: Linux
Author: Jeff Dileo
CVE Identifier: CVE-2020-15257
Advisory URL: https://github.com/containerd/containerd/security/advisories/GHSA-36xw-fx78-c5r4
Risk: High (full root container escape for a common container configuration)

containerd is a container runtime underpinning Docker and common Kubernetes
configurations. It handles abstractions related to containerization and
provides APIs to manage container lifecycles. containerd-shim is a binary
spawned by containerd that serves as the parent of a container and which
implements container lifecycle and reconnection logic that it exposes to
containerd through the containerd shim API. This API is exposed over an
abstract namespace Unix domain socket that is accessible from the root network
namespace. Due to this, non-user namespaced containers with host networking
can access this API and cause containerd-shim to perform dangerous actions and
spin up arbitrarily privileged containers, enabling container escapes and
escalation to full root privileges on the host.

  • containerd/containerd
    • runtime/v1/shim/client/client.go: WithStart(), newCommand()
    • cmd/containerd-shim/main_unix.go: serve()
    • cmd/containerd-shim/shim_linux.go: newServer()
  • containerd/ttrpc (via vendor/github.com/containerd/ttrpc/unixcreds_linux.go)
    • unixcreds_linux.go: UnixSocketRequireSameUser()

An attacker who is able to run or compromise a host network container running
as UID 0 can escape the container, escalate privileges, and compromise the host.

containerd is a core container runtime, which manages runc-based containers,
and is used by Docker (from which it was spun out) and Kubernetes, either
through Docker or directly through the containerd CRI shim. Generally,
containerd exists as a long-running service daemon that exposes gRPC APIs
(e.g. those for containers and tasks) for container lifecycle management operations (e.g. container
execution and supervision, image handling, etc.). To implement its APIs,
containerd does not directly parent the containers that it creates and
oversees on behalf of its clients. Instead, containerd spawns containerd-shim
processes that manage the lifecycle of each container. containerd-shim stays
alive for the course of the container’s life to manage it and directly invokes
the runc binary to spawn and run the container itself.

To serve its own gRPC (actually ttrpc, an embedded gRPC implementation and
wire protocol) APIs (e.g. v1 and v2), containerd-shim listens on an abstract Unix
domain socket. These are Linux-specific Unix domain sockets that use
length-prefixed keys that begin with a null byte and may contain arbitrary
binary sequences. These containerd-shim sockets take different forms across
different containerd versions; however, a common behavior is that they embed a
trailing null byte in the abstract Unix domain socket sun_path key, which
prevents a number of common Unix tools (e.g. socat) from connecting to it.

  • @/containerd-shim///shim.sock
  • @/containerd-shim/.sock
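As a rough illustration of the addressing scheme described above, the
following Python sketch binds and connects to an abstract Unix domain socket
whose name ends in a null byte. It is Linux-only. The socket name is a
hypothetical placeholder rather than a name that containerd generates, and
the helper names are assumptions made for this example.

```python
import socket


def abstract_address(name: str) -> bytes:
    """Build an abstract-namespace address: leading NUL, name, trailing NUL."""
    return b"\0" + name.encode() + b"\0"


def demo(name: str = "/example-shim/demo.sock") -> bytes:
    """Bind an abstract socket, connect to it, and return the bound address.

    The default name is a hypothetical placeholder for illustration only.
    """
    addr = abstract_address(name)
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # No file is created on disk; the name lives in the network namespace.
        server.bind(addr)
        server.listen(1)
        client.connect(addr)
        return server.getsockname()
    finally:
        client.close()
        server.close()
```

Because the address is passed as a length-delimited byte string, the trailing
null byte is part of the name. Tools that treat the name as a C string drop
it and therefore look up a different socket.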

While containerd-shim is more than capable of binding and listening on such a
socket itself when passed the --socket CLI flag, it also supports receiving
an arbitrary socket file descriptor from its parent process. containerd uses
this approach and pre-creates and listen(2)s on the abstract Unix domain socket
before the containerd-shim child process is created so that it may be
initialized with a handle to it. containerd-shim then starts its containerd
shim API ttrpc server on the socket. As abstract Unix domain sockets are
otherwise permissionless, containerd-shim uses standard Unix domain socket
features to validate that incoming connections have the same UID and EUID
(effective UID) as the containerd-shim process itself (typically UID:0 and
EUID:0, root).

However, unlike normal Unix domain sockets, which are bound to file paths,
abstract Unix domain sockets are tied to the network namespace of a process.
As a result, containers that use host networking
(e.g. docker run --network host alpine ...) are able to access these sockets.
Furthermore, while most containerization platforms run their containers with
a minimal set of Linux capabilities (the constituent privileges of root), they
also do not run the containers in user namespaces, resulting in containers
that run as a privilege-dropped root user. Due to this, such containers run
by default with a host user namespace UID and EUID of 0. This combination
enables such containers to enumerate containerd-shim sockets (e.g. via
netstat -xl or /proc/net/unix) and successfully connect to them.
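The enumeration step mentioned above can be sketched as a small parser for
the /proc/net/unix table. The function names and the sample rows are
hypothetical and exist only to illustrate the approach; the sketch assumes
the usual eight-column layout in which abstract socket names are shown with a
leading '@'.

```python
def abstract_shim_sockets(proc_net_unix: str):
    """Return abstract socket names that mention containerd-shim."""
    names = []
    for line in proc_net_unix.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # unnamed socket: the Path column is absent
        path = fields[7]
        if path.startswith("@") and "containerd-shim" in path:
            names.append(path)
    return names


def scan(path: str = "/proc/net/unix"):
    """Read the live table and return matching abstract socket names."""
    with open(path, encoding="utf-8", errors="replace") as f:
        return abstract_shim_sockets(f.read())


# Hypothetical sample rows, not captured from a real system.
SAMPLE = (
    "Num       RefCount Protocol Flags    Type St Inode Path\n"
    "0000000000000000: 00000002 00000000 00010000 0001 01 11111 /run/example.sock\n"
    "0000000000000000: 00000002 00000000 00010000 0001 01 22222 @/containerd-shim/example.sock\n"
    "0000000000000000: 00000003 00000000 00000000 0001 03 33333\n"
)
```

Calling abstract_shim_sockets(SAMPLE) returns only the '@'-prefixed entry; the
path-bound and unnamed sockets are ignored.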

containerd-shim exposes a number of dangerous APIs that can be used to
escape a container and execute privileged commands. Across the two main
versions of containerd(-shim) in use, 1.2.x and 1.3.x, the following
exploit primitives are exposed to users, among others:

  • Arbitrary file reads
  • Arbitrary file appends
  • Arbitrary file writes
  • Arbitrary command execution in the context of containerd-shim (root)
  • Creating a container from a runc config.json file
  • Starting a created container

As a result, it is trivial for an attacker to compromise the host if they
can reach the containerd shim API.

Abstract namespace Unix domain sockets should not be used to
communicate with containerd-shim. Instead, the connection should be performed
over unnamed Unix domain sockets created with socketpair(2), or Unix domain
sockets bound to a file path, like /run/containerd/containerd.sock and
/run/containerd/containerd.sock.ttrpc. If this is not feasible, stricter
access control checks would need to be performed to validate incoming
shim API clients, and it may be necessary to modify the connection handshake to
provide additional authentication data and/or identification. It should be
noted that it is insufficient to check that the connecting process is not a
child of containerd-shim itself as the process could still connect to the
shim API of a different container’s containerd-shim.
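The two recommended alternatives can be sketched in Python as follows. This
is an illustration under stated assumptions and not the patch that containerd
adopted: the function names, the default file name, and the 0600 mode are
choices made for this example.

```python
import os
import socket
from typing import Tuple


def bind_private_socket(directory: str,
                        name: str = "shim.sock") -> Tuple[socket.socket, str]:
    """Bind a path-based Unix socket that only its owner can connect to.

    Access is governed by filesystem permissions and by the mount namespace,
    not by the network namespace.
    """
    path = os.path.join(directory, name)
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    # The umask is process-wide, so this is not safe with concurrent threads.
    old_umask = os.umask(0o177)  # socket file is created with mode 0600
    try:
        sock.bind(path)
        sock.listen(1)
    except OSError:
        sock.close()
        raise
    finally:
        os.umask(old_umask)
    return sock, path


def unnamed_pair() -> Tuple[socket.socket, socket.socket]:
    """Create a connected, unnamed socket pair via socketpair(2).

    Neither end has an address, so no other process can connect to it; one
    end would be inherited by or passed to the child process.
    """
    return socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
```

With the path-bound variant, a container reaches the socket only if the path
is visible in its mount namespace and the file permissions allow it. With the
unnamed pair there is no name to look up at all.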

For users running container workloads on vulnerable systems, this issue may
be mitigated by disallowing host networking from any containers that are not
user namespaced, or by ensuring that such containers are run with a non-zero
UID/GID.
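The exposure condition behind this mitigation can be written as a simple
predicate. The ContainerSettings type and its field names are hypothetical and
do not correspond to any Docker or Kubernetes API; the sketch models only the
UID, not the GID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainerSettings:
    """Hypothetical summary of the settings relevant to this issue."""
    host_network: bool      # shares the host network namespace
    user_namespaced: bool   # runs inside a user namespace
    uid: int                # UID of the container's processes


def can_reach_shim_api(c: ContainerSettings) -> bool:
    """True if the container meets every condition described in the advisory."""
    return c.host_network and not c.user_namespaced and c.uid == 0
```

Changing any one of the three settings makes the predicate false, which is
why either removing host networking or running with a non-zero UID mitigates
the issue.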

Users should update to the newest versions of containerd that include patches
for this issue. Additionally, as any running containers created prior to
updating containerd to a fixed version will remain vulnerable after the update,
users will need to ensure that all containers are fully stopped and then
restarted after the update is completed.

For users who are uncertain about whether CVE-2020-15257 affects them, the
below command can be used to quickly determine if a container created by a
vulnerable version of containerd is still running. If any results are
returned, a vulnerable containerd-shim process is running.

$ cat /proc/net/unix | grep 'containerd-shim' | grep '@'

6/03/20 - NCC Group emailed the security email address of the containerd
          project asking for a means of secure communication to disclose
          vulnerability information
6/03/20 - NCC Group disclosed vulnerability to the containerd project along
          with exploit code targeting containerd 1.2.x and 1.3.x
6/04-05/20 - After some initial conversation over email about possible
             remediations, communication migrated to GitHub.
6/05/20 - NCC Group discussed the (in)feasibility of relying on
          AppArmor/SELinux to remediate this issue.
6/12/20 - NCC Group requests an update.
6/15/20 - Issue is not accepted as a security vulnerability in containerd.
          The containerd project indicates that while a fix will be applied, it
          will not be backported to in-use branches. A sample patch is shared
          with NCC Group.
6/15-16/20 - Further replies and conversation occurred about the aforementioned
             patch's implementation and its incompatibility with prior versions
             of containerd. NCC Group provided information on an alternate
             approach that could work for all versions.
6/19-24/20 - Further development of a patch occurs by a containerd maintainer
             who requests and receives permission to make a public pull
             request. The implementation follows NCC Group's original
             recommendation and would be compatible across containerd versions.
7/10/20 - NCC Group requests an update and an estimate on when the fix will be
          merged and applied to older containerd branches.
7/13/20 - A containerd maintainer replies stating that the upcoming 1.4.0
          release will forgo having the fix applied, and that instead, it will
          be applied as a fix in 1.4.1 and to at least the 1.3.x branch.
9/04/20 - After a lack of updates, NCC Group states an intention to publish a
          technical advisory for this issue, and asks if anyone can confirm if
          the fix has been applied/backported as the standing pull request was
          commented as having been pushed to the future 1.5.x release. NCC
          Group also asks for a timeline on when the issue will be fixed and
          states that they can wait up to 30 days (10/05/20) or until a fix is
          released to publish the advisory since the issue was not accepted as
          a vulnerability.
9/10/20 - A containerd maintainer replies stating that the issue is still not
          fixed and that the pull request is not likely to be merged soon. They
          ask for reconsideration of the backwards-incompatible fix.
9/10/20 - NCC Group replies with concerns about the approach of the
          backwards-incompatible fix, including a timing side channel in the
          implementation that would enable guessing the authentication secret,
          and a bias in the PRNG used to create it.
10/02/20 - A maintainer replies with a potential fix based on verifying that the
           PID of the connecting process is on the host mount namespace.
           Immediately after this, a containerd security advisor asks if NCC
           Group still plans to publish a technical advisory on 10/05/20 and if
           they would be open to having a conversation about the issue.
10/02/20 - NCC Group replies raising a concern over a possible race condition
           in the underlying mechanism of the potential fix. NCC Group also states
           that they can postpone publishing the advisory, and would be happy
           to converse about the issue if it would help to have it fixed. Over
           email, meeting availability is exchanged.
10/06/20 - NCC Group, a containerd security advisor and two containerd
           maintainers discuss the issue in a call and agree on a plan to
           remediate the issue as a vulnerability, with patches applied to
           supported branches of containerd.
10/06-11/04/20 - The containerd project works on implementing the fixes
                 across several supported protocol versions and backports the
                 patches to the 1.4.x and 1.3.x branches.
10/16/20 - CVE-2020-15257 is issued for this vulnerability.
11/10-13/20 - NCC Group reviews and tests the patches, and provides feedback
              on the changes; no major issues are identified. Subsequent
              discussion resolves questions raised in the feedback.
11/13/20 - A follow-up call occurs to discuss disclosure timelines, patch
           releases, and embargo dates.
11/13-30/20 - Patches are provided under embargo to vendors and Linux
              distributions.
11/19-25/20 - A containerd security maintainer backports the patches to the
              end-of-life containerd 1.2.x for Linux distributions using that
              version. After discussion and analysis, a backport based on
              similar patches provided by Canonical and Google is selected for
              merging into the 1.2.x branch.
11/30/20 - containerd publishes a security advisory for this issue,
           CVE-2020-15257.
11/30/20 - NCC Group publishes this security advisory following the containerd
           publication.

Michael Crosby, Samuel Karp, and Derek McGowan of the containerd project.

NCC Group is a global expert in cyber security and risk mitigation,
working with businesses to protect their brand, value and reputation
against the ever-evolving threat landscape.

With our knowledge, experience and global footprint, we are best
placed to help businesses identify, assess, mitigate and respond to
the risks they face.

We are passionate about making the Internet safer and revolutionizing
the way in which organizations think about cybersecurity.