Kubernetes securityContext


Over the last few months I have been trying to make the Kubernetes clusters I work with as secure as possible, with the goal of following standards and regulations, but also making infrastructure and workloads tough to hijack.

One of the simplest areas of focus has been the securityContext of Kubernetes pods and containers. Simple as it is, I have been surprised recently to discover how many widely used Helm charts do not support proper configuration of this workload attribute. I have been trying to add proper configuration blocks to whatever open-source tools I have been using…

The trigger for this discovery has been the excellent kyverno-policy-reporter UI, which makes it stupid simple to identify such issues. Of course, it is far better for production-type workloads & environments to actually have admission webhooks and enforce what pods can be scheduled onto the cluster, but for figuring out how to secure said pods and for testing purposes a tool like this works fine.

Let me explain why securityContext matters, share some good practices and point to some additional reading material that is easy to digest.

Pod securityContext

The securityContext block of a pod supports the following fields:

  • fsGroup : If set, this GID is applied to all containers of a pod as a supplemental group, and the volumes that support ownership management are changed to be owned by it.

  • fsGroupChangePolicy : Defines how the ownership and permissions of a volume are changed before it gets exposed to the pod. It has no effect on secret, configMap and emptyDir volumes. It supports two values, OnRootMismatch and Always (default). OnRootMismatch means that the volume’s ownership and permissions will be changed only if those of the volume’s root directory don’t already match the expected ones.

  • runAsNonRoot : If set to true, the kubelet will validate the image at runtime to ensure it does not run as root (UID 0). If it does, then the container will fail to start.

  • runAsUser : Takes a UID as value, which defines the user that runs the container process. If it is not set, it defaults to the user specified in the image metadata.

  • runAsGroup : Same as above, defines the GID of the container process.

  • seLinuxOptions : Sets the SELinux context for all containers. Supports attributes: level, role, type, user.

  • seccompProfile : Sets the Seccomp profile for all containers. Supports attributes: localhostProfile and type. The type field can be set to RuntimeDefault if you are unfamiliar with Seccomp but still want to get some of its benefits.

  • supplementalGroups : This field takes a list of GIDs that are applied to the container’s process in addition to its primary GID. This way, one can give the process additional permissions.

  • sysctls : In this field, one can define kernel parameters, using the sysctl interface.

  • windowsOptions : Windows settings for all containers.
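
To make the volume- and group-related fields above more concrete, here is a minimal sketch of a pod that uses them. The pod name, image, claim name and all IDs are placeholders I made up for illustration, and the sysctl shown is just one example of a namespaced kernel parameter; adapt everything to your own workload.

```yaml
# Illustrative sketch only: names, image, PVC and IDs are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: securitycontext-demo              # hypothetical name
spec:
  securityContext:
    runAsUser: 10001
    runAsGroup: 10001
    fsGroup: 10001                        # supported volumes become group-owned by GID 10001
    fsGroupChangePolicy: OnRootMismatch   # skip the ownership change if the volume root already matches
    supplementalGroups:
    - 20001                               # extra GID added to the container processes
    sysctls:
    - name: net.ipv4.ip_local_port_range  # example of a namespaced sysctl
      value: "1024 65535"
  containers:
  - name: app
    image: registry.example.com/app:1.0   # placeholder image
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: app-data                 # hypothetical PVC
```

OnRootMismatch is mostly interesting for large volumes, where changing ownership on every pod start can slow startup noticeably.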

Good practices for podSecurityContext

securityContext:
  fsGroup: 10001 # Set to a non-0 GID
  runAsNonRoot: true
  runAsUser: 10001 # Set to a non-0 UID
  runAsGroup: 10001 # Set to a non-0 GID
  seccompProfile:
    type: RuntimeDefault

I usually avoid enabling SELinux on the containers, for the simple (and shameful) reason that I find it too much of a headache to configure and troubleshoot. Note that setting this requires that the host also supports SELinux.

You should check what user each image runs as (if any), to avoid any startup issues.

Container securityContext

The securityContext block of a container supports overriding some of the properties set in the podSecurityContext, and provides additional security constraints for the container process. The supported fields are:

  • allowPrivilegeEscalation : This boolean defines whether a container process can gain more privileges than its parent process.

  • privileged : If set to true, the container runs in privileged mode, and its processes are essentially equivalent to root on the host.

  • capabilities : Defines POSIX capabilities to add or drop.

  • procMount : Defines the type of proc mount to use. By default, the container runtime defaults are used, which means that parts of /proc will be read-only and masked.

  • readOnlyRootFilesystem : Defines whether the container’s root filesystem will be read-only.

  • runAsUser, runAsGroup & runAsNonRoot : Same as described in podSecurityContext.

  • seLinuxOptions : Same as described in podSecurityContext.

  • seccompProfile : Same as described in podSecurityContext.

  • windowsOptions : Same as described in podSecurityContext.
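
As a rough illustration of how container-level values take precedence over pod-level ones, consider the following fragment of a pod spec. Container names, images and UIDs are placeholders, and NET_BIND_SERVICE is only an example of a capability a process might legitimately need.

```yaml
# Fragment of a pod spec; names, images and IDs are placeholders.
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 10001              # pod-wide default
  containers:
  - name: app
    image: registry.example.com/app:1.0
    # No container securityContext, so this container runs as UID 10001.
  - name: sidecar
    image: registry.example.com/sidecar:1.0
    securityContext:
      runAsUser: 10002            # container-level value wins over the pod-level one
      capabilities:
        drop:
        - ALL                     # start from zero capabilities
        add:
        - NET_BIND_SERVICE        # add back only what the process needs
```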

Good practices for containerSecurityContext

securityContext:
  allowPrivilegeEscalation: false
  privileged: false
  capabilities:
    drop:
    - ALL
  readOnlyRootFilesystem: true

Use this block in combination with the podSecurityContext block above, which sets the properties that are not repeated here.
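
Put together, the two good-practice blocks could look roughly like this in a Deployment. This is a sketch under assumptions: the names and image are placeholders, and the emptyDir mounted at /tmp is my own addition, on the assumption that the application needs a writable scratch directory once the root filesystem is read-only.

```yaml
# Sketch only: names and image are placeholders, adjust IDs to your image.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hardened-app                        # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hardened-app
  template:
    metadata:
      labels:
        app: hardened-app
    spec:
      securityContext:                      # pod-level block
        fsGroup: 10001
        runAsNonRoot: true
        runAsUser: 10001
        runAsGroup: 10001
        seccompProfile:
          type: RuntimeDefault
      containers:
      - name: app
        image: registry.example.com/app:1.0 # placeholder image
        securityContext:                    # container-level block
          allowPrivilegeEscalation: false
          privileged: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
        volumeMounts:
        - name: tmp
          mountPath: /tmp                   # writable path, assumed to be needed by the app
      volumes:
      - name: tmp
        emptyDir: {}
```

If the application writes anywhere else on the root filesystem, it will fail with a read-only error, so expect to add a volume mount per writable path.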

Better resources than this post

Here is some reading material: