Protecting the infrastructure (hosts)¶
Securing your container images is important, but it is equally important to safeguard the infrastructure that runs them. This section explores ways to mitigate risks from attacks launched directly against the host. Use these guidelines in conjunction with those outlined in the Runtime Security section.
Use an OS optimized for running containers¶
Consider using Flatcar Linux, Project Atomic, RancherOS, or Bottlerocket (currently in preview), a special-purpose OS from AWS designed for running Linux containers. Bottlerocket includes a reduced attack surface, a disk image that is verified on boot, and enforced permission boundaries using SELinux.
Treat your infrastructure as immutable and automate the replacement of your worker nodes¶
Rather than performing in-place upgrades, replace your workers when a new patch or update becomes available. This can be approached in a couple of ways. You can add instances to an existing autoscaling group using the latest AMI while you sequentially cordon and drain nodes until all of the nodes in the group have been replaced with the latest AMI. Alternatively, you can add instances to a new node group while you sequentially cordon and drain nodes from the old node group until all of the nodes have been replaced. EKS managed node groups use the second approach and will present an option to upgrade workers when a new AMI becomes available.
eksctl also has a mechanism for creating node groups with the latest AMI and for gracefully cordoning and draining pods from node groups before the instances are terminated. If you decide to use a different method for replacing your worker nodes, it is strongly recommended that you automate the process to minimize the risk of human error, as you will likely need to replace workers regularly as new updates/patches are released and when the control plane is upgraded.
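As a rough illustration of the second approach with eksctl, the following config file sketches a replacement node group. The cluster name, region, node group names, instance type, and sizes are all hypothetical placeholders, and the commands in the comments are one possible sequence rather than a prescribed procedure.

```yaml
# cluster.yaml: a minimal sketch, assuming an existing cluster named
# "example-cluster" with an older node group named "workers-v1".
# All names and sizes here are placeholders.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: example-cluster
  region: us-west-2

nodeGroups:
  # New node group that will take over from workers-v1.
  - name: workers-v2
    instanceType: m5.large
    desiredCapacity: 3
    minSize: 3
    maxSize: 6

# Possible sequence (verify against the eksctl version you run):
#   eksctl create nodegroup --config-file=cluster.yaml
#   eksctl drain nodegroup --cluster=example-cluster --name=workers-v1
#   eksctl delete nodegroup --cluster=example-cluster --name=workers-v1
```

Running these steps from a pipeline instead of by hand keeps the replacement repeatable each time a new AMI is released.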
With EKS Fargate, AWS will automatically update the underlying infrastructure as updates become available. Oftentimes this can be done seamlessly, but there may be times when an update will cause your pod to be rescheduled. Hence, we recommend that you create deployments with multiple replicas when running your application as a Fargate pod.
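A minimal sketch of such a deployment is shown below. The names, namespace, image, and port are hypothetical, and the namespace is assumed to be matched by an existing Fargate profile.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
  namespace: fargate-apps        # placeholder; assumed to match a Fargate profile
spec:
  replicas: 3                    # more than one replica, so a rescheduled pod does not take the app offline
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
        - name: app
          image: example.com/example-app:1.0   # placeholder image
          ports:
            - containerPort: 8080
```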
Periodically run kube-bench to verify compliance with CIS benchmarks for Kubernetes¶
When running kube-bench against an EKS cluster, follow these instructions from Aqua Security, https://github.com/aquasecurity/kube-bench#running-in-an-eks-cluster.
False positives may appear in the report because of the way the EKS optimized AMI configures the kubelet. The issue is currently being tracked on GitHub.
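To run the check periodically, one option is to wrap kube-bench in a CronJob. The manifest below is only a sketch: the schedule, image tag, and host mounts are assumptions loosely modeled on the upstream job manifests, and the instructions from Aqua Security linked above remain the authoritative reference. Note that each run of a CronJob lands on a single node, so it samples the fleet rather than checking every worker.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: kube-bench
spec:
  schedule: "0 3 * * 0"          # assumed schedule: weekly, Sunday at 03:00
  jobTemplate:
    spec:
      template:
        spec:
          hostPID: true          # kube-bench inspects host processes such as the kubelet
          restartPolicy: Never
          containers:
            - name: kube-bench
              image: aquasec/kube-bench:latest   # assumption; pin a specific version in practice
              command: ["kube-bench", "run", "--targets", "node"]
              volumeMounts:
                - name: var-lib-kubelet
                  mountPath: /var/lib/kubelet
                  readOnly: true
                - name: etc-systemd
                  mountPath: /etc/systemd
                  readOnly: true
                - name: etc-kubernetes
                  mountPath: /etc/kubernetes
                  readOnly: true
          volumes:
            - name: var-lib-kubelet
              hostPath:
                path: /var/lib/kubelet
            - name: etc-systemd
              hostPath:
                path: /etc/systemd
            - name: etc-kubernetes
              hostPath:
                path: /etc/kubernetes
```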
Minimize access to worker nodes¶
Instead of enabling SSH access, use SSM Session Manager when you need to remote into a host. Unlike SSH keys, which can be lost, copied, or shared, Session Manager allows you to control access to EC2 instances using IAM. Moreover, it provides an audit trail and log of the commands that were run on the instance.
At present, you cannot use custom AMIs with Managed Node Groups or modify the EC2 launch template for managed workers. This presents a "chicken and egg" problem: how can you use the SSM agent to remotely access these instances without first using SSH to install the SSM agent? As a temporary stop-gap, you can use a privileged DaemonSet to run a shell script that installs the SSM agent.
Since the DaemonSet runs as a privileged pod, you should consider deleting it once the SSM agent is installed on your worker nodes. This workaround will no longer be necessary once Managed Node Groups adds support for custom AMIs and EC2 launch templates.
Deploy workers onto private subnets¶
By deploying workers onto private subnets, you minimize their exposure to the Internet, where attacks often originate. Beginning April 22, 2020, the assignment of public IP addresses to nodes in a managed node group will be controlled by the subnet they are deployed onto. Prior to this, nodes in a Managed Node Group were automatically assigned a public IP. If you choose to deploy your worker nodes onto public subnets, implement restrictive AWS security group rules to limit their exposure.
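If you create node groups with eksctl, the config file offers a setting for this. The fragment below is a sketch with hypothetical cluster and node group names; it assumes the cluster's VPC already has private subnets with a route for outbound traffic.

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: example-cluster          # placeholder
  region: us-west-2              # placeholder

nodeGroups:
  - name: private-workers        # placeholder
    instanceType: m5.large
    desiredCapacity: 3
    privateNetworking: true      # launch the nodes in the private subnets only
```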
Run Amazon Inspector to assess hosts for exposure, vulnerabilities, and deviations from best practices¶
Inspector requires the deployment of an agent that continually monitors activity on the instance while using a set of rules to assess alignment with best practices.
At present, managed node groups do not allow you to supply user data or your own AMI. If you want to run Inspector on managed workers, you will need to install the agent after the node has been bootstrapped.
Inspector cannot be run on the infrastructure used to run Fargate pods.
Available on RHEL and CoreOS instances
SELinux provides an additional layer of security to keep containers isolated from each other and from the host. SELinux allows administrators to enforce mandatory access controls (MAC) for every user, application, process, and file. Think of it as a backstop that restricts access to specific resources on the operating system based on a set of labels. On EKS it can be used to prevent containers from accessing each other's resources.
SELinux has a module called container_t, which is the general module used for running containers. When you configure SELinux for Docker, Docker automatically labels workloads with container_t as a type and gives each container a unique MCS level. This alone will effectively isolate containers from one another. If you need to give a container more privileged access, you can create your own profile in SELinux that grants it permissions to specific areas of the file system. This is similar to PSPs in that you can create different profiles for different containers/pods. For example, you can have a profile for general workloads with a set of restrictive controls and another for things that require privileged access.
You can assign SELinux labels using the pod or container security context, as in the following:

    securityContext:
      seLinuxOptions:
        # Provide a unique MCS label per container
        # You can specify user, role, and type also
        # enforcement based on type and level (sVirt)
        level: s0:c144:c154
In this example, s0:c144:c154 corresponds to an MCS label assigned to a file that the container is allowed to access.
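For context, here is a sketch of where that fragment sits in a complete Pod manifest. The pod name, image, and the second MCS level are hypothetical. When seLinuxOptions is set at both the pod and the container level, the container-level value takes precedence.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: selinux-example                      # placeholder
spec:
  securityContext:
    seLinuxOptions:
      level: "s0:c144:c154"                  # default for every container in the pod
  containers:
    - name: app
      image: example.com/example-app:1.0     # placeholder image
      # No container-level override: inherits s0:c144:c154 from the pod.
    - name: sidecar
      image: example.com/example-sidecar:1.0 # placeholder image
      securityContext:
        seLinuxOptions:
          type: container_t
          level: "s0:c200:c300"              # hypothetical level that overrides the pod default
```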
On EKS you could create policies that allow privileged containers such as Fluentd to run, and create an SELinux policy that allows them to read from /var/log on the host.
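The Fluentd case might look roughly like the sketch below. The SELinux type fluentd_log_reader_t is a hypothetical custom type; it would have to be defined in a policy module and loaded on every node before the pod could start with it. The image name is also a placeholder, and in practice a log collector would usually run as a DaemonSet rather than a single Pod.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: fluentd-example                  # placeholder
spec:
  containers:
    - name: fluentd
      image: example.com/fluentd:1.0     # placeholder image
      securityContext:
        seLinuxOptions:
          type: fluentd_log_reader_t     # hypothetical type, assumed to permit reads under /var/log
      volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true                 # the collector only needs to read host logs
  volumes:
    - name: varlog
      hostPath:
        path: /var/log
```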
SELinux will ignore containers where the type is unconfined.
- SELinux Kubernetes RBAC and Shipping Security Policies for On-prem Applications
- Iterative Hardening of Kubernetes
- Generate SELinux policies for containers with Udica: describes a tool that looks at container spec files for Linux capabilities, ports, and mount points, and generates a set of SELinux rules that allow the container to run properly.