Mutating Webhook Latency
Increased webhook latency leads to an increase in K8s API server latency (for example, delayed pod creation if the webhook is set up for pod creation). Pod creation latency in turn is propagated to the K8s job controller, whose workers now experience delays in creating jobs, leading to a growing job worker queue depth. A larger queue depth leads to lower throughput in the number of concurrent EMR on EKS jobs.
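One way to gauge this is to look at the admission webhook latency histograms the API server already exposes. The following is a minimal sketch, assuming kubectl access to the cluster; it only filters the standard apiserver_admission_webhook_admission_duration_seconds metric from the raw metrics output.

```python
# Sketch: inspect mutating webhook admission latency from the API server's
# Prometheus metrics (requires kubectl access to the cluster).
import subprocess

metrics_text = subprocess.run(
    ["kubectl", "get", "--raw", "/metrics"],
    check=True, capture_output=True, text=True,
).stdout

# Per-webhook admission latency histograms exposed by the API server.
for line in metrics_text.splitlines():
    if line.startswith("apiserver_admission_webhook_admission_duration_seconds"):
        print(line)
```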
EMR on EKS Job driver retries |
Driver retries create an extra K8s Job object, which essentially doubles the number of K8s Job objects in the etcd database. This puts additional strain on etcd, makes the database size grow faster, and hence increases etcd request latency. This in turn results in lower throughput in the number of concurrent EMR on EKS jobs.
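As an illustration, the sketch below submits a job with boto3 and caps driver retries through the StartJobRun retry policy. The retryPolicyConfiguration parameter shape and the release label are assumptions to verify against the current EMR on EKS API documentation; the IDs, ARN, and S3 path are placeholders.

```python
# Sketch: cap driver retries for an EMR on EKS job run (assumed API shape).
import boto3

emr = boto3.client("emr-containers", region_name="us-east-1")

response = emr.start_job_run(
    name="example-job",                              # placeholder values below
    virtualClusterId="<virtual-cluster-id>",
    executionRoleArn="<execution-role-arn>",
    releaseLabel="emr-6.15.0-latest",
    jobDriver={
        "sparkSubmitJobDriver": {"entryPoint": "s3://<bucket>/scripts/job.py"}
    },
    # Allowing more than one attempt adds the extra K8s Job object described
    # above; keep maxAttempts at 1 where automatic driver retries are not needed.
    retryPolicyConfiguration={"maxAttempts": 1},
)
print(response["id"])
```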
EMR on EKS Job Start Timeout Setting |
When the K8s job controller work queue depth grows, the actual Spark driver pod may be delayed in getting created. In the meantime, the EMR on EKS control plane by default expects the job driver pod to be created within 15 minutes. If the driver is not created within that timeout period, the EMR on EKS control plane marks the job as failed preemptively. A higher timeout value gives the job more time to get scheduled and begin running.
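A hedged sketch of raising the timeout through configurationOverrides is below. The emr-containers-defaults classification and job-start-timeout property names are assumptions; verify them against the EMR on EKS documentation for your release label.

```python
# Sketch: extend the job start timeout (value in seconds) via the job's
# configurationOverrides; the classification/property names are assumptions.
configuration_overrides = {
    "applicationConfiguration": [
        {
            "classification": "emr-containers-defaults",
            "properties": {
                "job-start-timeout": "1800",   # allow 30 minutes instead of 15
            },
        }
    ]
}
# Pass configuration_overrides as the configurationOverrides argument of
# start_job_run, as in the earlier sketch.
```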
EMR on EKS Job run time |
A longer EMR on EKS job run time means more concurrent active jobs in the EKS cluster. At a constant job submission rate, long-running EMR on EKS jobs accumulate into a larger number of active concurrent jobs than shorter jobs would. This can lead to more objects in the etcd database, a larger etcd database size, increased etcd request latency, and lower EMR job submission throughput.
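This is Little's Law applied to job concurrency: in steady state, active jobs ≈ submission rate × average run time. A small illustrative calculation with assumed numbers:

```python
# Sketch: steady-state active jobs = submission rate x average run time
# (Little's Law). The numbers are illustrative only.

def active_jobs(jobs_per_minute: float, avg_runtime_minutes: float) -> float:
    return jobs_per_minute * avg_runtime_minutes

print(active_jobs(jobs_per_minute=10, avg_runtime_minutes=10))  # ~100 concurrent jobs
print(active_jobs(jobs_per_minute=10, avg_runtime_minutes=60))  # ~600 concurrent jobs
```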
Number of executors in EMR on EKS Job |
As we define a higher number of executors per EMR on EKS job, the number of objects created on the EKS cluster (pods, ConfigMaps, events, etc.) grows. This becomes a limiting factor for querying the etcd database and eventually causes EKS cluster performance degradation.
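The executor count is set per job through standard Spark properties. A minimal sketch of the sparkSubmitParameters string passed in the job driver is below; the values are illustrative.

```python
# Sketch: pin the executor count per job so the pods, ConfigMaps, and events
# each job creates stay predictable. Passed as sparkSubmitParameters inside
# sparkSubmitJobDriver in start_job_run, as in the earlier sketch.
spark_submit_parameters = (
    "--conf spark.executor.instances=10 "   # 10 executor pods for this job
    "--conf spark.executor.cores=2 "
    "--conf spark.executor.memory=4g"
)
```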
EMR on EKS Job Submission Rate
The EMR on EKS job submission rate dictates the load placed on the API server. A higher EMR on EKS job submission rate results in more K8s objects in the etcd database, a larger etcd database size, increased etcd and API server request latency, and lower EMR job submission throughput.
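If submissions come from your own scheduler, a simple pacing loop keeps the StartJobRun rate, and hence the API server load, bounded. The sketch below assumes each job's StartJobRun keyword arguments are prepared ahead of time; the target rate is illustrative.

```python
# Sketch: pace EMR on EKS job submissions at a fixed rate instead of bursting.
import time
import boto3

emr = boto3.client("emr-containers", region_name="us-east-1")

MAX_JOBS_PER_MINUTE = 20                       # illustrative target rate
SUBMIT_INTERVAL = 60.0 / MAX_JOBS_PER_MINUTE

def submit_all(job_requests: list[dict]) -> list[str]:
    """Each dict holds the keyword arguments for one start_job_run call."""
    job_ids = []
    for request in job_requests:
        job_ids.append(emr.start_job_run(**request)["id"])
        time.sleep(SUBMIT_INTERVAL)            # simple fixed-interval pacing
    return job_ids
```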
EMR Image pull policy |
The default image pull policy for job submitter, driver, and executor pods is Always, which adds latency to pod creation. Unless the customer use case specifically requires it, set the image pull policy to IfNotPresent to reduce pod creation times.
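For the driver and executor images this can be done with the standard Spark on Kubernetes property, for example through the spark-defaults classification. A minimal sketch of the applicationConfiguration entry is below.

```python
# Sketch: reuse images already cached on the node instead of pulling on every
# pod start. Passed via configurationOverrides.applicationConfiguration in
# start_job_run, as in the earlier sketches.
application_configuration = [
    {
        "classification": "spark-defaults",
        "properties": {
            "spark.kubernetes.container.image.pullPolicy": "IfNotPresent",
        },
    }
]
```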
EMR Job type |
Job concurrency values differ between batch and streaming job types. Streaming jobs usually consume fewer resources, resulting in higher job concurrency values compared to batch jobs.
EKS K8s control plane scaling |
EKS autoscales the K8s control plane, including the API server and etcd instances, for the customer's EKS cluster based on resource consumption. To successfully run a larger number of concurrent EMR on EKS jobs, the API server needs to be scaled up to handle the extra load. However, if factors like webhook latency inhibit the metrics the EKS API server autoscaler relies on, the control plane may not scale up properly. This impacts the health of the API server and etcd and leads to lower throughput of successfully completed jobs.
EKS etcd db size |
As you submit more concurrently running EMR on EKS jobs, the number of K8s objects stored in the etcd database grows, and the etcd database size grows with it. An increased etcd database size causes latency in API server requests that require cluster-wide or namespace-wide etcd read calls, and reduces EMR job submission throughput. The upper bound on etcd database size is 8 GB, as specified by EKS, and reaching this capacity can put the EKS cluster into read-only mode. Customers should monitor their etcd database size and keep it within limits; we recommend keeping it below 7 GB.
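A hedged sketch for tracking the database size from the API server metrics endpoint is below; the exact metric name depends on the Kubernetes version, so several known names are checked.

```python
# Sketch: track etcd database size from the API server metrics and compare it
# against the 8 GB limit / 7 GB recommendation (requires kubectl access).
import subprocess

metrics_text = subprocess.run(
    ["kubectl", "get", "--raw", "/metrics"],
    check=True, capture_output=True, text=True,
).stdout

for line in metrics_text.splitlines():
    # Metric name varies with the Kubernetes version.
    if line.startswith(("apiserver_storage_size_bytes",
                        "apiserver_storage_db_total_size_in_bytes",
                        "etcd_db_total_size_in_bytes")):
        print(line)
```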
EKS VPC Subnets IP pool |
The available IP addresses in the VPC subnets configured for the EKS cluster also impact EMR on EKS job throughput. Each pod needs to be assigned an IP address, so it is essential to have a large enough IP address pool available in the cluster's VPC subnets to achieve higher pod concurrency. Exhaustion of IP addresses causes new pod creation requests to fail.
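A quick way to watch for IP exhaustion is to poll the available address count on the cluster's subnets. The sketch below assumes the subnets carry the usual kubernetes.io/cluster/<cluster-name> tag; adjust the filter to however your subnets are identified.

```python
# Sketch: report remaining IP addresses in the subnets used by the EKS cluster.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

subnets = ec2.describe_subnets(
    Filters=[{
        "Name": "tag:kubernetes.io/cluster/my-cluster",  # hypothetical tag/cluster name
        "Values": ["shared", "owned"],
    }]
)["Subnets"]

for subnet in subnets:
    print(subnet["SubnetId"], subnet["AvailableIpAddressCount"], "free IPs")
```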
EKS Cluster version |
EKS has made improvements in cluster versions higher than 1.28, resulting in higher job throughput for EMR on EKS jobs. These recommendations are based on EKS cluster version 1.30.
Pod template size |
Large pod templates, for example from a high number of sidecar containers, init containers, or numerous environment variables, increase usage of the etcd database. This increased usage can limit the number of pods that can be stored in etcd and may impact the cluster's overall performance, including the rate at which EMR jobs can be submitted.
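To see how heavy your pod specs are in practice, you can serialize live pod objects and compare their sizes. The sketch below uses the Kubernetes Python client; the namespace is a placeholder, and the JSON length is only a rough proxy since etcd stores objects in protobuf.

```python
# Sketch: rough per-pod object size, as a proxy for its etcd footprint.
import json
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()
serializer = client.ApiClient()

for pod in v1.list_namespaced_pod("emr-jobs").items:     # placeholder namespace
    size = len(json.dumps(serializer.sanitize_for_serialization(pod)))
    print(pod.metadata.name, size, "bytes (approx.)")
```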