Workload Architecture

Use multiple Availability Zones

AWS recommends that your workloads use two or more Availability Zones (AZs) for improved reliability. This is a best practice in the Reliability Pillar of the AWS Well-Architected Framework. Configuring your load balancers to use multiple AZs lets them benefit from the fault isolation that separate partitions of AWS infrastructure provide.

By setting up your load balancers to use multiple AZs, you enhance fault tolerance. If one Availability Zone becomes unavailable or has fewer healthy targets than the DNS failover threshold (default = 1), the load balancer performs a DNS failover to automatically route traffic to healthy targets in other Availability Zones. This helps your application remain accessible even when one of the AZs has an issue.
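The threshold comparison described above can be sketched as follows. This is a simplified model for illustration, not the load balancer's actual implementation: the function name, the input dictionary, and the zone names are hypothetical, and the sketch deliberately ignores what happens when every zone falls below the threshold.

```python
def zones_kept_in_dns(healthy_targets_by_az, min_healthy_count=1):
    """Return the AZs whose healthy-target count meets the failover threshold.

    healthy_targets_by_az: hypothetical mapping of AZ name -> healthy target count.
    min_healthy_count: the DNS failover threshold (the text gives a default of 1).
    """
    return sorted(
        az
        for az, healthy in healthy_targets_by_az.items()
        if healthy >= min_healthy_count
    )


# Hypothetical health snapshot: one zone has lost all of its healthy targets.
snapshot = {"us-east-1a": 3, "us-east-1b": 0, "us-east-1c": 2}
print(zones_kept_in_dns(snapshot))  # ['us-east-1a', 'us-east-1c']
```

With the default threshold of 1, a zone only drops out once it has no healthy targets left; raising the threshold moves traffic away earlier, at the cost of putting more load on the remaining zones.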

Best Practice

[ALB, NLB] Configure your load balancers to use at least two Availability Zones.

Equally important, all AZs in use by your ELB should have targets registered in them. This reduces the chance of impact during an AZ impairment.

Best Practice

[ALB, NLB] All target groups should have targets registered in all Availability Zones configured in the load balancer.
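A quick way to audit this practice is to compare the zones enabled on the load balancer with the zones that actually contain registered targets. The sketch below assumes you have already collected both lists by whatever means you prefer; the function name and the sample data are hypothetical.

```python
def zones_without_targets(enabled_azs, target_azs):
    """Return the enabled AZs that have no registered target.

    enabled_azs: AZ names enabled on the load balancer.
    target_azs: one AZ name per registered target (duplicates expected).
    """
    return sorted(set(enabled_azs) - set(target_azs))


# Hypothetical inventory: three zones enabled, but targets in only two of them.
enabled = ["us-east-1a", "us-east-1b", "us-east-1c"]
targets = ["us-east-1a", "us-east-1a", "us-east-1c"]
print(zones_without_targets(enabled, targets))  # ['us-east-1b']
```

An empty result means every enabled zone has at least one registered target.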

References and Further Information

Deploy the workload to multiple locations

Regions and Availability Zones

Using load balancer target group health thresholds to improve availability

Availability Zone Independence (AZI)

To improve the availability of your load balancer when a specific Availability Zone (AZ) is impaired, it is important to shift traffic away from the impacted AZ. A successful zonal evacuation strategy requires Availability Zone Independence (AZI), which in turn requires cross-zone load balancing to be turned off.

Best Practice

[ALB, NLB] Turn off cross-zone load balancing to achieve Availability Zone Independence (AZI).
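As a minimal sketch, cross-zone load balancing can be controlled per target group through the `load_balancing.cross_zone.enabled` target group attribute. The helper below only builds the attribute list; the helper name is hypothetical, and the commented call assumes a boto3 `elbv2` client and a target group ARN that you supply.

```python
def cross_zone_attributes(enabled):
    """Build the target group attribute list that sets cross-zone load balancing."""
    value = "true" if enabled else "false"
    return [{"Key": "load_balancing.cross_zone.enabled", "Value": value}]


# Turning cross-zone load balancing off for one target group.
attributes = cross_zone_attributes(enabled=False)
print(attributes)

# Assumed usage with boto3 (not executed here; the ARN is a placeholder you provide):
#   import boto3
#   elbv2 = boto3.client("elbv2")
#   elbv2.modify_target_group_attributes(
#       TargetGroupArn=target_group_arn,
#       Attributes=attributes,
#   )
```

Review the considerations in this section before applying the change, because it alters how traffic is distributed across targets.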

ELB - Cross-zone load balancing

Considerations

Turning off cross-zone load balancing can lead to uneven traffic distribution at the target level. To prevent this, ensure you have an equal number of targets in each Availability Zone.
With cross-zone load balancing turned off, it's crucial to prepare for potential zonal evacuations by ensuring sufficient target capacity in the remaining AZ(s) to manage the traffic. This can be achieved using either an Auto Scaling or a static stability strategy. If you can't plan for enough capacity across all participating Availability Zones, the recommendation is that you keep cross-zone load balancing on.
Alternatively, you could keep cross-zone load balancing turned on, opting to fail away from an AZ at the load balancer level. In this case, implementing a robust health check strategy is recommended to ensure that traffic isn't directed to unhealthy targets in the affected zone.
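To make the uneven-distribution consideration concrete, the following sketch computes the share of total traffic that each target receives. It is a simplified model that assumes clients spread evenly across the load balancer nodes in each zone and that every zone has at least one target; the function name and zone names are hypothetical.

```python
from fractions import Fraction


def per_target_share(targets_by_az, cross_zone):
    """Return the fraction of total traffic each target in an AZ receives.

    targets_by_az: mapping of AZ name -> number of registered targets.
    cross_zone: True if cross-zone load balancing is on, False if it is off.
    """
    if any(count < 1 for count in targets_by_az.values()):
        raise ValueError("this model assumes at least one target per AZ")
    total_targets = sum(targets_by_az.values())
    zone_count = len(targets_by_az)
    shares = {}
    for az, count in targets_by_az.items():
        if cross_zone:
            # Every target is eligible for traffic from every zone.
            shares[az] = Fraction(1, total_targets)
        else:
            # Each zone keeps its own slice of traffic and splits it locally.
            shares[az] = Fraction(1, zone_count) / count
    return shares


fleet = {"az-a": 2, "az-b": 8}
print(per_target_share(fleet, cross_zone=True))   # each target gets 1/10
print(per_target_share(fleet, cross_zone=False))  # az-a targets get 1/4, az-b targets 1/16
```

In this example, turning cross-zone load balancing off makes each target in the smaller zone carry four times the load of a target in the larger zone, which is why an equal number of targets per zone matters.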

References and Further Information

Availability Zone independence

Cross-zone load balancing

Elastic Load Balancing - Zonal Shift

ALB - Cross-zone load balancing for target groups

NLB - Cross-zone load balancing for target groups

Static Stability

Static stability is a design pattern focused on resilience, specifically on ensuring that a system is ready to withstand a partial failure. For load balancers and Availability Zones (AZs), this entails having enough target capacity to handle an impairment in any single AZ. Essentially, this involves overprovisioning targets in each AZ, which reduces reliance on scaling activities to maintain availability during AZ disruptions. When deciding how much to overprovision, plan for one Availability Zone's worth of extra capacity spread across all zones, so that when a zonal shift happens, the remaining zones can handle the load of the lost zone.

Best Practice

[ALB, NLB] For high availability needs, consider adopting the static stability pattern.

Considerations

Using more Availability Zones (AZs) in your load balancer and targets can lead to a more efficient use of capacity under normal conditions. Consider this scenario: to achieve static stability with 2 AZs, you must provision double (200%) the required capacity. This ensures that if one AZ is impaired, you still have the full 100% capacity you need. However, if you configure 3 AZs, you only need to provision 150% of the needed capacity, because losing one AZ still leaves you with the necessary 100%.
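The arithmetic behind these figures generalizes: if capacity is spread evenly over N zones and the workload must survive the loss of one zone, the remaining N - 1 zones have to carry 100% of the load, so the total to provision is N / (N - 1) of the required capacity. A minimal sketch, with a hypothetical function name and assuming an even spread across zones:

```python
from fractions import Fraction


def static_stability_factor(az_count):
    """Return total capacity to provision, as a multiple of required capacity,
    so that losing any single AZ still leaves 100% of what is needed."""
    if az_count < 2:
        raise ValueError("static stability across AZs needs at least two AZs")
    return Fraction(az_count, az_count - 1)


for az_count in (2, 3, 4):
    factor = static_stability_factor(az_count)
    print(f"{az_count} AZs: provision {float(factor):.0%} of required capacity")
# 2 AZs: provision 200% of required capacity
# 3 AZs: provision 150% of required capacity
# 4 AZs: provision 133% of required capacity
```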

Note

Overprovisioning resources will raise the operational cost of the system. AWS users should weigh these costs against their workload's availability targets before deciding on this approach.

References and Further Information

Static stability using Availability Zones

AWS Fault Isolation Boundaries - Static stability

Use static stability to prevent bimodal behavior

AWS re:Invent 2023 - Enhance your app’s security & availability with Elastic Load Balancing

Use AWS Global Accelerator for workloads deployed in multiple Regions

AWS Global Accelerator is a networking service that delivers traffic from clients via the AWS global network to your Application Load Balancer or Network Load Balancer. By using Anycast IP addresses, clients are routed to the nearest AWS edge location, and traffic is delivered across the AWS backbone network, avoiding congested internet links and providing lower latency with less variation. You can configure multiple Regions as destinations at the same time, so that users reach your workloads with the lowest possible latency while you gain high availability and resiliency. AWS Global Accelerator continuously monitors the health of your application endpoints and automatically re-routes traffic to healthy endpoints in case of failures, which reduces downtime for your applications.

AWS Global Accelerator is beneficial if your application is deployed across AWS Regions and you are using more than one load balancer for redundancy.

Global Accelerator and ELBs

Image: AWS Global Accelerator directing traffic towards redundant applications in three separate AWS Regions, each served by an Application Load Balancer.

Best Practice

[ALB, NLB] For multi-region deployments, consider using AWS Global Accelerator with your load balancer.

References and Further Information

AWS Global Accelerator

Improving availability and performance for Application Load Balancers using one-click integration with AWS Global Accelerator

Add an accelerator when you create a load balancer

Isolate applications

Hosting multiple workloads in a single load balancer can amplify the blast radius of configuration and scaling issues. It can also increase the complexity of compliance and change management processes. This is especially true when these workloads are managed by different teams, have different availability goals, or have different risk profiles.

Best Practice

[ALB, NLB] Avoid using a single load balancer for multiple workloads.

Considerations

Keep in mind that each load balancer incurs an hourly charge, and managing several of them can also raise your overall operational costs.

References and Further Information

Guidance for Workload Isolation on AWS

Organizing Your AWS Environment Using Multiple Accounts