- Elastic Load Balancing & Auto Scaling Groups
- Scalability: Ability of a system to handle an increase in load by adapting to the demand.
- High Availability: Ensures a system is operational and accessible for a high percentage of time, often achieved by reducing the impact of failures.
- There are two kinds of scalability:
- Vertical Scalability
- Horizontal Scalability (= elasticity)
- Scalability is linked but different to High Availability
- Increasing the capacity of a single instance (e.g., moving from t3.medium to t3.large).
- Suitable for databases or applications where upgrading a single resource is more efficient.
- Limited by hardware constraints (can only scale up to a certain point).
- Adding more instances (servers) to distribute the load across multiple resources.
- Achieved through technologies like Auto Scaling Groups (ASG) and Elastic Load Balancing (ELB).
- Preferred for applications needing resilience and distributed workloads.
- Horizontal scaling implies distributed systems.
- Implemented by deploying resources across multiple Availability Zones (AZs).
- Ensures failover and redundancy in case of failures in one AZ.
- High Availability usually goes hand in hand with horizontal scaling
- Vertical Scaling: Increase instance size (= scale up / down)
- From: t2.nano - 0.5G of RAM, 1 vCPU
- To: u-12tb1.metal – 12.3 TB of RAM, 448 vCPUs
- Horizontal Scaling: Increase number of instances (= scale out / in)
- Auto Scaling Group
- Load Balancer
- High Availability: Run instances for the same application across multi AZ
- Auto Scaling Group multi AZ
- Load Balancer multi AZ
Term | Definition |
---|---|
Scalability | Ability to increase or decrease the capacity to handle varying levels of traffic or load. |
Elasticity | Automatically adjusts resources up or down based on the load in real-time, preventing under or over-provisioning. |
Agility | The ability to deploy and manage resources quickly and efficiently in response to changing demands. |
- Distributes incoming traffic across multiple targets (EC2 instances, containers, IP addresses) to ensure that no single resource is overwhelmed.
- Ensures application fault tolerance and high availability by spreading the load across multiple servers.
- Protects against failures in a single resource by rerouting traffic automatically.
- Do regular health checks to your instances
- Provide SSL termination (HTTPS) for your websites
- Elastic Load Balancer (ELB) is a fully managed service that automatically distributes incoming application traffic across multiple targets in one or more Availability Zones.
- It improves fault tolerance, enhances performance, and scales according to demand.
- AWS guarantees that it will be working
- AWS takes care of upgrades, maintenance, high availability
- AWS provides only a few configuration knobs
- Application Load Balancer (ALB): For HTTP and HTTPS traffic, operates at Layer 7 (application level).
- Network Load Balancer (NLB): Handles high-performance traffic at Layer 4 (transport level).
- Classic Load Balancer: (slowly retiring) – Layer 4 & 7
- An Auto Scaling Group (ASG) ensures the right number of EC2 instances are running to handle the load.
- Automatically adjusts the number of instances based on metrics such as CPU utilization or custom-defined thresholds.
- Can span across multiple AZs to ensure high availability.
- In real-life, the load on your websites and application can change
- In the cloud, you can create and get rid of servers very quickly
- The goal of an Auto Scaling Group (ASG) is to:
- Scale out (add EC2 instances) to match an increased load
- Scale in (remove EC2 instances) to match a decreased load
- Ensure we have a minimum and a maximum number of machines running
- Automatically register new instances to a load balancer
- Replace unhealthy instances
- Cost Savings: only run at an optimal capacity (principle of the cloud)
- Manual Scaling: Adjusting the number of instances manually based on load prediction.
- Dynamic Scaling: Automatically adjusts the number of instances based on demand (e.g., CPU usage).
- Simple / Step Scaling
- When a CloudWatch alarm is triggered (example CPU > 70%), then add 2 units
- When a CloudWatch alarm is triggered (example CPU < 30%), then remove 1
- Target Tracking Scaling
- Example: I want the average ASG CPU to stay at around 40%
- Scheduled Scaling
- Anticipate a scaling based on known usage patterns
- Example: increase the min. capacity to 10 at 5 pm on Fridays
- Simple / Step Scaling
- Predictive Scaling: Uses machine learning to predict future traffic patterns and scales proactively.
- High Availability vs Scalability (vertical and horizontal) vs Elasticity vs Agility in the Cloud
- Elastic Load Balancers (ELB)
- Distribute traffic across backend EC2 instances, can be Multi-AZ
- Supports health checks
- 3 types: Application LB (HTTP – L7), Network LB (TCP – L4), Classic LB (old)
- Auto Scaling Groups (ASG)
- Implement Elasticity for your application, across multiple AZ
- Scale EC2 instances based on the demand on your system, replace unhealthy
- Integrated with the ELB