Auto Scaling Policies: Target Tracking vs Step vs Scheduled - When to Use What?
Target Tracking auto-maintains target metrics, Step Scaling responds by severity, Scheduled Scaling handles predictable patterns. Master Auto Scaling policy selection for SAA-C03.
Related Exam Domains
- Domain 2: Design Resilient Architectures
- Domain 3: Design High-Performing Architectures
Key Takeaway
Use Target Tracking for most cases. Set a target metric (e.g., CPU 50%) and it automatically adjusts instance count. Step Scaling for severity-based responses, Scheduled Scaling for predictable patterns, Predictive Scaling for ML-based pre-scaling.
Exam Tip
Exam Essential: Target Tracking = Recommended + automatic, Step = Staged + CloudWatch alarms, Scheduled = Reserved + known patterns, Predictive = ML + pre-scaling
What is an Auto Scaling Group?
An Auto Scaling Group (ASG) is a collection of EC2 instances managed as a single unit, typically launched from the same launch template, whose instance count is adjusted automatically as demand changes.
Auto Scaling Group Structure:
┌──────────────────────────────────────────┐
│            Auto Scaling Group            │
│                                          │
│    Min: 2  │  Desired: 4  │  Max: 10     │
│                                          │
│  ┌───┐ ┌───┐ ┌───┐ ┌───┐                 │
│  │EC2│ │EC2│ │EC2│ │EC2│  ← Current: 4   │
│  └───┘ └───┘ └───┘ └───┘                 │
└──────────────────────────────────────────┘
                     ↑
  Scaling policy determines instance count
Key ASG Settings
| Setting | Description |
|---|---|
| Minimum Capacity (Min) | Minimum instances always maintained |
| Desired Capacity | Current target instance count |
| Maximum Capacity (Max) | Maximum instances allowed |
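The relationship between the three settings can be illustrated with a short Python sketch. The function name `clamp_desired` is hypothetical and only models the rule that a requested capacity never leaves the Min/Max range; it is not part of any AWS SDK.

```python
def clamp_desired(requested: int, min_size: int, max_size: int) -> int:
    """Keep a requested desired capacity inside the group's [min, max] bounds."""
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")
    return max(min_size, min(requested, max_size))


# Using the Min: 2 / Max: 10 group from the diagram above.
print(clamp_desired(4, 2, 10))   # 4  (already within bounds)
print(clamp_desired(14, 2, 10))  # 10 (capped at Max)
print(clamp_desired(0, 2, 10))   # 2  (raised to Min)
```

Whatever a scaling policy asks for, the group's actual instance count stays between Min and Max.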
Auto Scaling Policy Types at a Glance
| Policy Type | How It Works | Best For | Complexity |
|---|---|---|---|
| Target Tracking | Auto-maintains target metric | Most cases (recommended) | Low |
| Step Scaling | Staged response by alarm severity | Fine-grained control | Medium |
| Simple Scaling | Single adjustment + cooldown | Simple requirements | Low |
| Scheduled Scaling | Adjusts at scheduled times | Predictable patterns | Low |
| Predictive Scaling | ML-based pre-scaling | Cyclical traffic patterns | Low |
Target Tracking Scaling
How It Works
Target Tracking works like a thermostat. Set a target metric (e.g., CPU 50%) and Auto Scaling automatically adjusts instance count to maintain that value.
Target Tracking Example:
Target: CPU utilization 50%
Current CPU 70% → Add instances (scale out)
Current CPU 30% → Remove instances (scale in)
Current CPU 50% → Maintain
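To get a feel for the thermostat behavior, here is a rough Python sketch. AWS does not publish the exact algorithm, so this assumes a simple proportional model in which total load stays constant and spreads evenly across instances. The function name `proportional_capacity` is hypothetical.

```python
import math


def proportional_capacity(current_instances: int,
                          current_metric: float,
                          target_metric: float) -> int:
    """Estimate the instance count that brings the average metric back to target.

    Simplifying assumption: total load is constant and evenly distributed,
    so the average metric scales inversely with instance count.
    """
    if current_instances <= 0 or target_metric <= 0:
        raise ValueError("instance count and target must be positive")
    return math.ceil(current_instances * current_metric / target_metric)


# Four instances with a 50% CPU target, matching the example above.
print(proportional_capacity(4, 70, 50))  # 6 -> scale out
print(proportional_capacity(4, 30, 50))  # 3 -> scale in
print(proportional_capacity(4, 50, 50))  # 4 -> maintain
```

Rounding up means the estimate errs toward more capacity, which mirrors the availability-first behavior described in this post.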
Supported Metrics
Predefined Metrics:
- ASGAverageCPUUtilization - Average CPU utilization
- ASGAverageNetworkIn - Average network in
- ASGAverageNetworkOut - Average network out
- ALBRequestCountPerTarget - ALB requests per target
Custom Metrics:
- Application-specific CloudWatch metrics supported
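As a sketch of how a predefined metric ends up in a policy definition, the Python helper below builds the keyword arguments for boto3's `put_scaling_policy` call. The helper name `target_tracking_policy_kwargs`, the group name `web-asg`, and the policy name are hypothetical. The helper only builds a dictionary and does not call AWS.

```python
PREDEFINED_METRICS = {
    "ASGAverageCPUUtilization",
    "ASGAverageNetworkIn",
    "ASGAverageNetworkOut",
    "ALBRequestCountPerTarget",
}


def target_tracking_policy_kwargs(asg_name, policy_name, metric_type,
                                  target_value, resource_label=None):
    """Build keyword arguments for a target tracking policy on a predefined metric."""
    if metric_type not in PREDEFINED_METRICS:
        raise ValueError(f"unknown predefined metric: {metric_type}")
    spec = {"PredefinedMetricType": metric_type}
    if metric_type == "ALBRequestCountPerTarget":
        # This metric also needs a label identifying the load balancer and target group.
        if resource_label is None:
            raise ValueError("ALBRequestCountPerTarget requires a resource_label")
        spec["ResourceLabel"] = resource_label
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": policy_name,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": spec,
            "TargetValue": float(target_value),
        },
    }


kwargs = target_tracking_policy_kwargs(
    "web-asg", "cpu-50", "ASGAverageCPUUtilization", 50)
print(kwargs["TargetTrackingConfiguration"]["TargetValue"])  # 50.0
```

With boto3 installed and credentials configured, the result could be passed as `boto3.client("autoscaling").put_scaling_policy(**kwargs)`.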
Target Tracking Advantages
- Simple setup: Just specify target value
- Auto CloudWatch alarms: Alarms auto-created/managed
- Conservative scale-in: Prioritizes availability
- Adaptive adjustment: Auto-adapts to traffic patterns
Exam Tip
Exam Point: Target Tracking is conservative during scale-in. Removes instances gradually to prioritize availability during traffic fluctuations.
When Target Tracking is Suitable
- Most web applications
- CPU/network-based workloads
- Applications behind ALB
- Need simple, effective scaling
Step Scaling
How It Works
Step Scaling applies differently sized adjustments depending on how far the metric breaches the alarm threshold. You create the CloudWatch alarm yourself, then define a step adjustment for each breach range.
Step Scaling Example:
CloudWatch Alarm: CPU utilization
CPU 50-60% → Add 2 instances
CPU 60-75% → Add 4 instances
CPU 75-90% → Add 6 instances
CPU > 90% → Add 10 instances
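The step table above amounts to a range lookup, sketched here in Python. The `STEPS` table and `step_adjustment` function are hypothetical names. Treating each lower bound as inclusive and each upper bound as exclusive is an assumption made so the ranges do not overlap.

```python
# (lower bound inclusive, upper bound exclusive or None for open-ended, instances to add)
STEPS = [
    (50, 60, 2),
    (60, 75, 4),
    (75, 90, 6),
    (90, None, 10),
]


def step_adjustment(cpu: float) -> int:
    """Return how many instances to add for a given CPU utilization."""
    for lower, upper, instances_to_add in STEPS:
        if cpu >= lower and (upper is None or cpu < upper):
            return instances_to_add
    return 0  # below the first step: the alarm is not in breach


print(step_adjustment(55))  # 2
print(step_adjustment(82))  # 6
print(step_adjustment(95))  # 10
print(step_adjustment(40))  # 0
```

The further the metric is from the threshold, the larger the adjustment, which is the fine-grained control Target Tracking does not expose.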
Step Scaling vs Target Tracking
| Aspect | Target Tracking | Step Scaling |
|---|---|---|
| Alarm Management | Auto | Manual |
| Adjustment Size | Auto-calculated | Staged specification |
| Scale-in | Conservative (gradual) | Immediate |
| Setup Complexity | Low | Medium |
| Fine Control | Limited | Available |
When Step Scaling is Suitable
- Different responses needed based on load magnitude
- Need fast scale-in
- Require fine-grained adjustment control
- Mixed with Target Tracking (advanced)
Simple Scaling
How It Works
Simple Scaling makes one fixed adjustment when its alarm triggers, then blocks further adjustments until the cooldown period ends.
Simple Scaling Flow:
Alarm triggers → Add 2 instances → Cooldown (300 sec) → Can scale again
                                           ↑
                           Scaling blocked during this period
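The flow above can be modeled with a small Python class. This is a simplified sketch, not AWS behavior in full: the class name `SimpleScalingPolicy` is hypothetical, time is passed in as plain seconds, and the cooldown is assumed to start at the moment of the adjustment.

```python
class SimpleScalingPolicy:
    """Fixed adjustment per alarm, with alarms ignored during cooldown."""

    def __init__(self, adjustment: int = 2, cooldown_seconds: int = 300):
        self.adjustment = adjustment
        self.cooldown_seconds = cooldown_seconds
        self._last_scaled_at = None  # no scaling activity yet

    def on_alarm(self, now_seconds: int, capacity: int) -> int:
        """Return the new capacity after an alarm fires at now_seconds."""
        in_cooldown = (
            self._last_scaled_at is not None
            and now_seconds - self._last_scaled_at < self.cooldown_seconds
        )
        if in_cooldown:
            return capacity  # alarm ignored, capacity unchanged
        self._last_scaled_at = now_seconds
        return capacity + self.adjustment


policy = SimpleScalingPolicy()
print(policy.on_alarm(0, 4))    # 6 (scales out)
print(policy.on_alarm(120, 6))  # 6 (still in cooldown, blocked)
print(policy.on_alarm(300, 6))  # 8 (cooldown over, scales again)
```

The second alarm at 120 seconds is dropped even though load is still high, which is the limitation the next section describes.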
Simple Scaling Limitations
- Cannot respond to traffic spikes during cooldown
- Less flexible than Step Scaling
- AWS recommends Step Scaling or Target Tracking
Exam Tip
Exam Point: Simple Scaling has a cooldown period preventing consecutive adjustments. Use Step Scaling for faster response.
Scheduled Scaling
How It Works
Scheduled Scaling adjusts capacity at predetermined times based on predictable traffic patterns.
Scheduled Scaling Example:
Mon-Fri:
08:00 → Increase min capacity to 10 (work start)
20:00 → Decrease min capacity to 2 (work end)
Sat-Sun:
All day → Maintain min capacity at 2
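The weekly schedule above can be read as a lookup from a point in time to a minimum capacity. The Python sketch below encodes exactly those example values. The function name `scheduled_min_capacity` is hypothetical.

```python
from datetime import datetime


def scheduled_min_capacity(when: datetime) -> int:
    """Minimum capacity for the example schedule: 10 on weekdays 08:00-20:00, else 2."""
    is_weekday = when.weekday() < 5  # Monday is 0, Friday is 4
    if is_weekday and 8 <= when.hour < 20:
        return 10
    return 2


print(scheduled_min_capacity(datetime(2024, 1, 8, 9, 0)))    # 10 (Monday morning)
print(scheduled_min_capacity(datetime(2024, 1, 8, 21, 0)))   # 2  (Monday evening)
print(scheduled_min_capacity(datetime(2024, 1, 13, 12, 0)))  # 2  (Saturday)
```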
When Scheduled Scaling is Suitable
- Fixed business hour patterns
- Planned marketing events
- Batch processing time slots
- Known peak traffic times
Configuration Example
Schedule: Daily 8:00 AM
Min: 5, Max: 20, Desired: 10
Schedule: Daily 8:00 PM
Min: 2, Max: 10, Desired: 2
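As a sketch, the two schedules above could be expressed as keyword arguments for boto3's `put_scheduled_update_group_action` call. The helper `scheduled_action_kwargs`, the group name `web-asg`, and the action names are hypothetical. The helper only builds dictionaries and does not call AWS.

```python
def scheduled_action_kwargs(asg_name, action_name, cron,
                            min_size, max_size, desired):
    """Build keyword arguments for one recurring scheduled action."""
    if not min_size <= desired <= max_size:
        raise ValueError("desired capacity must lie between min and max")
    return {
        "AutoScalingGroupName": asg_name,
        "ScheduledActionName": action_name,
        # Cron fields: minute hour day-of-month month day-of-week.
        # Evaluated in UTC unless a TimeZone argument is also supplied.
        "Recurrence": cron,
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
    }


morning = scheduled_action_kwargs("web-asg", "scale-up-8am", "0 8 * * *", 5, 20, 10)
evening = scheduled_action_kwargs("web-asg", "scale-down-8pm", "0 20 * * *", 2, 10, 2)
print(morning["Recurrence"], morning["DesiredCapacity"])  # 0 8 * * * 10
print(evening["Recurrence"], evening["DesiredCapacity"])  # 0 20 * * * 2
```

With boto3 installed and credentials configured, each dictionary could be passed as `boto3.client("autoscaling").put_scheduled_update_group_action(**morning)`.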
Predictive Scaling
How It Works
Predictive Scaling uses machine learning to analyze historical traffic patterns, forecast future capacity needs, and scale out ahead of the predicted demand.
Predictive Scaling Flow:
Analyze past 2 weeks of data
↓
Learn daily/weekly patterns
↓
Predict demand for next 2 days
↓
Pre-scale before traffic increase
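The idea of learning a daily pattern and sizing capacity ahead of time can be illustrated with a deliberately naive Python sketch. It is not the ML model AWS uses: it just averages past load per hour of day and divides by an assumed per-instance capacity. The function name `hourly_forecast` and the sample numbers are hypothetical.

```python
import math
from collections import defaultdict


def hourly_forecast(history, per_instance_capacity):
    """Forecast instances needed per hour of day from (hour, load) samples.

    Naive stand-in for a learned model: the average historical load for each
    hour is divided by what one instance is assumed to handle.
    """
    loads_by_hour = defaultdict(list)
    for hour, load in history:
        loads_by_hour[hour].append(load)
    return {
        hour: math.ceil(sum(loads) / len(loads) / per_instance_capacity)
        for hour, loads in sorted(loads_by_hour.items())
    }


# Two days of samples: quiet at 03:00, busy at 09:00 (requests per minute).
history = [(3, 100), (9, 900), (3, 140), (9, 1100)]
print(hourly_forecast(history, per_instance_capacity=250))  # {3: 1, 9: 4}
```

A forecast like this lets capacity be raised shortly before 09:00 instead of after the load has already arrived.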
Predictive Scaling Features
- Historical data required: Minimum 24 hours (recommended: 14 days)
- Prediction cycle: Forecasts the next 48 hours and refreshes the forecast periodically (every 6 hours for EC2 Auto Scaling predictive scaling policies)
- Combine with dynamic scaling: Can use with Target Tracking
When Predictive Scaling is Suitable
- Cyclical traffic patterns (business hours, weekend patterns)
- Repetitive on/off workloads
- Applications with long initialization time
- Need to minimize latency through pre-scaling
Exam Tip
Exam Point: Predictive Scaling is a poor fit for brand-new applications. It requires at least 24 hours of historical data, and prediction accuracy suffers until more history accumulates.
Combining Scaling Policies
Recommended Combinations
1. Target Tracking + Predictive Scaling
Predictive: Pre-scale for predicted traffic
Target Tracking: Handle actual load that differs from prediction
2. Target Tracking + Scheduled Scaling
Scheduled: Increase capacity for known events
Target Tracking: Additional adjustment based on actual load during event
3. Target Tracking (scale-out) + Step Scaling (scale-in)
Target Tracking: Smooth expansion on load increase
Step Scaling: Fast reduction on load decrease
Multiple Policy Conflict Behavior
Scale-out: Policy providing largest capacity wins
Scale-in: Policy maintaining most instances wins
→ Always prioritizes availability
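The conflict rule above reduces to taking the largest proposed capacity, as this Python sketch shows. The function name `resolve_capacity` and the policy labels are hypothetical.

```python
def resolve_capacity(proposals: dict, min_size: int, max_size: int) -> int:
    """Pick the capacity to apply when several policies propose different values.

    The largest proposal wins for both scale-out and scale-in, then the
    result is kept within the group's min/max bounds.
    """
    if not proposals:
        raise ValueError("at least one policy proposal is required")
    chosen = max(proposals.values())
    return max(min_size, min(chosen, max_size))


# Scale-out from 6 instances: the bigger increase wins.
print(resolve_capacity({"target_tracking": 8, "step": 10}, 2, 20))  # 10
# Scale-in from 6 instances: the smaller reduction wins.
print(resolve_capacity({"target_tracking": 5, "step": 3}, 2, 20))   # 5
```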
Scenario-Based Policy Selection
Scenario 1: General Web Application
Requirement: CPU-based auto scaling
Recommended: Target Tracking (CPU 50%)
Reason: Simple and effective, sufficient for most cases
Scenario 2: E-commerce Black Friday
Requirement: Massive traffic spike for scheduled event
Recommended: Scheduled Scaling + Target Tracking
Reason: Pre-scale before event + respond to actual load
Scenario 3: Call Center Application
Requirement: Business hours (9-6) traffic pattern
Recommended: Predictive Scaling + Target Tracking
Reason: ML learns pattern + handles exceptions
Scenario 4: Game Server (Rapid Load Changes)
Requirement: Fast response based on load magnitude
Recommended: Step Scaling
Reason: Different adjustment sizes by stage possible
Scenario 5: Batch Processing
Requirement: Large-scale processing daily at 2 AM
Recommended: Scheduled Scaling
Reason: Scale up before processing starts, scale down after completion
ASG Health Checks
Auto Scaling automatically replaces unhealthy instances.
Health Check Types
| Type | Checks | When to Use |
|---|---|---|
| EC2 | EC2 instance status | Default |
| ELB | Load balancer health check | Recommended with ELB |
| Custom | External health check system | Advanced scenarios |
Exam Tip
Exam Point: When the ASG sits behind an ELB, enable the ELB health check. The EC2 health check alone cannot detect application-level failures.
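The difference between the two check types can be sketched as a small Python decision function. The name `is_healthy` and its boolean inputs are hypothetical simplifications of what the ASG evaluates.

```python
def is_healthy(ec2_status_ok: bool, elb_check_ok: bool,
               health_check_type: str) -> bool:
    """Decide whether the ASG would keep an instance under a given check type."""
    if health_check_type == "EC2":
        # Only the instance status matters; the application is not probed.
        return ec2_status_ok
    if health_check_type == "ELB":
        # The instance must pass both the EC2 status check and the ELB check.
        return ec2_status_ok and elb_check_ok
    raise ValueError(f"unsupported health check type: {health_check_type}")


# The instance is running, but the application fails the load balancer check.
print(is_healthy(True, False, "EC2"))  # True  (problem goes unnoticed)
print(is_healthy(True, False, "ELB"))  # False (instance gets replaced)
```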
Cost Optimization Tips
1. Set Appropriate Minimum Capacity
- Minimum instances to handle normal load
- Too high = wasted cost, too low = startup delay
2. Mix Spot Instances
- Mixed instance policy combines Spot + On-Demand
- Balance cost savings and availability
3. Set Instance Warmup Time
- Wait until new instance is ready to receive load
- Prevents unnecessary additional scaling
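The effect of warmup on scaling decisions can be seen in a short Python sketch that averages CPU only over instances that have finished warming up. The function name `aggregated_cpu` and the sample data are hypothetical, and launch times are plain seconds for simplicity.

```python
def aggregated_cpu(instances, now: int, warmup_seconds: int):
    """Average CPU over instances that have completed warmup.

    instances: list of (launched_at_seconds, cpu_percent) tuples.
    Returns None when no instance has finished warming up yet.
    """
    warmed = [cpu for launched_at, cpu in instances
              if now - launched_at >= warmup_seconds]
    if not warmed:
        return None
    return sum(warmed) / len(warmed)


# Two established instances plus one launched 100 seconds ago that is still idle.
fleet = [(0, 60.0), (0, 70.0), (900, 5.0)]
print(aggregated_cpu(fleet, now=1000, warmup_seconds=300))  # 65.0
print(aggregated_cpu(fleet, now=1000, warmup_seconds=0))    # 45.0
```

Without warmup, the idle new instance drags the average down to 45.0 and hides the real load. With warmup, the policy keeps seeing 65.0 until the instance is ready.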
SAA-C03 Exam Focus Points
- ✅ Policy Selection: Target Tracking recommended for most, Step Scaling for fine control
- ✅ Simple vs Step: Simple waits out a cooldown between adjustments, while Step keeps responding to further alarm breaches
- ✅ Predictive Scaling: ML-based, requires minimum 24 hours of data
- ✅ Scheduled Scaling: Proactive response to predictable patterns
- ✅ Health Checks: Enable ELB health check when using ELB
- ✅ Multiple Policy Conflict: Availability first (the policy that yields the largest capacity applies)
Exam Tip
Sample Exam Question: "A web application needs to automatically scale while maintaining average CPU utilization at 60%. What's the most suitable Auto Scaling policy?" → Answer: Target Tracking Scaling (auto-maintains target metric, recommended policy)
Frequently Asked Questions
Q: Should I use Target Tracking or Step Scaling?
Use Target Tracking for most cases. It's simple to set up and auto-manages CloudWatch alarms. Use Step Scaling when you need different responses based on load magnitude or faster scale-in.
Q: What's the difference between Scheduled and Predictive Scaling?
Scheduled Scaling requires you to configure the schedule manually, while Predictive Scaling learns patterns automatically via ML. Use Scheduled when you know the exact times and capacity you need, and Predictive when traffic follows recurring cycles that you would rather not maintain by hand.
Q: Can I use multiple policies simultaneously?
Yes. Multiple policies can be used together. When they conflict, Auto Scaling applies the one that yields the largest capacity: the biggest increase on scale-out and the smallest reduction on scale-in. Availability always takes priority.
Q: How do I handle preparation time for new instances?
Set instance warmup time. During warmup, new instances are excluded from ASG metric calculations, preventing unnecessary additional scaling.
Q: Scale-in happens too fast and triggers scale-out again. What should I do?
Increase scale-in cooldown or use Target Tracking. Target Tracking is conservative during scale-in by default, mitigating this issue.
Q: How accurate is Predictive Scaling?
Accuracy is high when historical data is sufficient and patterns are consistent. It cannot anticipate unpredictable events (viral traffic, etc.), so combine it with Target Tracking to cover them.