
Auto Scaling Policies: Target Tracking vs Step vs Scheduled - When to Use What?

Target Tracking auto-maintains target metrics, Step Scaling responds by severity, Scheduled Scaling handles predictable patterns. Master Auto Scaling policy selection for SAA-C03.

PHILOLAMB-Updated: January 31, 2026
Auto Scaling, EC2, High Availability, Elasticity, Cost Optimization

Related Exam Domains

  • Domain 2: Design Resilient Architectures
  • Domain 3: Design High-Performing Architectures

Key Takeaway

Use Target Tracking for most cases: set a target metric (e.g., CPU 50%) and it automatically adjusts the instance count. Use Step Scaling for severity-based responses, Scheduled Scaling for predictable patterns, and Predictive Scaling for ML-based pre-scaling.

Exam Tip

Exam Essential: Target Tracking = Recommended + automatic, Step = Staged + CloudWatch alarms, Scheduled = Time-based + known patterns, Predictive = ML + pre-scaling

What is an Auto Scaling Group?

An Auto Scaling Group (ASG) is a collection of EC2 instances launched from a shared configuration (a launch template). It automatically adjusts the instance count based on demand.

Auto Scaling Group Structure:

         ┌─────────────────────────────────────┐
         │        Auto Scaling Group           │
         │                                     │
         │   Min: 2  │  Desired: 4  │  Max: 10 │
         │                                     │
         │  ┌───┐ ┌───┐ ┌───┐ ┌───┐            │
         │  │EC2│ │EC2│ │EC2│ │EC2│            │  ← Current: 4
         │  └───┘ └───┘ └───┘ └───┘            │
         └─────────────────────────────────────┘
                        ↑
              Scaling policy determines instance count

Key ASG Settings

Setting                   Description
Minimum Capacity (Min)    Minimum instances always maintained
Desired Capacity          Current target instance count
Maximum Capacity (Max)    Maximum instances allowed
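
As a minimal sketch of how these three settings interact, the snippet below bounds a requested capacity to the group's Min/Max range. The helper name `clamp_desired` is hypothetical and is not an AWS API; it only illustrates the rule that a policy can never push the group outside its limits.

```python
def clamp_desired(requested: int, min_size: int, max_size: int) -> int:
    """Bound a requested capacity to the group's Min/Max range."""
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")
    return max(min_size, min(requested, max_size))


# Limits taken from the diagram above: Min 2, Max 10
print(clamp_desired(4, 2, 10))   # 4: already within range
print(clamp_desired(15, 2, 10))  # 10: capped at Max
print(clamp_desired(0, 2, 10))   # 2: raised to Min
```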

Auto Scaling Policy Types at a Glance

Policy Type         How It Works                        Best For                    Complexity
Target Tracking     Auto-maintains target metric        Most cases (recommended)    Low
Step Scaling        Staged response by alarm severity   Fine-grained control        Medium
Simple Scaling      Single adjustment + cooldown        Simple requirements         Low
Scheduled Scaling   Adjusts at scheduled times          Predictable patterns        Low
Predictive Scaling  ML-based pre-scaling                Cyclical traffic patterns   Low

Target Tracking Scaling

How It Works

Target Tracking works like a thermostat. Set a target metric (e.g., CPU 50%) and Auto Scaling automatically adjusts instance count to maintain that value.

Target Tracking Example:

Target: CPU utilization 50%

Current CPU 70% → Add instances (scale out)
Current CPU 30% → Remove instances (scale in)
Current CPU 50% → Maintain
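
The thermostat idea can be approximated with simple proportional arithmetic. This is only a rough sketch: it assumes load is spread evenly across instances, so the average metric falls as instances are added. The service's actual algorithm is not described here and behaves differently in practice (for example, it scales in more cautiously). The function name `estimate_capacity` is hypothetical.

```python
import math


def estimate_capacity(current_instances: int, current_metric: float, target: float) -> int:
    """Estimate the instance count that would bring the average metric back to target.

    Assumption: total load is constant and evenly distributed, so the
    average metric is inversely proportional to the instance count.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    return max(1, math.ceil(current_instances * current_metric / target))


# Starting from 4 instances with a 50% CPU target
print(estimate_capacity(4, 70.0, 50.0))  # 6: scale out
print(estimate_capacity(4, 30.0, 50.0))  # 3: scale in
print(estimate_capacity(4, 50.0, 50.0))  # 4: maintain
```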

Supported Metrics

Predefined Metrics:

  • ASGAverageCPUUtilization - Average CPU utilization
  • ASGAverageNetworkIn - Average network in
  • ASGAverageNetworkOut - Average network out
  • ALBRequestCountPerTarget - ALB requests per target

Custom Metrics:

  • Application-specific CloudWatch metrics supported
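
For illustration, here is one way the request for a predefined-metric policy might be assembled in Python. The helper `target_tracking_policy` and the group name are hypothetical. The dictionary keys follow the `PutScalingPolicy` API as exposed by boto3, and the sketch only builds the parameters so it runs without AWS credentials.

```python
from typing import Optional

PREDEFINED_METRICS = {
    "ASGAverageCPUUtilization",
    "ASGAverageNetworkIn",
    "ASGAverageNetworkOut",
    "ALBRequestCountPerTarget",
}


def target_tracking_policy(asg_name: str, metric_type: str, target_value: float,
                           resource_label: Optional[str] = None) -> dict:
    """Build keyword arguments for a target tracking scaling policy."""
    if metric_type not in PREDEFINED_METRICS:
        raise ValueError(f"unknown predefined metric: {metric_type}")
    metric_spec = {"PredefinedMetricType": metric_type}
    if metric_type == "ALBRequestCountPerTarget":
        # This metric needs a label identifying the load balancer and target group.
        if resource_label is None:
            raise ValueError("ALBRequestCountPerTarget requires resource_label")
        metric_spec["ResourceLabel"] = resource_label
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{metric_type}-target-{target_value:g}",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": metric_spec,
            "TargetValue": target_value,
        },
    }


params = target_tracking_policy("web-asg", "ASGAverageCPUUtilization", 50.0)
print(params["PolicyName"])
# With credentials configured, the parameters could be passed along as:
# boto3.client("autoscaling").put_scaling_policy(**params)
```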

Target Tracking Advantages

  • Simple setup: Just specify target value
  • Auto CloudWatch alarms: Alarms auto-created/managed
  • Conservative scale-in: Prioritizes availability
  • Adaptive adjustment: Auto-adapts to traffic patterns

Exam Tip

Exam Point: Target Tracking is conservative during scale-in. Removes instances gradually to prioritize availability during traffic fluctuations.

When Target Tracking is Suitable

  • Most web applications
  • CPU/network-based workloads
  • Applications behind ALB
  • Need simple, effective scaling

Step Scaling

How It Works

Step Scaling applies different adjustments depending on how far the metric breaches the alarm threshold. Create the CloudWatch alarms first, then define a staged adjustment for each breach range.

Step Scaling Example:

CloudWatch Alarm: CPU utilization

CPU 50-60%  → Add 2 instances
CPU 60-75%  → Add 4 instances
CPU 75-90%  → Add 6 instances
CPU > 90%   → Add 10 instances
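
The ranges above can be written as step adjustments whose bounds are offsets from the alarm threshold, which is how the step scaling API expresses them. The sketch below assumes an alarm threshold of 50% and treats each lower bound as inclusive and each upper bound as exclusive. The function `pick_adjustment` is a hypothetical local simulation, not a service call.

```python
ALARM_THRESHOLD = 50.0  # assumed CloudWatch alarm threshold (CPU %)

# Bounds are offsets from the threshold: 0-10 means CPU 50-60%, and so on.
STEP_ADJUSTMENTS = [
    {"MetricIntervalLowerBound": 0.0, "MetricIntervalUpperBound": 10.0, "ScalingAdjustment": 2},
    {"MetricIntervalLowerBound": 10.0, "MetricIntervalUpperBound": 25.0, "ScalingAdjustment": 4},
    {"MetricIntervalLowerBound": 25.0, "MetricIntervalUpperBound": 40.0, "ScalingAdjustment": 6},
    {"MetricIntervalLowerBound": 40.0, "ScalingAdjustment": 10},  # no upper bound
]


def pick_adjustment(cpu: float) -> int:
    """Return how many instances to add for a given CPU reading."""
    breach = cpu - ALARM_THRESHOLD
    if breach < 0:
        return 0  # alarm not breached, nothing to do
    for step in STEP_ADJUSTMENTS:
        lower = step.get("MetricIntervalLowerBound", float("-inf"))
        upper = step.get("MetricIntervalUpperBound", float("inf"))
        if lower <= breach < upper:
            return step["ScalingAdjustment"]
    return 0


for cpu in (40, 55, 60, 80, 95):
    print(cpu, "->", pick_adjustment(cpu))
```

Because the source ranges share their endpoints (60, 75, 90), the sketch assigns each boundary value to the higher step.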

Step Scaling vs Target Tracking

Aspect            Target Tracking          Step Scaling
Alarm Management  Auto                     Manual
Adjustment Size   Auto-calculated          Staged specification
Scale-in          Conservative (gradual)   Immediate
Setup Complexity  Low                      Medium
Fine Control      Limited                  Available

When Step Scaling is Suitable

  • Different responses needed based on load magnitude
  • Need fast scale-in
  • Require fine-grained adjustment control
  • Mixed with Target Tracking (advanced)

Simple Scaling

How It Works

Simple Scaling makes a fixed adjustment when an alarm triggers, then blocks further adjustments for the duration of the cooldown period.

Simple Scaling Flow:

Alarm triggers → Add 2 instances → Cooldown (300 sec) → Can scale again
                                        ↑
                            Scaling blocked during this period
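
The flow above can be simulated in a few lines. This is a simplified model with a hypothetical class name: it starts the cooldown at the moment of the adjustment and ignores how long the scaling activity itself takes.

```python
from typing import Optional


class SimpleScalingPolicy:
    """Toy model: one fixed adjustment, then a cooldown that blocks further scaling."""

    def __init__(self, adjustment: int = 2, cooldown_seconds: float = 300):
        self.adjustment = adjustment
        self.cooldown_seconds = cooldown_seconds
        self._last_scaled_at: Optional[float] = None

    def on_alarm(self, now: float, capacity: int) -> int:
        """Return the new capacity; alarms during cooldown are ignored."""
        in_cooldown = (
            self._last_scaled_at is not None
            and now - self._last_scaled_at < self.cooldown_seconds
        )
        if in_cooldown:
            return capacity
        self._last_scaled_at = now
        return capacity + self.adjustment


policy = SimpleScalingPolicy()
capacity = 4
for t in (0, 120, 300):  # alarm fires at these times (seconds)
    capacity = policy.on_alarm(t, capacity)
    print(f"t={t}s capacity={capacity}")
```

The alarm at t=120s changes nothing, which is the limitation that Step Scaling and Target Tracking avoid.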

Simple Scaling Limitations

  • Cannot respond to traffic spikes during cooldown
  • Less flexible than Step Scaling
  • AWS recommends Step Scaling or Target Tracking

Exam Tip

Exam Point: Simple Scaling has a cooldown period preventing consecutive adjustments. Use Step Scaling for faster response.

Scheduled Scaling

How It Works

Scheduled Scaling adjusts capacity at predetermined times based on predictable traffic patterns.

Scheduled Scaling Example:

Mon-Fri:
  08:00 → Increase min capacity to 10 (work start)
  20:00 → Decrease min capacity to 2 (work end)

Sat-Sun:
  All day → Maintain min capacity at 2

When Scheduled Scaling is Suitable

  • Fixed business hour patterns
  • Planned marketing events
  • Batch processing time slots
  • Known peak traffic times

Configuration Example

Schedule: Daily 8:00 AM
Min: 5, Max: 20, Desired: 10

Schedule: Daily 8:00 PM
Min: 2, Max: 10, Desired: 2
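
A sketch of how the two schedules above might be expressed as recurring scheduled actions. The helper `scheduled_action`, the group name, and the action names are hypothetical. The keys match the `PutScheduledUpdateGroupAction` API as exposed by boto3, and `Recurrence` uses cron syntax. The time zone default is an assumption; the example above does not state one.

```python
def scheduled_action(asg_name: str, action_name: str, hour: int,
                     min_size: int, max_size: int, desired: int,
                     time_zone: str = "Etc/UTC") -> dict:
    """Build keyword arguments for a daily recurring scheduled action."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    if not min_size <= desired <= max_size:
        raise ValueError("desired capacity must lie between min and max")
    return {
        "AutoScalingGroupName": asg_name,
        "ScheduledActionName": action_name,
        "Recurrence": f"0 {hour} * * *",  # cron: minute hour day-of-month month day-of-week
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
        "TimeZone": time_zone,
    }


morning = scheduled_action("web-asg", "scale-up-8am", 8, 5, 20, 10)
evening = scheduled_action("web-asg", "scale-down-8pm", 20, 2, 10, 2)
print(morning["Recurrence"], evening["Recurrence"])
# With credentials configured:
# boto3.client("autoscaling").put_scheduled_update_group_action(**morning)
```

For a weekday-only schedule like the Mon-Fri example, the last cron field would be `1-5` instead of `*`.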

Predictive Scaling

How It Works

Predictive Scaling uses machine learning to analyze historical traffic patterns, forecast future demand, and scale out ahead of it.

Predictive Scaling Flow:

Analyze past 2 weeks of data
        ↓
Learn daily/weekly patterns
        ↓
Predict demand for next 2 days
        ↓
Pre-scale before traffic increase

Predictive Scaling Features

  • Historical data required: Minimum 24 hours (recommended: 14 days)
  • Prediction cycle: Forecast refreshed every 6 hours, covering the next 48 hours
  • Combine with dynamic scaling: Can use with Target Tracking
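
As a hedged sketch, a predictive scaling policy request could be assembled as below. The helper name, policy name, and default values are assumptions. The keys follow the `PredictiveScalingConfiguration` structure of the `PutScalingPolicy` API. `ForecastOnly` mode generates forecasts without acting on them, which is a cautious way to judge forecast quality before switching to `ForecastAndScale`.

```python
def predictive_policy(asg_name: str, target_cpu: float = 50.0,
                      forecast_only: bool = True, buffer_seconds: int = 300) -> dict:
    """Build keyword arguments for a CPU-based predictive scaling policy."""
    if not 0 <= buffer_seconds <= 3600:
        raise ValueError("buffer_seconds must be between 0 and 3600")
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": "cpu-predictive",  # hypothetical name
        "PolicyType": "PredictiveScaling",
        "PredictiveScalingConfiguration": {
            "MetricSpecifications": [{
                "TargetValue": target_cpu,
                "PredefinedMetricPairSpecification": {
                    "PredefinedMetricType": "ASGCPUUtilization",
                },
            }],
            "Mode": "ForecastOnly" if forecast_only else "ForecastAndScale",
            # Launch instances this many seconds ahead of the forecast time,
            # which helps applications with long initialization.
            "SchedulingBufferTime": buffer_seconds,
        },
    }


params = predictive_policy("web-asg", forecast_only=False)
print(params["PredictiveScalingConfiguration"]["Mode"])
# With credentials configured:
# boto3.client("autoscaling").put_scaling_policy(**params)
```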

When Predictive Scaling is Suitable

  • Cyclical traffic patterns (business hours, weekend patterns)
  • Repetitive on/off workloads
  • Applications with long initialization time
  • Need to minimize latency through pre-scaling

Exam Tip

Exam Point: Predictive Scaling is not suitable for new applications. Requires minimum 24 hours of historical data; insufficient data reduces prediction accuracy.

Combining Scaling Policies

1. Target Tracking + Predictive Scaling

Predictive: Pre-scale for predicted traffic
Target Tracking: Handle actual load that differs from prediction

2. Target Tracking + Scheduled Scaling

Scheduled: Increase capacity for known events
Target Tracking: Additional adjustment based on actual load during event

3. Target Tracking (scale-out) + Step Scaling (scale-in)

Target Tracking: Smooth expansion on load increase
Step Scaling: Fast reduction on load decrease

Multiple Policy Conflict Behavior

Scale-out: Policy providing largest capacity wins
Scale-in: Policy maintaining most instances wins

→ Always prioritizes availability
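
The conflict rule reduces to taking the maximum of the capacities each policy proposes. The sketch below uses a hypothetical `resolve_capacity` helper and made-up policy names and numbers.

```python
from typing import Dict, Tuple


def resolve_capacity(proposals: Dict[str, int], min_size: int, max_size: int) -> Tuple[str, int]:
    """Pick the policy proposing the largest capacity, bounded by Min/Max."""
    if not proposals:
        raise ValueError("at least one proposal is required")
    winner = max(proposals, key=proposals.get)
    capacity = max(min_size, min(proposals[winner], max_size))
    return winner, capacity


# Scale-out from 4 instances: the larger expansion wins
print(resolve_capacity({"target-tracking": 6, "step": 8}, 2, 10))
# Scale-in from 10 instances: the policy keeping more instances wins
print(resolve_capacity({"target-tracking": 8, "step": 5}, 2, 10))
```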

Scenario-Based Policy Selection

Scenario 1: General Web Application

Requirement: CPU-based auto scaling
Recommended: Target Tracking (CPU 50%)
Reason: Simple and effective, sufficient for most cases

Scenario 2: E-commerce Black Friday

Requirement: Massive traffic spike for scheduled event
Recommended: Scheduled Scaling + Target Tracking
Reason: Pre-scale before event + respond to actual load

Scenario 3: Call Center Application

Requirement: Business hours (9-6) traffic pattern
Recommended: Predictive Scaling + Target Tracking
Reason: ML learns pattern + handles exceptions

Scenario 4: Game Server (Rapid Load Changes)

Requirement: Fast response based on load magnitude
Recommended: Step Scaling
Reason: Different adjustment sizes by stage possible

Scenario 5: Batch Processing

Requirement: Large-scale processing daily at 2 AM
Recommended: Scheduled Scaling
Reason: Scale up before processing starts, scale down after completion

ASG Health Checks

Auto Scaling automatically replaces unhealthy instances.

Health Check Types

Type     Checks                         When to Use
EC2      EC2 instance status            Default
ELB      Load balancer health check     Recommended with ELB
Custom   External health check system   Advanced scenarios

Exam Tip

Exam Point: When using with ELB, enable ELB health check. EC2 health check alone cannot detect application-level issues.
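
To make the difference concrete, the sketch below models the health decision locally and builds the settings that would switch a group to ELB health checks. The function names and group name are hypothetical. `HealthCheckType` and `HealthCheckGracePeriod` are parameters of the `UpdateAutoScalingGroup` API, and the 300-second grace period is an assumed value.

```python
def is_healthy(ec2_status_ok: bool, elb_check_ok: bool, health_check_type: str = "EC2") -> bool:
    """With type ELB, an instance must pass both the EC2 and load balancer checks."""
    if health_check_type == "ELB":
        return ec2_status_ok and elb_check_ok
    return ec2_status_ok


def elb_health_check_settings(asg_name: str, grace_period_seconds: int = 300) -> dict:
    """Build keyword arguments that enable ELB health checks on a group."""
    return {
        "AutoScalingGroupName": asg_name,
        "HealthCheckType": "ELB",
        "HealthCheckGracePeriod": grace_period_seconds,
    }


# The instance is running, but the application fails the load balancer check.
print(is_healthy(True, False, "EC2"))  # True: the problem goes unnoticed
print(is_healthy(True, False, "ELB"))  # False: the instance gets replaced
# With credentials configured:
# boto3.client("autoscaling").update_auto_scaling_group(**elb_health_check_settings("web-asg"))
```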

Cost Optimization Tips

1. Set Appropriate Minimum Capacity

  • Minimum instances to handle normal load
  • Too high = wasted cost, too low = startup delay

2. Mix Spot Instances

  • Mixed instance policy combines Spot + On-Demand
  • Balance cost savings and availability
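
A rough sketch of a mixed instances policy and the On-Demand/Spot split it implies. The launch template name, instance types, and percentages are hypothetical. The keys follow the `MixedInstancesPolicy` structure of the Auto Scaling API. The rounding in `split_capacity` is an assumption and may not match the service exactly.

```python
import math
from typing import List, Tuple


def mixed_instances_policy(template_name: str, instance_types: List[str],
                           on_demand_base: int = 2, on_demand_pct_above_base: int = 50) -> dict:
    """Build a MixedInstancesPolicy combining On-Demand and Spot capacity."""
    return {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": template_name,
                "Version": "$Latest",
            },
            "Overrides": [{"InstanceType": t} for t in instance_types],
        },
        "InstancesDistribution": {
            "OnDemandBaseCapacity": on_demand_base,
            "OnDemandPercentageAboveBaseCapacity": on_demand_pct_above_base,
            "SpotAllocationStrategy": "price-capacity-optimized",
        },
    }


def split_capacity(desired: int, on_demand_base: int, on_demand_pct_above_base: int) -> Tuple[int, int]:
    """Return (on_demand, spot) counts; rounding up is an assumption."""
    base = min(desired, on_demand_base)
    above = desired - base
    on_demand = base + math.ceil(above * on_demand_pct_above_base / 100)
    return on_demand, desired - on_demand


policy = mixed_instances_policy("web-template", ["m5.large", "m5a.large"])
print(split_capacity(10, 2, 50))  # (6, 4): 2 base + half of the remaining 8 On-Demand
```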

3. Set Instance Warmup Time

  • Wait until new instance is ready to receive load
  • Prevents unnecessary additional scaling
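
The effect of warmup can be shown with a small simulation. This is a simplified model with hypothetical names: instances still inside the warmup window are left out of the group average, so a booting instance running hot does not trigger another scale-out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instance:
    launched_at: float  # seconds
    cpu: float          # current CPU utilization (%)


def aggregate_cpu(instances: List[Instance], now: float, warmup_seconds: float = 300) -> Optional[float]:
    """Average CPU over instances that have finished warming up."""
    warmed = [i.cpu for i in instances if now - i.launched_at >= warmup_seconds]
    if not warmed:
        return None
    return sum(warmed) / len(warmed)


fleet = [
    Instance(launched_at=0, cpu=45.0),
    Instance(launched_at=0, cpu=45.0),
    Instance(launched_at=940, cpu=100.0),  # launched 60s ago, still initializing
]

print(aggregate_cpu(fleet, now=1000))                    # 45.0: new instance excluded
print(aggregate_cpu(fleet, now=1000, warmup_seconds=0))  # about 63.3: would breach a 50% target
```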

SAA-C03 Exam Focus Points

  1. Policy Selection: Target Tracking recommended for most, Step Scaling for fine control
  2. Simple vs Step: Simple has cooldown, Step responds immediately
  3. Predictive Scaling: ML-based, requires minimum 24 hours of data
  4. Scheduled Scaling: Proactive response to predictable patterns
  5. Health Checks: Enable ELB health check when using ELB
  6. Multiple Policy Conflict: Availability first (the policy yielding the largest capacity applies)

Exam Tip

Sample Exam Question: "A web application needs to automatically scale while maintaining average CPU utilization at 60%. What's the most suitable Auto Scaling policy?" → Answer: Target Tracking Scaling (auto-maintains target metric, recommended policy)

Frequently Asked Questions

Q: Should I use Target Tracking or Step Scaling?

Use Target Tracking for most cases. It's simple to set up and auto-manages CloudWatch alarms. Use Step Scaling when you need different responses based on load magnitude or faster scale-in.

Q: What's the difference between Scheduled and Predictive Scaling?

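Scheduled Scaling requires you to configure the schedule manually, while Predictive Scaling learns patterns automatically via ML. Use Scheduled when you know the exact times and sizes of traffic changes, and Predictive when load follows a recurring cycle that you would rather not maintain by hand.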

Q: Can I use multiple policies simultaneously?

Yes. Multiple policies can be used together. On conflict, Auto Scaling applies the policy that yields the largest capacity, for both scale-out and scale-in, so availability always takes priority.

Q: How do I handle preparation time for new instances?

Set instance warmup time. During warmup, new instances are excluded from ASG metric calculations, preventing unnecessary additional scaling.

Q: Scale-in happens too fast and triggers scale-out again. What should I do?

Increase scale-in cooldown or use Target Tracking. Target Tracking is conservative during scale-in by default, mitigating this issue.

Q: How accurate is Predictive Scaling?

High accuracy when historical data is sufficient and patterns are consistent. However, for unpredictable events (viral traffic, etc.), combine with Target Tracking.


