SAABlog
Cost Management | Intermediate

Lambda Cost Optimization: From Memory Tuning to ARM Migration

Learn effective AWS Lambda cost reduction strategies including memory optimization, ARM migration, and Power Tuning.

PHILOLAMB | Updated: January 31, 2026
Tags: Lambda, Cost Optimization, Serverless, Graviton, Memory Configuration

Related Exam Domains

  • Domain 4: Design Cost-Optimized Architectures

Key Takeaway

Lambda compute cost is driven by "memory × execution time," plus a small per-request charge. The key is finding the optimal memory-performance balance, not just minimizing memory. Use Power Tuning to find the optimal memory setting, and switch to ARM (Graviton2) for a 20% lower price per GB-second.

Exam Tip

Exam Essential: "Lambda memory increase → CPU proportionally increases → Execution time decreases → Total cost stays same or decreases"

Understanding Lambda Pricing

Lambda costs consist of 3 components:

Component                 Pricing                    Free Tier
Requests                  $0.20/million              1 million/month
Compute Time              $0.0000166667/GB-second    400,000 GB-seconds/month
Provisioned Concurrency   $0.0000041667/GB-second    -

Cost Calculation Formula

Lambda Cost = Request charge + Compute charge

Compute charge = Memory(GB) × Execution time(seconds) × Unit price

Example: 256MB memory, 200ms execution, 1 million requests/month
- Requests: 1M × $0.0000002 = $0.20
- Compute: 0.25GB × 0.2sec × 1M × $0.0000166667 = $0.83
- Total: $1.03/month
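
The formula above can be sketched as a small calculator. This is a minimal illustration, assuming the x86 prices from the pricing table and ignoring the free tier; the function and constant names are made up for this example, and actual prices vary by region and architecture.

```python
# Prices are the x86 figures from the pricing table; they vary by region.
REQUEST_PRICE = 0.20 / 1_000_000   # USD per request
GB_SECOND_PRICE = 0.0000166667     # USD per GB-second


def monthly_lambda_cost(memory_mb: int, duration_ms: float, requests: int) -> dict:
    """Return request, compute, and total charges in USD (free tier ignored)."""
    request_charge = requests * REQUEST_PRICE
    # Memory is converted to GB (1024 MB = 1 GB) and duration to seconds.
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000) * requests
    compute_charge = gb_seconds * GB_SECOND_PRICE
    return {
        "requests": round(request_charge, 2),
        "compute": round(compute_charge, 2),
        "total": round(request_charge + compute_charge, 2),
    }


# The worked example: 256MB, 200ms, 1 million requests per month.
print(monthly_lambda_cost(memory_mb=256, duration_ms=200, requests=1_000_000))
```

Running it on the example inputs reproduces the $0.20 request charge, the $0.83 compute charge, and the $1.03 total.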

Exam Tip

1ms billing granularity: Since December 2020, duration is billed in 1ms increments instead of being rounded up to the nearest 100ms, which significantly reduces costs for short-running functions.

Memory and CPU Relationship

In Lambda, increasing memory proportionally increases CPU performance.

Memory ↑ = CPU ↑ = Network bandwidth ↑

128MB  → Minimum CPU
256MB  → 2× CPU
512MB  → 4× CPU
1024MB → 8× CPU (approx. 0.5 vCPU)
1769MB → 1 vCPU
10240MB → 6 vCPU
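
As a rough sketch, the CPU share for a given memory setting can be estimated by scaling linearly from the 1769MB = 1 vCPU point in the list above. The linear scaling and the helper name are assumptions for illustration; AWS does not publish an exact curve, so treat the output as an approximation only.

```python
FULL_VCPU_MB = 1769  # memory at which a function gets one full vCPU (from the list above)
MAX_VCPUS = 6        # ceiling reached at 10,240MB (from the list above)


def estimated_vcpus(memory_mb: int) -> float:
    """Rough linear estimate of the CPU share for a memory setting (assumption)."""
    if not 128 <= memory_mb <= 10240:
        raise ValueError("Lambda memory must be between 128 and 10,240 MB")
    return min(memory_mb / FULL_VCPU_MB, MAX_VCPUS)


for mb in (128, 512, 1024, 1769, 10240):
    print(f"{mb:>5} MB -> ~{estimated_vcpus(mb):.2f} vCPU")
```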

Lower Memory Isn't Always Cheaper

Example: Data processing Lambda function

128MB (minimum memory):
- Execution time: 2000ms
- Cost: 0.125GB × 2.0sec × $0.0000166667 = $0.00000417

512MB:
- Execution time: 500ms (4× faster)
- Cost: 0.5GB × 0.5sec × $0.0000166667 = $0.00000417

→ Same cost, but 512MB is 4× faster response!
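
Because the compute charge is proportional to memory × duration, a break-even check needs no prices at all. The sketch below uses hypothetical helper names to ask how fast a function must run at a higher memory setting to cost no more than it does today.

```python
def break_even_duration_ms(current_mb: int, current_ms: float, new_mb: int) -> float:
    """Longest duration at new_mb that costs no more than the current setting."""
    return current_ms * current_mb / new_mb


def is_cheaper_or_equal(current_mb: int, current_ms: float,
                        new_mb: int, new_ms: float) -> bool:
    """Compare memory x duration products; the per-GB-second price cancels out."""
    return new_mb * new_ms <= current_mb * current_ms


# Moving from 128MB at 2000ms to 512MB pays off if the run takes 500ms or less.
print(break_even_duration_ms(128, 2000, 512))
```

If the measured duration at the new setting is above the break-even value, the faster response comes at a higher price and becomes a latency-versus-cost decision.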

Exam Tip

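Core Principle: CPU-bound tasks may cost the same or less with higher memory because they finish faster. I/O-bound tasks spend most of their time waiting, so extra memory rarely shortens them and lower memory is usually cheaper.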

Strategy 1: Power Tuning

AWS Lambda Power Tuning is an open-source tool that automatically tests your function across various memory settings to find the optimal cost-performance balance.

How It Works

Runs via Step Functions:
┌──────────────────────────────────────────────┐
│  128MB → Measure                             │
│  256MB → Measure                             │
│  512MB → Measure       → Optimal value report│
│  1024MB → Measure         (cost vs speed)    │
│  2048MB → Measure                            │
│  3008MB → Measure                            │
└──────────────────────────────────────────────┘
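
A run like the one in the diagram is started by passing a JSON input to the tool's state machine. The sketch below builds that input with the field names from the project's README as best recalled (`lambdaARN`, `powerValues`, `num`, `payload`, `parallelInvocation`, `strategy`); verify them against the version you deploy. The helper names and the choice of 50 invocations are assumptions.

```python
import json


def build_power_tuning_input(function_arn: str, payload: dict) -> str:
    """Build the JSON input for one Power Tuning state machine execution."""
    return json.dumps({
        "lambdaARN": function_arn,
        "powerValues": [128, 256, 512, 1024, 2048, 3008],  # sizes from the diagram
        "num": 50,                   # invocations per memory size (arbitrary choice)
        "payload": payload,          # event passed to the function on each invocation
        "parallelInvocation": True,  # run the invocations for each size in parallel
        "strategy": "cost",          # "speed" and "balanced" are the other options
    })


def start_power_tuning(state_machine_arn: str, function_arn: str, payload: dict) -> str:
    """Start the execution; needs boto3, AWS credentials, and a deployed state machine."""
    import boto3  # imported lazily so the builder above has no AWS dependency

    sfn = boto3.client("stepfunctions")
    response = sfn.start_execution(
        stateMachineArn=state_machine_arn,
        input=build_power_tuning_input(function_arn, payload),
    )
    return response["executionArn"]
```

Keep in mind that the tool really invokes the function, so the test runs themselves are billed and any side effects of the function will occur.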

Interpreting Results

Memory    Duration    Cost
128MB     2000ms      $0.00000417  ← Slowest
256MB     1000ms      $0.00000417
512MB     500ms       $0.00000417  ← Optimal (same cost, 4× faster)
1024MB    480ms       $0.00000800  ← Cost increase, minimal perf gain

→ In this case, 512MB is optimal
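
The selection rule applied here (cheapest first, fastest among equal costs) can be written down in a few lines. This is an illustrative sketch that recomputes the cost from the memory and duration columns of the sample results; it is not how the Power Tuning tool itself ranks configurations.

```python
GB_SECOND_PRICE = 0.0000166667  # x86 price used throughout this article

# (memory in MB, average duration in ms) from the sample results above
measurements = [(128, 2000), (256, 1000), (512, 500), (1024, 480)]


def cost_per_invocation(memory_mb: int, duration_ms: float) -> float:
    """Compute charge in USD for a single invocation."""
    return (memory_mb / 1024) * (duration_ms / 1000) * GB_SECOND_PRICE


def pick_optimal(rows):
    """Choose the cheapest setting; among equal costs, prefer the shortest duration."""
    return min(rows, key=lambda r: (round(cost_per_invocation(*r), 10), r[1]))


print(pick_optimal(measurements))  # (512, 500)
```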

Strategy 2: ARM (Graviton2) Migration

Lambda functions on ARM (Graviton2) are billed at a 20% lower price per GB-second than x86 and, according to AWS, deliver up to 34% better price-performance.

Item                 x86         ARM (Graviton2)
Architecture         x86_64      arm64
Price                Baseline    20% cheaper
Price-performance    Baseline    Up to 34% better
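
To see what the 20% price difference means for a specific function, the comparison can be sketched as below. The `speedup` parameter is a hypothetical knob for whatever duration change you measure on ARM; it defaults to 1.0 (no change) because the real figure depends on the workload.

```python
X86_GB_SECOND_PRICE = 0.0000166667
ARM_DISCOUNT = 0.20  # the 20% price difference quoted above


def monthly_compute_cost(memory_mb: int, duration_ms: float, requests: int,
                         arm: bool = False, speedup: float = 1.0) -> float:
    """Compute charge in USD; speedup > 1 means the function runs faster on ARM."""
    price = X86_GB_SECOND_PRICE * ((1 - ARM_DISCOUNT) if arm else 1.0)
    effective_ms = duration_ms / speedup if arm else duration_ms
    return (memory_mb / 1024) * (effective_ms / 1000) * requests * price


x86 = monthly_compute_cost(256, 200, 1_000_000)
arm = monthly_compute_cost(256, 200, 1_000_000, arm=True)
print(f"x86: ${x86:.2f}  ARM: ${arm:.2f}")
```

The request charge is the same on both architectures, so only the compute portion of the bill shrinks.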

ARM Migration Considerations

  1. Native binaries: C/C++ and Rust code must be rebuilt for arm64
  2. Interpreted and JVM languages: Python, Node.js, and Java code is mostly compatible, provided its dependencies contain no x86-only native code
  3. Lambda Layers: May need ARM-specific layers
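
Once an ARM-compatible package is built, the switch itself is a single API call. The sketch below passes a boto3 Lambda client in as a parameter and relies on the `Architectures` argument of `update_function_code`; the function name, file name, and helper are placeholders, and any attached layers still need to be checked separately.

```python
def deploy_as_arm64(lambda_client, function_name: str, zip_bytes: bytes) -> dict:
    """Upload an ARM-built deployment package and switch the function to arm64."""
    return lambda_client.update_function_code(
        FunctionName=function_name,
        ZipFile=zip_bytes,
        Architectures=["arm64"],  # "x86_64" switches back if problems appear
    )


# Example usage (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   with open("build-arm64.zip", "rb") as f:
#       deploy_as_arm64(boto3.client("lambda"), "my-function", f.read())
```

Trying this on a non-production alias or a copy of the function first keeps the rollback path simple.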

Exam Tip

Exam Point: Lambda Graviton2 (ARM) = 20% cost reduction + performance improvement vs x86

Strategy 3: Reduce Execution Time

Minimize Cold Starts

Strategy                   Method                      Cost Impact
Provisioned Concurrency    Pre-warm instances          Additional cost
SnapStart                  JVM snapshot (Java)         Free
Minimal Package            Remove unnecessary SDKs     Free
Run Outside VPC            Remove VPC if not needed    Free
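
To put a number on the "additional cost" of Provisioned Concurrency, the keep-warm fee can be estimated from the $0.0000041667/GB-second rate in the pricing table. This sketch covers only that fee; request and duration charges for actual invocations are billed on top. The function name and the example figures are illustrative.

```python
PROVISIONED_GB_SECOND_PRICE = 0.0000041667  # from the pricing table in this article


def provisioned_concurrency_cost(memory_mb: int, instances: int, hours: float) -> float:
    """USD charged for keeping `instances` environments warm for `hours`."""
    gb_seconds = (memory_mb / 1024) * instances * hours * 3600
    return gb_seconds * PROVISIONED_GB_SECOND_PRICE


# Five 512MB environments kept warm for a 730-hour month.
print(f"${provisioned_concurrency_cost(512, 5, 730):.2f}")
```

The fee accrues whether or not the function is invoked, which is why the free options in the table are worth trying first.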

Code Optimization

# Bad: Initialize inside handler on every invocation
def handler(event, context):
    import boto3  # Import statement re-runs each call (cached after the first, but still overhead)
    client = boto3.client('dynamodb')  # New client created on every invocation
    return client.get_item(...)

# Good: Initialize outside handler (reuse)
import boto3
client = boto3.client('dynamodb')  # Created once on cold start

def handler(event, context):
    return client.get_item(...)  # Reused on warm starts

Strategy 4: Cost Management

Tiered Pricing

Discounts automatically apply at high volume:

Tier                    Price (GB-second)
0 - 6 billion GB-sec    $0.0000166667
6B - 15B GB-sec         $0.0000150000 (10% discount)
Over 15B GB-sec         $0.0000133334 (20% discount)
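
The tiers are marginal: each rate applies only to the usage that falls inside its band, not to the whole month. A minimal sketch of that calculation, using the prices from the table above and a hypothetical function name:

```python
# (size of the band in GB-seconds, price per GB-second)
TIERS = [
    (6_000_000_000, 0.0000166667),  # first 6 billion GB-seconds
    (9_000_000_000, 0.0000150000),  # next 9 billion (6B to 15B)
    (float("inf"), 0.0000133334),   # everything above 15 billion
]


def tiered_compute_charge(gb_seconds: float) -> float:
    """Monthly compute charge in USD with each band billed at its own rate."""
    remaining = gb_seconds
    charge = 0.0
    for band_size, price in TIERS:
        used = min(remaining, band_size)
        charge += used * price
        remaining -= used
        if remaining <= 0:
            break
    return charge


print(f"${tiered_compute_charge(7_000_000_000):,.2f}")
```

For 7 billion GB-seconds, the first 6 billion are billed at the base rate and only the last 1 billion receives the 10% discount.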

Savings Plans

Compute Savings Plans can be applied to Lambda:

  • 1-year or 3-year commitment
  • Up to 17% additional savings

Eliminate Unnecessary Invocations

Cost Reduction Checklist:
☑ Optimize CloudWatch log levels (DEBUG → INFO)
☑ Remove unnecessary API calls
☑ Use DynamoDB batch operations (PutItem → BatchWriteItem)
☑ SQS batch processing (receive 10 messages at a time)
☑ Utilize /tmp directory caching
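
For the SQS item, one invocation handles a whole batch of messages. The sketch below shows a handler that processes each record and returns the IDs of failed messages so that only those are retried; this response format takes effect only when `ReportBatchItemFailures` is enabled on the event source mapping. `process_message` and its `order_id` check are hypothetical stand-ins for real business logic.

```python
import json


def process_message(body: dict) -> None:
    """Hypothetical business logic; raises on messages it cannot handle."""
    if "order_id" not in body:
        raise ValueError("missing order_id")


def handler(event, context):
    """Process an SQS batch and report per-message failures."""
    failures = []
    for record in event.get("Records", []):
        try:
            process_message(json.loads(record["body"]))
        except Exception:
            # Only this message returns to the queue; the rest of the batch succeeds.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Without partial failure reporting, one bad message causes the entire batch to be retried, which multiplies invocations instead of reducing them.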

Cost Monitoring

CloudWatch Metrics

Metric                            Purpose
Duration                          Track execution time
ConcurrentExecutions              Concurrent execution count
Throttles                         Throttling occurrence count
Memory Size vs Max Memory Used    Detect memory over-provisioning (values from each invocation's REPORT log line)
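
Lambda writes a REPORT line to CloudWatch Logs after every invocation, and that line includes both the configured memory size and the maximum memory used. A small sketch for extracting the ratio from such a line follows; the sample line and request ID are made up, and the helper name is hypothetical.

```python
import re

REPORT_PATTERN = re.compile(r"Memory Size: (\d+) MB\s+Max Memory Used: (\d+) MB")


def memory_utilization(report_line: str):
    """Return used/configured memory as a fraction, or None if not a REPORT line."""
    match = REPORT_PATTERN.search(report_line)
    if match is None:
        return None
    configured, used = int(match.group(1)), int(match.group(2))
    return used / configured


sample = ("REPORT RequestId: 00000000-0000-0000-0000-000000000000\t"
          "Duration: 102.25 ms\tBilled Duration: 103 ms\t"
          "Memory Size: 512 MB\tMax Memory Used: 128 MB")
print(memory_utilization(sample))  # 0.25
```

A low ratio alone does not prove that the function is over-provisioned: since CPU scales with memory, a CPU-bound function may still need the higher setting for speed.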

Using Cost Explorer

Lambda Cost Analysis:
1. Cost Explorer → Filter by service → Lambda
2. Analyze costs by tag (environment, project)
3. Check daily/weekly trends
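
The same analysis can be scripted against the Cost Explorer API. The sketch below builds the request for `get_cost_and_usage` with daily granularity, filtered to the "AWS Lambda" service and grouped by a tag. The tag key `environment` is an assumption, and grouping by tag only works for tags activated as cost allocation tags.

```python
from datetime import date, timedelta


def lambda_cost_query(days: int = 30, tag_key: str = "environment") -> dict:
    """Build get_cost_and_usage parameters for daily Lambda costs grouped by a tag."""
    end = date.today()  # the End date is exclusive in Cost Explorer
    start = end - timedelta(days=days)
    return {
        "TimePeriod": {"Start": start.isoformat(), "End": end.isoformat()},
        "Granularity": "DAILY",
        "Metrics": ["UnblendedCost"],
        "Filter": {"Dimensions": {"Key": "SERVICE", "Values": ["AWS Lambda"]}},
        "GroupBy": [{"Type": "TAG", "Key": tag_key}],
    }


def fetch_lambda_costs(days: int = 30) -> list:
    """Run the query; needs boto3 and credentials with Cost Explorer access."""
    import boto3  # imported lazily so the query builder has no AWS dependency

    ce = boto3.client("ce")
    response = ce.get_cost_and_usage(**lambda_cost_query(days))
    return response["ResultsByTime"]
```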

SAA-C03 Exam Focus Points

  1. Memory-CPU relationship: "Memory increase → CPU proportionally increases"
  2. Cost optimization: "Power Tuning to find optimal memory"
  3. ARM migration: "Graviton2 for 20% cost reduction"
  4. Pricing structure: "Requests + (Memory × Execution time)"
  5. Cold start: "Provisioned Concurrency = extra cost, SnapStart = free"

Exam Tip

Sample Exam Question: "How to reduce Lambda function costs while maintaining performance?" → Answer: Switch to ARM (Graviton2) architecture (20% cheaper + better performance)

Frequently Asked Questions (FAQ)

Q: Is setting Lambda memory to minimum (128MB) always cheapest?

No. For CPU-bound tasks, increasing memory reduces execution time, keeping total cost the same or lower. Use Power Tuning to find the optimal value.

Q: When should I use Provisioned Concurrency?

Use it for latency-sensitive APIs where cold starts are unacceptable. It incurs additional costs, so try SnapStart (for Java functions) or package optimization first.

Q: Which is cheaper, Lambda or EC2?

Lambda is cheaper for short execution times and irregular traffic. EC2 (Reserved) is cheaper for 24/7 continuous execution and high traffic.

Q: How much is Lambda Free Tier?

1 million requests + 400,000 GB-seconds of compute are free monthly. This is provided permanently (no 12-month limit).
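
As a quick illustration of how the free tier offsets a bill, the sketch below subtracts the monthly allowances before applying the x86 prices used earlier in this article. The function name is hypothetical, and the calculation ignores tiered discounts.

```python
FREE_REQUESTS = 1_000_000
FREE_GB_SECONDS = 400_000
REQUEST_PRICE = 0.20 / 1_000_000
GB_SECOND_PRICE = 0.0000166667


def billable_cost(requests: int, gb_seconds: float) -> float:
    """Monthly charge in USD after subtracting the free tier allowances."""
    billable_requests = max(requests - FREE_REQUESTS, 0)
    billable_gb_seconds = max(gb_seconds - FREE_GB_SECONDS, 0)
    return billable_requests * REQUEST_PRICE + billable_gb_seconds * GB_SECOND_PRICE


# 1 million requests at 256MB and 200ms use 50,000 GB-seconds: fully covered.
print(billable_cost(1_000_000, 50_000))
```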

Q: Does ARM migration require code changes?

Code written in interpreted languages such as Python and Node.js mostly works without changes. Layers and dependencies that contain native binaries need to be rebuilt for ARM.
