
EC2 User Data and Metadata: Essential Instance Automation

Master EC2 User Data for bootstrap automation and Instance Metadata for retrieving instance information. Essential concepts for SAA-C03 exam preparation.

PHILOLAMB - Updated: January 31, 2026
Tags: EC2, User Data, Metadata, Bootstrap, Automation

Related Exam Domains

  • Domain 2: Design Resilient Architectures
  • Domain 3: Design High-Performing Architectures

Key Takeaway

User Data is a script that runs automatically when an instance starts, and Metadata is how an instance retrieves information about itself. Both are accessed from within the instance only via the link-local address 169.254.169.254.

Exam Tip

Exam Essential: User Data runs only on first boot by default. Metadata provides IAM role temporary credentials, eliminating the need to hardcode credentials in EC2. IMDSv2 (token-based) is recommended for security.

Aspect    | User Data                        | Instance Metadata
----------|----------------------------------|-----------------------------------
Purpose   | Run scripts at boot              | Query instance information
Execution | First boot only (default)        | Query anytime
Max Size  | 16KB                             | -
Access    | HTTP from within instance        | HTTP from within instance
Endpoint  | 169.254.169.254/latest/user-data | 169.254.169.254/latest/meta-data/

What is User Data?

Concept

User Data is a script that runs automatically when an EC2 instance starts. This is called bootstrapping.

User Data Use Cases:
├── Software installation (web server, agents, etc.)
├── Package updates
├── File downloads (config files from S3)
├── Service start/enable
├── Environment variable configuration
└── CloudWatch Agent setup

Basic User Data Format

#!/bin/bash
# Shell-script User Data must start with #! (the shebang line)

# Update packages
yum update -y

# Install and start web server
yum install -y httpd
systemctl start httpd
systemctl enable httpd

# Create web page
echo "<h1>Hello from EC2</h1>" > /var/www/html/index.html

Key User Data Characteristics

Characteristic       | Description
---------------------|-----------------------------------------------
Execution Privileges | Runs as root
Execution Time       | First boot only (default)
Max Size             | 16KB (gzip compression available)
Encoding             | Base64 encoded (console handles automatically)
Log Location         | /var/log/cloud-init-output.log
AMI Inclusion        | User Data is NOT included in AMI

Exam Tip

Exam Point: User Data is NOT included in AMI. Creating an AMI from an instance does not save that instance's User Data to the new AMI.
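
The size and encoding rows above can be checked before launch. The following is a minimal Python sketch, assuming the 16KB limit is measured on the raw script before Base64 encoding; the function name encode_user_data and the sample script are made up for illustration. It is only useful when a tool expects pre-encoded User Data, since the console encodes automatically.

```python
import base64

# 16 KB limit on raw user data, measured before Base64 encoding.
MAX_USER_DATA_BYTES = 16 * 1024


def encode_user_data(script: str) -> str:
    """Return the Base64 text form of a user data script.

    Raises ValueError if the raw script exceeds the 16 KB limit.
    """
    raw = script.encode("utf-8")
    if len(raw) > MAX_USER_DATA_BYTES:
        raise ValueError(
            f"user data is {len(raw)} bytes; limit is {MAX_USER_DATA_BYTES}"
        )
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    script = "#!/bin/bash\nyum install -y httpd\n"
    encoded = encode_user_data(script)
    print(encoded)
    # Decoding gives back the original script unchanged.
    print(base64.b64decode(encoded).decode("utf-8") == script)
```

Checking the size on the raw bytes rather than on the encoded text matters because Base64 inflates the payload by about a third.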


Writing User Data

Shell Script Method

#!/bin/bash
# Most common approach

# Log output (for debugging)
exec > >(tee /var/log/user-data.log|logger -t user-data -s 2>/dev/console) 2>&1

echo "User Data execution started: $(date)"

# Install packages
yum update -y
yum install -y docker
systemctl start docker
systemctl enable docker

# Run Docker container
docker run -d -p 80:80 nginx

echo "User Data execution completed: $(date)"

Cloud-Init Directive Method

#cloud-config
# YAML format cloud-init configuration

packages:
  - httpd
  - php

runcmd:
  - systemctl start httpd
  - systemctl enable httpd
  - echo "Hello World" > /var/www/html/index.html

write_files:
  - path: /etc/myapp/config.json
    content: |
      {
        "environment": "production",
        "debug": false
      }

Download Script from S3

#!/bin/bash
# Store large scripts in S3 and download

# AWS CLI is pre-installed on Amazon Linux
# The instance needs an IAM role that allows reading this object
aws s3 cp s3://my-bucket/setup-script.sh /tmp/setup.sh
chmod +x /tmp/setup.sh
/tmp/setup.sh

User Data Re-execution

Default Behavior

By default, User Data runs only on first boot. It does not run again on reboot.

To Run on Every Boot

#cloud-config
# Tell cloud-init to run user scripts on every boot, not only the first

cloud_final_modules:
  - [scripts-user, always]

Or directly in User Data:

#!/bin/bash

# Remove the cloud-init semaphore for user scripts so this script
# runs again on the next boot
rm -f /var/lib/cloud/instance/sem/config_scripts_user

# Script contents...

Exam Tip

Exam Trap: "Modifying User Data causes the new script to run after instance restart" → False. By default, it runs only on first boot. Additional configuration is needed for every-boot execution.
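
The cloud-config directive shown earlier only changes how often user scripts run; the script itself still has to be delivered alongside it. One common way to ship both is a MIME multi-part User Data file, which cloud-init accepts. Below is a rough Python sketch using only the standard library. The function name, file names, and script body are hypothetical placeholders, not part of any AWS tooling.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# cloud-config part: tell cloud-init to run user scripts on every boot.
CLOUD_CONFIG = """#cloud-config
cloud_final_modules:
  - [scripts-user, always]
"""

# Hypothetical script body; replace with the real bootstrap commands.
SCRIPT = """#!/bin/bash
echo "boot at $(date)" >> /var/log/every-boot.log
"""


def build_every_boot_user_data(cloud_config: str, script: str) -> str:
    """Combine a cloud-config part and a shell script into one MIME message."""
    message = MIMEMultipart()
    parts = [
        (cloud_config, "cloud-config", "cloud-config.txt"),
        (script, "x-shellscript", "userdata.txt"),
    ]
    for body, subtype, filename in parts:
        # Content types become text/cloud-config and text/x-shellscript.
        part = MIMEText(body, subtype)
        part.add_header("Content-Disposition", "attachment", filename=filename)
        message.attach(part)
    return message.as_string()


if __name__ == "__main__":
    print(build_every_boot_user_data(CLOUD_CONFIG, SCRIPT))
```

The printed text can be pasted as User Data. The combined file is still subject to the 16KB limit.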


What is Instance Metadata?

Concept

Instance Metadata is a service that allows an EC2 instance to query information about itself.

Queryable Metadata:
├── instance-id         # Instance ID
├── instance-type       # Instance type (t3.micro, etc.)
├── ami-id              # AMI ID
├── hostname            # Hostname
├── local-ipv4          # Private IP
├── public-ipv4         # Public IP
├── placement/
│   └── availability-zone  # Availability Zone
├── security-groups     # Security group names
├── iam/
│   └── security-credentials/<role-name>  # IAM role credentials
└── network/interfaces/ # Network interface info

Querying Metadata

# These plain GET requests use IMDSv1 and return 401 when IMDSv2 is required
# Query instance ID
curl http://169.254.169.254/latest/meta-data/instance-id

# Query availability zone
curl http://169.254.169.254/latest/meta-data/placement/availability-zone

# Query Public IP
curl http://169.254.169.254/latest/meta-data/public-ipv4

# Query Private IP
curl http://169.254.169.254/latest/meta-data/local-ipv4

# Query instance type
curl http://169.254.169.254/latest/meta-data/instance-type

# List all metadata categories
curl http://169.254.169.254/latest/meta-data/
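
The category listing is plain text, one entry per line, and entries that end in "/" are sub-directories. Under that assumption, the whole tree can be collected with a short recursive walk. This Python sketch takes the fetch function as a parameter (a hypothetical design choice) so it can be tried against a fake tree with made-up values; some categories, such as public-keys, list their entries differently and are not handled here.

```python
from typing import Callable, Dict


def walk_metadata(fetch: Callable[[str], str], path: str = "") -> Dict[str, str]:
    """Recursively collect metadata values below `path`.

    `fetch(path)` must return the response body for that metadata path.
    Entries ending in "/" are treated as directories and descended into.
    """
    values: Dict[str, str] = {}
    for entry in fetch(path).splitlines():
        if not entry:
            continue
        child = path + entry
        if entry.endswith("/"):
            values.update(walk_metadata(fetch, child))
        else:
            values[child] = fetch(child)
    return values


if __name__ == "__main__":
    # Stand-in for the real service so the sketch runs anywhere.
    fake = {
        "": "instance-id\nplacement/",
        "instance-id": "i-1234567890abcdef0",
        "placement/": "availability-zone",
        "placement/availability-zone": "us-east-1a",
    }
    for key, value in walk_metadata(fake.__getitem__).items():
        print(f"{key} = {value}")
```

On a real instance, fetch would be a function that requests http://169.254.169.254/latest/meta-data/ plus the given path.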

IMDSv1 vs IMDSv2

Comparison Table

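Item               | IMDSv1                                         | IMDSv2
-------------------|------------------------------------------------|------------------------------
Authentication     | None (any request from the instance is served) | Token-based
Security           | Vulnerable to SSRF attacks                     | SSRF protection
Request Method     | Simple GET request                             | PUT for token → GET for query
AWS Recommendation | Not recommended                                | Recommended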

Using IMDSv2

# 1. Get token (PUT request, set TTL)
TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")

# 2. Query metadata with token
curl -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/instance-id
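
The same two-step flow can be written in Python with only the standard library. This is a sketch, not an official client: the helper names are made up, and in practice the AWS SDKs handle the token exchange on their own. The request builders are kept separate from the network call so the headers can be inspected off-instance.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254/latest"


def build_token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Step 1: PUT request that asks IMDSv2 for a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(path: str, token: str) -> urllib.request.Request:
    """Step 2: GET request that presents the token with the metadata query."""
    return urllib.request.Request(
        f"{IMDS_BASE}/meta-data/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def get_metadata(path: str, timeout: float = 2.0) -> str:
    """Fetch one metadata value via IMDSv2. Only works on an EC2 instance."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        token = resp.read().decode("utf-8")
    request = build_metadata_request(path, token)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# On an instance: print(get_metadata("instance-id"))
```

The short timeout keeps the call from hanging when the code runs somewhere the metadata endpoint does not exist.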

Enforcing IMDSv2

# Require IMDSv2 via AWS CLI
aws ec2 modify-instance-metadata-options \
  --instance-id i-1234567890abcdef0 \
  --http-tokens required \
  --http-endpoint enabled

Exam Tip

Exam Essential: IMDSv2 is recommended for security. Setting --http-tokens required disables IMDSv1 and allows only IMDSv2. This defends against SSRF (Server-Side Request Forgery) attacks.


IAM Roles and Metadata

Accessing AWS Services from EC2

EC2 instances need credentials to access AWS services like S3 and DynamoDB.

Credential Methods (in order of preference):
1. ✅ IAM Role (Instance Profile) - Recommended
2. ❌ Access Key in environment variables - Not recommended
3. ❌ Hardcoded Access Key in code - Never do this

Querying IAM Role Credentials

# Query IAM role name
curl http://169.254.169.254/latest/meta-data/iam/security-credentials/

# Query temporary credentials
curl http://169.254.169.254/latest/meta-data/iam/security-credentials/MyEC2Role

# Response example:
{
  "Code": "Success",
  "AccessKeyId": "ASIA...",
  "SecretAccessKey": "xxx...",
  "Token": "xxx...",
  "Expiration": "2026-01-26T12:00:00Z"
}
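
Because the credentials are temporary, the Expiration field is the part worth watching. The following Python sketch parses a response of the shape shown above and reports how long the credentials remain valid. The function name is hypothetical and the sample values are the same placeholders as in the example, not real keys.

```python
import json
from datetime import datetime, timezone


def seconds_until_expiry(response_body: str, now: datetime) -> float:
    """Return how many seconds remain before the credentials expire."""
    document = json.loads(response_body)
    if document.get("Code") != "Success":
        raise ValueError(f"unexpected credential status: {document.get('Code')}")
    expiration = datetime.strptime(
        document["Expiration"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return (expiration - now).total_seconds()


if __name__ == "__main__":
    # Placeholder values in the same shape as the response example.
    sample = json.dumps({
        "Code": "Success",
        "AccessKeyId": "ASIA...",
        "SecretAccessKey": "xxx...",
        "Token": "xxx...",
        "Expiration": "2026-01-26T12:00:00Z",
    })
    now = datetime(2026, 1, 26, 11, 0, 0, tzinfo=timezone.utc)
    print(seconds_until_expiry(sample, now))  # 3600.0
```

Application code rarely needs this, since the SDKs refresh credentials before they expire, but it helps when debugging an "expired token" error by hand.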

AWS SDK Automatic Usage

# Using AWS SDK (boto3) in Python
# If IAM role is attached, credentials are obtained automatically

import boto3

# No need to specify credentials - auto-retrieved from metadata
s3 = boto3.client('s3')
s3.list_buckets()

Exam Tip

Exam Point: AWS SDK automatically retrieves temporary credentials from metadata when using IAM roles. Don't hardcode Access Keys in your code!


Practical User Data Examples

1. Web Server + CloudWatch Agent

#!/bin/bash
# Install web server and configure CloudWatch monitoring

# Basic configuration
REGION=$(curl -s http://169.254.169.254/latest/meta-data/placement/region)
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)

# Install web server
yum update -y
yum install -y httpd amazon-cloudwatch-agent

# Create web page
cat > /var/www/html/index.html << EOF
<h1>Hello from $INSTANCE_ID</h1>
<p>Region: $REGION</p>
EOF

# Start web server
systemctl start httpd
systemctl enable httpd

# Download the CloudWatch Agent config, then load it and start the agent
aws s3 cp s3://my-config-bucket/cloudwatch-config.json /opt/aws/amazon-cloudwatch-agent/etc/
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
  -a fetch-config \
  -m ec2 \
  -c file:/opt/aws/amazon-cloudwatch-agent/etc/cloudwatch-config.json \
  -s

2. ECS Container Instance Registration

#!/bin/bash
# Register container instance to ECS cluster

echo ECS_CLUSTER=my-ecs-cluster >> /etc/ecs/ecs.config
echo ECS_ENABLE_CONTAINER_METADATA=true >> /etc/ecs/ecs.config

3. Dynamic Configuration in Auto Scaling

#!/bin/bash
# Apply dynamic configuration in Auto Scaling group

# Query instance information
AZ=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)
INSTANCE_TYPE=$(curl -s http://169.254.169.254/latest/meta-data/instance-type)

# Apply AZ-specific configuration
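case "$AZ" in
  *a) DB_HOST="db-primary.example.com" ;;
  *b) DB_HOST="db-replica-1.example.com" ;;
  *c) DB_HOST="db-replica-2.example.com" ;;
  # Fallback for any other AZ (assumed default; adjust as needed)
  *)  DB_HOST="db-primary.example.com" ;;
esac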

# Set as environment variable
echo "export DB_HOST=$DB_HOST" >> /etc/environment

Troubleshooting

Verifying User Data Execution

# Check cloud-init log
cat /var/log/cloud-init-output.log

# Check cloud-init status
cloud-init status

# View User Data contents
curl http://169.254.169.254/latest/user-data

Common Issues

Problem               | Cause                         | Solution
----------------------|-------------------------------|----------------------------
Script not running    | Missing shebang (#!/bin/bash) | Add shebang on first line
Permission error      | File/directory permissions    | Check chmod, chown
Package install fails | No network connection         | Verify NAT Gateway/IGW
S3 access fails       | IAM role not attached         | Check Instance Profile
Metadata query fails  | IMDS disabled                 | Set --http-endpoint enabled
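
The "missing shebang" case is easy to catch before launch, because cloud-init decides how to handle User Data from its first line. Here is a small Python sketch of that check. It covers only a few of the formats cloud-init recognizes, and the function name and labels are made up for illustration.

```python
# First-line markers cloud-init uses to decide how to handle user data.
# Longer prefixes come first so "#cloud-config-archive" is not mistaken
# for "#cloud-config".
KNOWN_PREFIXES = [
    ("#cloud-config-archive", "cloud-config-archive"),
    ("#cloud-config", "cloud-config"),
    ("#include", "include"),
    ("#!", "shell-script"),
]


def classify_user_data(user_data: str) -> str:
    """Return the format implied by the first line, or "unknown"."""
    lines = user_data.splitlines()
    first_line = lines[0] if lines else ""
    for prefix, label in KNOWN_PREFIXES:
        if first_line.startswith(prefix):
            return label
    return "unknown"


if __name__ == "__main__":
    print(classify_user_data("#!/bin/bash\nyum update -y\n"))    # shell-script
    print(classify_user_data("\n#!/bin/bash\nyum update -y\n"))  # unknown
```

The second call shows the typical mistake: a blank line before the shebang means the first line no longer identifies the content as a script.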

SAA-C03 Exam Focus Points

Common Question Types

Type             | Key Point
-----------------|---------------------------------------------------
Execution Timing | User Data runs only on first boot (default)
IAM Credentials  | Temporary credentials auto-retrieved from metadata
Security         | IMDSv2 recommended, defends against SSRF
Size Limit       | User Data max 16KB
AMI Inclusion    | User Data is NOT included in AMI
Log Location     | /var/log/cloud-init-output.log

Common Wrong Answer Traps

❌ Modifying User Data runs new script on reboot
   → By default, runs only on first boot

❌ IMDSv1 and IMDSv2 have same security level
   → IMDSv2 is more secure with token-based auth

❌ Metadata can be queried from outside the instance
   → 169.254.169.254 is only accessible from within instance

❌ IAM role credentials are permanent
   → Temporary credentials, automatically rotated

❌ User Data is included in AMI
   → NOT included

FAQ

Q1: What's the difference between User Data and Launch Template?

User Data is a script that runs when an instance starts. Launch Template is a template of instance settings (AMI, instance type, security groups, User Data, etc.). Launch Templates can include User Data.

Q2: Does the instance fail to start if the User Data script fails?

No. The instance starts normally even if the User Data script fails. Check /var/log/cloud-init-output.log for script errors.

Q3: Can I completely disable metadata?

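Yes, use --http-endpoint disabled to completely disable IMDS. However, the instance can then no longer obtain IAM role credentials from metadata, so the AWS CLI and SDKs running on it cannot pick up the role automatically.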

Q4: Why is 169.254.169.254 used?

It's a link-local address: 169.254.0.0/16 is the IPv4 range reserved for link-local use. Routers do not forward packets for this range, so the endpoint cannot be reached from outside the instance. AWS uses this fixed address for the metadata service.
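
The link-local property can be confirmed with Python's standard ipaddress module. A tiny sketch, where the variable names are arbitrary:

```python
import ipaddress

imds_address = ipaddress.ip_address("169.254.169.254")
link_local_range = ipaddress.ip_network("169.254.0.0/16")

print(imds_address.is_link_local)        # True
print(imds_address in link_local_range)  # True
```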

Q5: What if User Data exceeds 16KB?

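Compress the script with gzip (cloud-init decompresses gzip content automatically), or store the script in S3 and download it from a short User Data script. cloud-init also accepts #include, a list of URLs whose content it fetches and processes, and #cloud-config-archive, which bundles several parts into one file. Neither of those is a compression format; only gzip reduces the size.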
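
The gzip option can be tried out locally before launching anything. This is a minimal Python sketch, assuming the 16KB limit applies to the compressed bytes that are actually submitted; the function name and the oversized sample script are made up.

```python
import gzip

MAX_USER_DATA_BYTES = 16 * 1024  # limit on the raw (pre-Base64) payload


def compress_user_data(script: str) -> bytes:
    """Gzip a user data script and verify the result fits the 16 KB limit."""
    compressed = gzip.compress(script.encode("utf-8"))
    if len(compressed) > MAX_USER_DATA_BYTES:
        raise ValueError(
            f"compressed user data is {len(compressed)} bytes; "
            "store the script in S3 instead"
        )
    return compressed


if __name__ == "__main__":
    # Made-up oversized script: a repetitive body of roughly 40 KB.
    big_script = "#!/bin/bash\n" + "echo 'configuring service'\n" * 1500
    payload = compress_user_data(big_script)
    print(len(big_script.encode("utf-8")) > MAX_USER_DATA_BYTES)  # True
    print(len(payload) <= MAX_USER_DATA_BYTES)                    # True
```

Real scripts compress less dramatically than this repetitive sample, so when the compressed size is still close to the limit, the S3 download approach is the safer choice.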


Summary

User Data and Metadata are essential for EC2 automation:

  1. User Data: Auto-run scripts at boot, 16KB limit, first boot only
  2. Metadata: Query instance info, obtain IAM credentials
  3. IMDSv2: Token-based authentication, recommended for security
  4. IAM Roles: Use temporary credentials instead of Access Keys
