AWS Storage Gateway: Connecting On-Premises to Cloud with Hybrid Storage
File Gateway vs Volume Gateway vs Tape Gateway differences, DataSync comparison, and SAA-C03 exam essentials explained.
Related Exam Domains
- Domain 2: Design Resilient Architectures
- Domain 3: Design High-Performing Architectures
Key Takeaway
AWS Storage Gateway is a hybrid storage service that connects on-premises environments to AWS cloud storage. File Gateway provides NFS/SMB access to S3, Volume Gateway offers iSCSI block storage, and Tape Gateway delivers a virtual tape library.
Exam Tip
Exam Essential: "On-premises NFS/SMB → S3" → File Gateway, "On-premises block storage backup" → Volume Gateway, "Replace physical tape backup" → Tape Gateway
When Should You Use Storage Gateway?
Best For
Storage Gateway Recommended Scenarios:
├── Expand on-premises storage capacity
│ └── Local cache + virtually unlimited cloud storage
├── Hybrid cloud architecture
│ └── On-premises apps accessing AWS storage
├── Gradual cloud migration
│ └── Minimal application changes
├── Backup and disaster recovery
│ └── Back up on-premises data to AWS
└── Tape backup modernization
└── Physical tapes → Virtual tapes (S3/Glacier)
Not Ideal For
Cases Where Storage Gateway Isn't the Best Fit:
├── Large-scale one-time migration
│ → Use AWS DataSync or Snow Family
├── Pure cloud workloads
│ → Use S3, EFS, EBS directly
├── B2B file transfers (SFTP/FTP)
│ → Use AWS Transfer Family
└── Scheduled or recurring data synchronization between storage systems
→ Use AWS DataSync
Storage Gateway Types
Three Gateway Types Compared
┌─────────────────────────────────────────────────────────────┐
│ AWS Storage Gateway │
├──────────────────┬──────────────────────────────────────────┤
│ │ │
│ S3 File Gateway │ NFS/SMB → Amazon S3 │
│ ─────────────── │ • Files stored as S3 objects │
│ │ • Local cache for low-latency access │
│ │ • S3 Lifecycle policies supported │
│ │ │
├──────────────────┼──────────────────────────────────────────┤
│ │ │
│ FSx File Gateway│ SMB → Amazon FSx for Windows │
│ ────────────────│ • Extends Windows file servers │
│ │ • Active Directory integration │
│ │ │
├──────────────────┼──────────────────────────────────────────┤
│ │ │
│ Volume Gateway │ iSCSI → Amazon S3 + EBS Snapshots │
│ ────────────── │ • Cached: S3 storage + local cache │
│ │ • Stored: Local storage + S3 backup │
│ │ │
├──────────────────┼──────────────────────────────────────────┤
│ │ │
│ Tape Gateway │ iSCSI VTL → S3 + Glacier │
│ ──────────── │ • Virtual Tape Library │
│ │ • Compatible with backup software │
│ │ │
└──────────────────┴──────────────────────────────────────────┘
S3 File Gateway
Architecture
┌─────────────────────────────────────────────────────────────┐
│ S3 File Gateway │
├─────────────────────────────────────────────────────────────┤
│ │
│ [On-Premises Servers] │
│ (NFS/SMB Clients) │
│ │ │
│ ▼ │
│ ┌──────────────────────┐ │
│ │ Storage Gateway │ ← On-premises or EC2 │
│ │ (File Gateway) │ │
│ │ ┌────────────────┐ │ │
│ │ │ Local Cache │ │ ← Cache for frequently accessed data │
│ │ └────────────────┘ │ │
│ └──────────┬───────────┘ │
│ │ HTTPS │
│ ▼ │
│ ┌──────────────────────┐ │
│ │ Amazon S3 │ ← Files stored as S3 objects │
│ │ (Primary Storage) │ │
│ └──────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
Features
| Feature | Details |
|---|---|
| Protocols | NFS v3/v4.1, SMB v2/v3 |
| Storage | Amazon S3 (Standard, Standard-IA, One Zone-IA, Intelligent-Tiering; Glacier classes via S3 Lifecycle) |
| Max File Size | 5 TB (the S3 object size limit) |
| Local Cache | Caches frequently accessed data |
| Integration | S3 Lifecycle, S3 Object Lock, IAM |
Use Cases
S3 File Gateway Use Cases:
├── File server capacity expansion
│ └── When on-premises storage runs out
├── Data lake ingestion
│ └── On-premises data → S3 → Athena/EMR analysis
├── Cloud migration preparation
│ └── Move data without application changes
└── Backup and archiving
└── S3 Lifecycle auto-transitions to Glacier
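The archiving use case above relies on an S3 Lifecycle rule on the bucket behind the file share. As a minimal sketch, the helper below builds such a rule as a plain dictionary in the shape that boto3's `put_bucket_lifecycle_configuration` accepts. The function name, rule ID, prefix, and day count are illustrative assumptions, not values from this guide.

```python
def glacier_transition_rule(prefix: str, days: int) -> dict:
    """Build a lifecycle configuration that moves objects under `prefix`
    to the GLACIER storage class once they are `days` old."""
    if days < 0:
        raise ValueError("days must be non-negative")
    return {
        "Rules": [
            {
                "ID": f"to-glacier-after-{days}d",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-file-gateway-bucket",
#       LifecycleConfiguration=glacier_transition_rule("archive/", 90),
#   )
```

One caveat worth keeping in mind: objects that have moved to an archive class generally need to be restored in S3 before they can be read through the gateway again, so the threshold should sit comfortably past the point where files stop being opened.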
Exam Tip
Exam Keywords: "Extend on-premises NFS file server to S3", "Store files as S3 objects", "Apply Lifecycle policies" → S3 File Gateway
Volume Gateway
Cached vs Stored Modes
┌─────────────────────────────────────────────────────────────┐
│ Volume Gateway │
├────────────────────────────┬────────────────────────────────┤
│ Cached Volume │ Stored Volume │
├────────────────────────────┼────────────────────────────────┤
│ │ │
│ [On-Premises] │ [On-Premises] │
│ │ │ │ │
│ ▼ │ ▼ │
│ ┌────────────┐ │ ┌────────────┐ │
│ │Local Cache │ (Hot │ │Full Dataset│ (All data) │
│ │ (Hot Data) │ data) │ │ (Primary) │ │
│ └────────────┘ │ └────────────┘ │
│ │ │ │ │
│ ▼ │ ▼ │
│ ┌────────────┐ │ ┌────────────┐ │
│ │ Amazon S3 │ (Full │ │ Amazon S3 │ (EBS snapshot │
│ │ (Primary) │ dataset) │ │ (Backup) │ backup) │
│ └────────────┘ │ └────────────┘ │
│ │ │
│ Capacity: 32TB/volume │ Capacity: 16TB/volume │
│ 1PB per gateway │ 512TB per gateway │
│ │ │
└────────────────────────────┴────────────────────────────────┘
Comparison Table
| Aspect | Cached Volume | Stored Volume |
|---|---|---|
| Data Location | S3 (primary), Local (cache) | Local (primary), S3 (backup) |
| Latency | Low on cache hit | Always low |
| Capacity | Up to 32TB per volume | Up to 16TB per volume |
| Use Case | Large data, subset frequently accessed | Full dataset needs local access |
| DR | Restore from S3 to EBS | Restore from snapshots to EBS |
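The per-gateway totals follow from the per-volume limits if you assume a cap of 32 volumes per gateway (a figure taken from the AWS quotas, not stated in the table above): 32 × 32 TB = 1 PB for cached volumes and 32 × 16 TB = 512 TB for stored volumes. The sketch below checks whether a planned set of volumes fits on one gateway; the function and constant names are illustrative.

```python
MAX_VOLUMES_PER_GATEWAY = 32  # assumption: AWS quota for both modes

# Limits in TB, matching the comparison above.
LIMITS = {
    "cached": {"max_volume_tb": 32, "max_gateway_tb": 32 * 32},  # 1 PB
    "stored": {"max_volume_tb": 16, "max_gateway_tb": 32 * 16},  # 512 TB
}


def fits_on_one_gateway(mode: str, volume_sizes_tb: list) -> bool:
    """Return True if every volume and the total stay within the mode's limits."""
    limits = LIMITS[mode]
    return (
        len(volume_sizes_tb) <= MAX_VOLUMES_PER_GATEWAY
        and all(0 < size <= limits["max_volume_tb"] for size in volume_sizes_tb)
        and sum(volume_sizes_tb) <= limits["max_gateway_tb"]
    )
```

For example, a single 20 TB volume fits in cached mode but not in stored mode, which is one reason large datasets with a small hot subset point toward Cached Volume.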
Exam Tip
Cached vs Stored Selection Criteria:
- "Only subset of data frequently accessed" → Cached Volume
- "All data needs low-latency access" → Stored Volume
- "Minimize on-premises storage costs" → Cached Volume (only the cache is kept locally)
Tape Gateway
Architecture
┌─────────────────────────────────────────────────────────────┐
│ Tape Gateway │
├─────────────────────────────────────────────────────────────┤
│ │
│ [Backup Applications] │
│ (Veeam, Veritas, Commvault, etc.) │
│ │ │
│ ▼ iSCSI │
│ ┌──────────────────────┐ │
│ │ Storage Gateway │ │
│ │ (Tape Gateway) │ │
│ │ ┌────────────────┐ │ │
│ │ │ Virtual Tape │ │ ← Virtual Tape Library │
│ │ │ Library (VTL) │ │ │
│ │ └────────────────┘ │ │
│ └──────────┬───────────┘ │
│ │ │
│ ┌──────┴──────┐ │
│ ▼ ▼ │
│ ┌────────┐ ┌────────────┐ │
│ │ S3 │ │ Glacier/ │ │
│ │(Virtual│ │Deep Archive│ │
│ │ Tapes) │ │ (Archived) │ │
│ └────────┘ └────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
Features
| Feature | Details |
|---|---|
| Protocol | iSCSI (Virtual Tape Library) |
| Virtual Tape Capacity | 100GB to 15TB per tape |
| Max Tapes | 1,500 per gateway (1 PB total) |
| Archive Storage | S3 Glacier, S3 Glacier Deep Archive |
| Compatibility | Veeam, Veritas, Commvault, etc. |
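The limits in the table can be turned into a quick planning check. The sketch below estimates how many virtual tapes a backup set needs and rejects plans that exceed the limits above. It treats GB/TB as binary units (GiB/TiB), which is an assumption, and the function name is illustrative.

```python
import math

MIN_TAPE_GIB = 100               # smallest virtual tape
MAX_TAPE_GIB = 15 * 1024         # largest virtual tape (15 TiB)
MAX_TAPES = 1500                 # tapes per gateway
MAX_LIBRARY_GIB = 1024 * 1024    # 1 PiB total per gateway


def tapes_needed(backup_gib: float, tape_gib: int) -> int:
    """Return the number of tapes of size `tape_gib` needed for `backup_gib`."""
    if not MIN_TAPE_GIB <= tape_gib <= MAX_TAPE_GIB:
        raise ValueError("tape size is outside the supported range")
    count = math.ceil(backup_gib / tape_gib)
    if count > MAX_TAPES or count * tape_gib > MAX_LIBRARY_GIB:
        raise ValueError("plan exceeds the per-gateway tape limits")
    return count
```

A 10,000 GiB backup written to 2,560 GiB tapes needs four tapes, while a plan that would need 2,000 of the smallest tapes is rejected because it exceeds the 1,500-tape limit.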
Exam Tip
Tape Gateway Exam Points: "Maintain existing tape backup infrastructure", "No backup software changes", "Reduce physical tape costs" → Tape Gateway
Storage Gateway vs DataSync vs Transfer Family
Comparison Table
| Aspect | Storage Gateway | DataSync | Transfer Family |
|---|---|---|---|
| Purpose | Hybrid storage access | Data movement/migration | B2B file transfers |
| Protocols | NFS, SMB, iSCSI | NFS, SMB, S3, EFS | SFTP, FTPS, FTP |
| Data Flow | Continuous access (bidirectional) | One-time/scheduled transfers | File upload/download |
| Caching | Local cache supported | None | None |
| Use Case | Hybrid apps, DR | Migration, sync | Partner file exchange |
| Destinations | S3, FSx, EBS | S3, EFS, FSx | S3, EFS |
Decision Flow
Need to move data between on-premises and AWS?
│
▼
Need continuous hybrid access?
│
Yes → What type: file/block/tape?
│ │
│ File (NFS/SMB) → [S3 File Gateway]
│ Block (iSCSI) → [Volume Gateway]
│ Tape (VTL) → [Tape Gateway]
│
No
│
▼
Large-scale migration or scheduled sync?
│
Yes → [AWS DataSync]
│
No
│
▼
SFTP/FTP file exchange with external partners?
│
Yes → [AWS Transfer Family]
│
No → [Use S3 directly or Snow Family]
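The decision flow above can be sketched as a small function, which may help when drilling exam scenarios. The `Requirement` class and its field names are illustrative assumptions; the branching order mirrors the flow.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    continuous_hybrid_access: bool = False
    access_type: str = ""  # "file", "block", or "tape"
    migration_or_scheduled_sync: bool = False
    partner_sftp_exchange: bool = False


def recommend_service(req: Requirement) -> str:
    """Walk the decision flow top to bottom and return the matching service."""
    if req.continuous_hybrid_access:
        gateways = {
            "file": "S3 File Gateway",   # NFS/SMB
            "block": "Volume Gateway",   # iSCSI
            "tape": "Tape Gateway",      # VTL
        }
        if req.access_type not in gateways:
            raise ValueError("access_type must be 'file', 'block', or 'tape'")
        return gateways[req.access_type]
    if req.migration_or_scheduled_sync:
        return "AWS DataSync"
    if req.partner_sftp_exchange:
        return "AWS Transfer Family"
    return "S3 directly or Snow Family"
```

For instance, `recommend_service(Requirement(continuous_hybrid_access=True, access_type="block"))` returns `"Volume Gateway"`.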
Exam Tip
Key Distinctions:
- "Continuous access + local cache" → Storage Gateway
- "One-time migration + high-speed transfer" → DataSync
- "Replace SFTP/FTP servers" → Transfer Family
Deployment Options
Gateway Hosting Locations
Storage Gateway Deployment Options:
├── On-premises virtual machine
│ ├── VMware ESXi
│ ├── Microsoft Hyper-V
│ └── Linux KVM
├── Physical hardware appliance
│ └── Pre-configured server from AWS
└── Amazon EC2 instance
└── Run gateway within AWS
Pricing Structure
Key Cost Components
| Gateway Type | Billing Items |
|---|---|
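| S3 File Gateway | S3 storage + S3 requests + data written through the gateway |
| FSx File Gateway | FSx storage + requests |
| Volume Gateway | Volume storage + EBS snapshots + data written through the gateway |
| Tape Gateway | Virtual tape storage + archived tapes (Glacier) + tape retrieval + data written through the gateway |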
Cost Optimization Tips
Cost Reduction Strategies:
├── Use S3 Lifecycle policies
│ └── Auto-transition old data → Glacier
├── Use Cached Volume mode
│ └── Reduce local storage costs
├── Set appropriate cache size
│ └── Too large = cost increase, too small = performance hit
└── Clean up unnecessary snapshots
└── Delete old EBS snapshots
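As a minimal sketch of the snapshot cleanup step, the helper below picks out snapshot IDs older than a retention window. It assumes each snapshot is a dictionary with `SnapshotId` and a timezone-aware `StartTime`, which is the shape returned by EC2's `describe_snapshots`; the function name and retention period are illustrative. It only selects candidates and deletes nothing.

```python
from datetime import datetime, timedelta, timezone


def snapshots_older_than(snapshots, days, now=None):
    """Return the IDs of snapshots whose StartTime is more than `days` old."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    return [s["SnapshotId"] for s in snapshots if s["StartTime"] < cutoff]


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   ec2 = boto3.client("ec2")
#   snaps = ec2.describe_snapshots(OwnerIds=["self"])["Snapshots"]
#   for snapshot_id in snapshots_older_than(snaps, days=90):
#       print("candidate for deletion:", snapshot_id)
```

Reviewing the candidate list before deleting is the safer design, since some old snapshots may be the only recovery point for a volume.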
SAA-C03 Exam Focus Points
Commonly Tested Scenarios
- ✅ File Server Expansion: "On-premises SMB file server running out of space" → S3 File Gateway
- ✅ Cached vs Stored: "Only subset of data frequently accessed" → Cached Volume
- ✅ Tape Modernization: "Keep existing backup software, replace tapes" → Tape Gateway
- ✅ Gateway vs DataSync: "Continuous access" → Gateway, "Migration" → DataSync
- ✅ NFS vs iSCSI: NFS/SMB → File Gateway, iSCSI → Volume Gateway
Sample Exam Questions
Exam Tip
Sample Exam Question 1: "A company runs an SMB file server on-premises. Recently created files are frequently accessed, but older files are rarely used. How can they optimize storage costs while maintaining low-latency access?"
→ Answer: S3 File Gateway + S3 Lifecycle policy (older files → Glacier)
Exam Tip
Sample Exam Question 2: "A company wants to back up to AWS using their existing backup application (Veeam) instead of physical tapes. They don't want to change their backup software. What should they use?"
→ Answer: Tape Gateway (VTL maintains existing backup workflow)
Exam Tip
Sample Exam Question 3: "An on-premises application uses iSCSI volumes. It needs low-latency access to the entire dataset and asynchronous backup to AWS. What should they use?"
→ Answer: Volume Gateway in Stored Volume mode (full dataset stays local + asynchronous snapshot backup to S3)
Frequently Asked Questions
Q: Can Storage Gateway and DataSync be used together?
Yes. A common pattern is to use DataSync for the initial bulk migration, then Storage Gateway for ongoing hybrid access to the same data.
Q: Can files stored via File Gateway be accessed via S3 API?
Yes. Files written through File Gateway are stored as S3 objects, so they can be accessed directly via the S3 API, Athena, EMR, and similar services. However, if objects are added or modified outside the gateway, you need to call the RefreshCache API so the gateway picks up the changes.
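A minimal sketch of that refresh call is shown below. It wraps the `refresh_cache` operation of a boto3 `storagegateway` client; the wrapper name, the default folder list, and the ARN in the usage comment are illustrative assumptions. The client is passed in so the function can be exercised without AWS access.

```python
def refresh_share_cache(client, file_share_arn, folders=("/",), recursive=True):
    """Ask the gateway to re-list the given folders of a file share.

    Returns the NotificationId from the response, if one is present.
    """
    response = client.refresh_cache(
        FileShareARN=file_share_arn,
        FolderList=list(folders),
        Recursive=recursive,
    )
    return response.get("NotificationId")


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   sgw = boto3.client("storagegateway")
#   refresh_share_cache(
#       sgw,
#       "arn:aws:storagegateway:us-east-1:111122223333:share/share-EXAMPLE",
#       folders=["/reports"],
#   )
```

Limiting the refresh to the folders that actually changed tends to be cheaper than refreshing the whole share.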
Q: What happens on a cache miss in Volume Gateway Cached mode?
Data is fetched from S3, resulting in increased latency. It's important to size the cache larger than your frequently accessed data.
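To see why cache sizing matters, a simple weighted-average model is enough: expected read latency is the hit ratio times the local latency plus the miss ratio times the S3 fetch latency. The sketch below implements that model; the latency figures in the example are made-up illustrations, not measured Storage Gateway numbers.

```python
def expected_read_latency_ms(hit_ratio: float, cache_ms: float, s3_ms: float) -> float:
    """Average read latency given a cache hit ratio between 0 and 1."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * s3_ms


# Illustrative numbers only: 1 ms local reads, 50 ms fetches from S3.
for ratio in (0.5, 0.9, 0.99):
    print(ratio, round(expected_read_latency_ms(ratio, 1.0, 50.0), 2))
```

Under these assumed figures, raising the hit ratio from 0.5 to 0.9 cuts the average from 25.5 ms to 5.9 ms, which is why the cache should cover the frequently accessed working set.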
Q: How long does it take to restore archived tapes in Tape Gateway?
- S3 Glacier Flexible Retrieval: typically 3-5 hours
- S3 Glacier Deep Archive: typically within 12 hours
Q: Which regions support Storage Gateway?
Storage Gateway is available in most AWS regions. The S3 bucket connected to the gateway can be in the same or a different region.