
AWS Storage Gateway: Connecting On-Premises to Cloud with Hybrid Storage

File Gateway vs Volume Gateway vs Tape Gateway differences, DataSync comparison, and SAA-C03 exam essentials explained.

PHILOLAMB-
Tags: Storage Gateway, Hybrid Cloud, File Gateway, Volume Gateway, Tape Gateway

Related Exam Domains

  • Domain 2: Design Resilient Architectures
  • Domain 3: Design High-Performing Architectures

Key Takeaway

AWS Storage Gateway is a hybrid storage service that connects on-premises environments to AWS cloud storage. File Gateway provides NFS/SMB access to S3, Volume Gateway offers iSCSI block storage, and Tape Gateway delivers a virtual tape library.

Exam Tip

Exam Essential: "On-premises NFS/SMB → S3" → File Gateway, "On-premises block storage backup" → Volume Gateway, "Replace physical tape backup" → Tape Gateway


When Should You Use Storage Gateway?

Best For

Storage Gateway Recommended Scenarios:
├── Expand on-premises storage capacity
│   └── Local cache + unlimited cloud storage
├── Hybrid cloud architecture
│   └── On-premises apps accessing AWS storage
├── Gradual cloud migration
│   └── Minimal application changes
├── Backup and disaster recovery
│   └── Back up on-premises data to AWS
└── Tape backup modernization
    └── Physical tapes → Virtual tapes (S3/Glacier)

Not Ideal For

Cases Where Storage Gateway Isn't the Best Fit:
├── Large-scale one-time migration
│   → Use AWS DataSync or Snow Family
├── Pure cloud workloads
│   → Use S3, EFS, EBS directly
├── B2B file transfers (SFTP/FTP)
│   → Use AWS Transfer Family
└── Scheduled or recurring data synchronization
    → Use AWS DataSync

Storage Gateway Types

Three Gateway Types Compared

┌─────────────────────────────────────────────────────────────┐
│                   AWS Storage Gateway                        │
├──────────────────┬──────────────────────────────────────────┤
│                  │                                          │
│  S3 File Gateway │  NFS/SMB → Amazon S3                     │
│  ───────────────  │  • Files stored as S3 objects           │
│                  │  • Local cache for low-latency access    │
│                  │  • S3 Lifecycle policies supported       │
│                  │                                          │
├──────────────────┼──────────────────────────────────────────┤
│                  │                                          │
│  FSx File Gateway│  SMB → Amazon FSx for Windows            │
│  ────────────────│  • Extends Windows file servers          │
│                  │  • Active Directory integration          │
│                  │                                          │
├──────────────────┼──────────────────────────────────────────┤
│                  │                                          │
│  Volume Gateway  │  iSCSI → Amazon S3 + EBS Snapshots       │
│  ──────────────  │  • Cached: S3 storage + local cache      │
│                  │  • Stored: Local storage + S3 backup     │
│                  │                                          │
├──────────────────┼──────────────────────────────────────────┤
│                  │                                          │
│  Tape Gateway    │  iSCSI VTL → S3 + Glacier               │
│  ────────────    │  • Virtual Tape Library                  │
│                  │  • Compatible with backup software       │
│                  │                                          │
└──────────────────┴──────────────────────────────────────────┘

S3 File Gateway

Architecture

┌─────────────────────────────────────────────────────────────┐
│                     S3 File Gateway                          │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│   [On-Premises Servers]                                      │
│   (NFS/SMB Clients)                                         │
│       │                                                      │
│       ▼                                                      │
│   ┌──────────────────────┐                                  │
│   │   Storage Gateway    │  ← On-premises or EC2            │
│   │   (File Gateway)     │                                  │
│   │   ┌────────────────┐ │                                  │
│   │   │  Local Cache   │ │  ← Cache for frequently accessed │
│   │   └────────────────┘ │                                  │
│   └──────────┬───────────┘                                  │
│              │ HTTPS                                         │
│              ▼                                               │
│   ┌──────────────────────┐                                  │
│   │     Amazon S3        │  ← Files stored as S3 objects    │
│   │  (Primary Storage)   │                                  │
│   └──────────────────────┘                                  │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Features

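Feature        Details
─────────────  ───────────────────────────────────────────────
Protocols      NFS v3/v4.1, SMB v2/v3
Storage        Amazon S3 (Standard, Standard-IA, One Zone-IA, Intelligent-Tiering; archive classes via S3 Lifecycle)
Max File Size  5TB
Local Cache    Caches frequently accessed data
Integration    S3 Lifecycle, S3 Object Lock, IAM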

Use Cases

S3 File Gateway Use Cases:
├── File server capacity expansion
│   └── When on-premises storage runs out
├── Data lake ingestion
│   └── On-premises data → S3 → Athena/EMR analysis
├── Cloud migration preparation
│   └── Move data without application changes
└── Backup and archiving
    └── S3 Lifecycle auto-transitions to Glacier
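The "S3 Lifecycle auto-transitions to Glacier" step can be sketched as a lifecycle configuration. This is a minimal sketch, not a recommended policy: the helper name `lifecycle_rule`, the 90-day threshold, and the rule ID format are assumptions made for illustration. The dictionary follows the shape that boto3's S3 `put_bucket_lifecycle_configuration` call accepts.

```python
def lifecycle_rule(days: int = 90, storage_class: str = "GLACIER", prefix: str = "") -> dict:
    """Build a lifecycle configuration that transitions objects after `days`.

    `days`, `storage_class`, and `prefix` are illustrative defaults,
    not values taken from the article.
    """
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": f"archive-after-{days}-days",
                "Status": "Enabled",
                # An empty prefix applies the rule to every object in the bucket.
                "Filter": {"Prefix": prefix},
                "Transitions": [{"Days": days, "StorageClass": storage_class}],
            }
        ]
    }


config = lifecycle_rule(days=90)
print(config["Rules"][0]["ID"])
```

With boto3 installed and credentials configured, the result could be passed as `LifecycleConfiguration=config` to `put_bucket_lifecycle_configuration` on an S3 client, together with the bucket behind the file share.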

Exam Tip

Exam Keywords: "Extend on-premises NFS file server to S3", "Store files as S3 objects", "Apply Lifecycle policies" → S3 File Gateway


Volume Gateway

Cached vs Stored Modes

┌─────────────────────────────────────────────────────────────┐
│                     Volume Gateway                           │
├────────────────────────────┬────────────────────────────────┤
│      Cached Volume         │       Stored Volume            │
├────────────────────────────┼────────────────────────────────┤
│                            │                                │
│   [On-Premises]            │   [On-Premises]                │
│       │                    │       │                        │
│       ▼                    │       ▼                        │
│   ┌────────────┐           │   ┌────────────┐               │
│   │Local Cache │ (Hot      │   │Full Dataset│ (All data)    │
│   │ (Hot Data) │  data)    │   │  (Primary) │               │
│   └────────────┘           │   └────────────┘               │
│       │                    │       │                        │
│       ▼                    │       ▼                        │
│   ┌────────────┐           │   ┌────────────┐               │
│   │ Amazon S3  │ (Full     │   │ Amazon S3  │ (EBS snapshot │
│   │ (Primary)  │  dataset) │   │ (Backup)   │  backup)      │
│   └────────────┘           │   └────────────┘               │
│                            │                                │
│   Capacity: 32TB/volume    │   Capacity: 16TB/volume        │
│   1PB per gateway          │   512TB per gateway            │
│                            │                                │
└────────────────────────────┴────────────────────────────────┘

Comparison Table

Aspect         Cached Volume                           Stored Volume
─────────────  ──────────────────────────────────────  ───────────────────────────────
Data Location  S3 (primary), Local (cache)             Local (primary), S3 (backup)
Latency        Low on cache hit                        Always low
Capacity       Up to 32TB per volume                   Up to 16TB per volume
Use Case       Large data, subset frequently accessed  Full dataset needs local access
DR             Restore from S3 to EBS                  Restore from snapshots to EBS
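The capacity limits above translate into a quick sizing calculation. The sketch below is a rough estimate only: `plan_volumes` is a hypothetical helper, it assumes every volume is filled to its maximum size, and it treats 1PB as 1024TB.

```python
import math

# Limits quoted in this section, in TB. 1PB is taken as 1024TB (assumption).
LIMITS_TB = {
    "cached": {"per_volume": 32, "per_gateway": 1024},
    "stored": {"per_volume": 16, "per_gateway": 512},
}


def plan_volumes(dataset_tb: float, mode: str) -> dict:
    """Return the minimum number of volumes and gateways for a dataset."""
    if mode not in LIMITS_TB:
        raise ValueError("mode must be 'cached' or 'stored'")
    if dataset_tb <= 0:
        raise ValueError("dataset_tb must be positive")
    limits = LIMITS_TB[mode]
    return {
        "volumes": math.ceil(dataset_tb / limits["per_volume"]),
        "gateways": math.ceil(dataset_tb / limits["per_gateway"]),
    }


print(plan_volumes(600, "stored"))
```

For a 600TB dataset, Stored mode exceeds the 512TB per-gateway limit and needs a second gateway, while Cached mode fits on one.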

Exam Tip

Cached vs Stored Selection Criteria:

  • "Only subset of data frequently accessed" → Cached Volume
  • "All data needs low-latency access" → Stored Volume
  • "Minimize on-premises storage costs" → Cached Volume (only the cache is kept locally)

Tape Gateway

Architecture

┌─────────────────────────────────────────────────────────────┐
│                       Tape Gateway                           │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│   [Backup Applications]                                      │
│   (Veeam, Veritas, Commvault, etc.)                         │
│       │                                                      │
│       ▼ iSCSI                                               │
│   ┌──────────────────────┐                                  │
│   │   Storage Gateway    │                                  │
│   │   (Tape Gateway)     │                                  │
│   │   ┌────────────────┐ │                                  │
│   │   │ Virtual Tape   │ │  ← Virtual Tape Library          │
│   │   │ Library (VTL)  │ │                                  │
│   │   └────────────────┘ │                                  │
│   └──────────┬───────────┘                                  │
│              │                                               │
│       ┌──────┴──────┐                                       │
│       ▼             ▼                                       │
│   ┌────────┐   ┌────────────┐                              │
│   │   S3   │   │  Glacier/  │                              │
│   │(Virtual│   │Deep Archive│                              │
│   │ Tapes) │   │ (Archived) │                              │
│   └────────┘   └────────────┘                              │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Features

Feature                Details
─────────────────────  ───────────────────────────────────
Protocol               iSCSI (Virtual Tape Library)
Virtual Tape Capacity  100GB to 15TB per tape
Max Tapes              1,500 (total 1PB)
Archive Storage        S3 Glacier, S3 Glacier Deep Archive
Compatibility          Veeam, Veritas, Commvault, etc.
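The tape limits above can be checked before provisioning. This is a minimal sketch under stated assumptions: `tapes_needed` is a hypothetical helper, 1TB is treated as 1024GB, and compression or deduplication by the backup software is ignored.

```python
import math

MAX_TAPES = 1500              # per gateway, from the table above
MIN_TAPE_GB = 100
MAX_TAPE_GB = 15 * 1024       # 15TB, assuming 1TB = 1024GB
MAX_TOTAL_GB = 1024 * 1024    # 1PB, assuming 1PB = 1024TB


def tapes_needed(backup_gb: float, tape_gb: int) -> int:
    """Return how many virtual tapes of size `tape_gb` hold `backup_gb`."""
    if not MIN_TAPE_GB <= tape_gb <= MAX_TAPE_GB:
        raise ValueError("tape size must be between 100GB and 15TB")
    if backup_gb > MAX_TOTAL_GB:
        raise ValueError("backup exceeds 1PB; split it across gateways")
    count = math.ceil(backup_gb / tape_gb)
    if count > MAX_TAPES:
        raise ValueError("more than 1,500 tapes; choose a larger tape size")
    return count


print(tapes_needed(50_000, 2_500))
```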

Exam Tip

Tape Gateway Exam Points: "Maintain existing tape backup infrastructure", "No backup software changes", "Reduce physical tape costs" → Tape Gateway


Storage Gateway vs DataSync vs Transfer Family

Comparison Table

Aspect        Storage Gateway                    DataSync                      Transfer Family
────────────  ─────────────────────────────────  ────────────────────────────  ─────────────────────
Purpose       Hybrid storage access              Data movement/migration       B2B file transfers
Protocols     NFS, SMB, iSCSI                    NFS, SMB, S3, EFS             SFTP, FTPS, FTP
Data Flow     Continuous access (bidirectional)  One-time/scheduled transfers  File upload/download
Caching       Local cache supported              None                          None
Use Case      Hybrid apps, DR                    Migration, sync               Partner file exchange
Destinations  S3, FSx, EBS                       S3, EFS, FSx                  S3, EFS

Decision Flow

Need to move data between on-premises and AWS?
        │
        ▼
Need continuous hybrid access?
        │
       Yes → What type: file/block/tape?
        │           │
        │          File (NFS/SMB) → [S3 File Gateway]
        │          Block (iSCSI) → [Volume Gateway]
        │          Tape (VTL) → [Tape Gateway]
        │
       No
        │
        ▼
Large-scale migration or scheduled sync?
        │
       Yes → [AWS DataSync]
        │
       No
        │
        ▼
SFTP/FTP file exchange with external partners?
        │
       Yes → [AWS Transfer Family]
        │
       No → [Use S3 directly or Snow Family]
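The decision flow above can be written as a small function, which makes the order of the questions explicit. The function and parameter names here are made up for illustration; the returned labels are the ones used in the flow.

```python
def choose_service(continuous_access: bool,
                   access_type: str = "",
                   migration_or_scheduled_sync: bool = False,
                   partner_sftp_ftp: bool = False) -> str:
    """Walk the decision flow top to bottom and return the suggested service."""
    gateways = {
        "file": "S3 File Gateway",   # NFS/SMB
        "block": "Volume Gateway",   # iSCSI
        "tape": "Tape Gateway",      # VTL
    }
    if continuous_access:
        if access_type not in gateways:
            raise ValueError("access_type must be 'file', 'block', or 'tape'")
        return gateways[access_type]
    if migration_or_scheduled_sync:
        return "AWS DataSync"
    if partner_sftp_ftp:
        return "AWS Transfer Family"
    return "Use S3 directly or Snow Family"


print(choose_service(continuous_access=True, access_type="block"))
```

Continuous hybrid access is checked first, so a workload that needs both ongoing access and a migration resolves to a gateway type, matching the flow.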

Exam Tip

Key Distinctions:

  • "Continuous access + local cache" → Storage Gateway
  • "One-time migration + high-speed transfer" → DataSync
  • "Replace SFTP/FTP servers" → Transfer Family

Deployment Options

Gateway Hosting Locations

Storage Gateway Deployment Options:
├── On-premises virtual machine
│   ├── VMware ESXi
│   ├── Microsoft Hyper-V
│   └── Linux KVM
├── Physical hardware appliance
│   └── Pre-configured server from AWS
└── Amazon EC2 instance
    └── Run gateway within AWS

Pricing Structure

Key Cost Components

Gateway Type      Billing Items
────────────────  ──────────────────────────────
S3 File Gateway   S3 requests + S3 storage
FSx File Gateway  FSx storage + requests
Volume Gateway    S3 storage + snapshots
Tape Gateway      Virtual tape storage + Glacier

Cost Optimization Tips

Cost Reduction Strategies:
├── Use S3 Lifecycle policies
│   └── Auto-transition old data → Glacier
├── Use Cached Volume mode
│   └── Reduce local storage costs
├── Set appropriate cache size
│   └── Too large = cost increase, too small = performance hit
└── Clean up unnecessary snapshots
    └── Delete old EBS snapshots

SAA-C03 Exam Focus Points

Commonly Tested Scenarios

  1. File Server Expansion: "On-premises SMB file server running out of space" → S3 File Gateway
  2. Cached vs Stored: "Only subset of data frequently accessed" → Cached Volume
  3. Tape Modernization: "Keep existing backup software, replace tapes" → Tape Gateway
  4. Gateway vs DataSync: "Continuous access" → Gateway, "Migration" → DataSync
  5. NFS vs iSCSI: NFS/SMB → File Gateway, iSCSI block volumes → Volume Gateway (iSCSI VTL → Tape Gateway)

Sample Exam Questions

Exam Tip

Sample Exam Question 1: "A company runs an SMB file server on-premises. Recently created files are frequently accessed, but older files are rarely used. How can they optimize storage costs while maintaining low-latency access?"

→ Answer: S3 File Gateway + S3 Lifecycle policy (older files → Glacier)

Exam Tip

Sample Exam Question 2: "A company wants to back up to AWS using their existing backup application (Veeam) instead of physical tapes. They don't want to change their backup software. What should they use?"

→ Answer: Tape Gateway (VTL maintains existing backup workflow)

Exam Tip

Sample Exam Question 3: "An on-premises application uses iSCSI volumes. It needs low-latency access to the entire dataset and asynchronous backup to AWS."

→ Answer: Volume Gateway (Stored Volume) (local storage + S3 backup)


Frequently Asked Questions

Q: Can Storage Gateway and DataSync be used together?

Yes. A common pattern is to use DataSync for initial fast migration, then Storage Gateway for ongoing hybrid access.

Q: Can files stored via File Gateway be accessed via S3 API?

Yes. Files stored through File Gateway are stored as S3 objects, so they can be accessed directly via S3 API, Athena, EMR, etc. However, if objects are modified outside the gateway, you need to call the RefreshCache API.
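A minimal sketch of the RefreshCache call follows. The wrapper `refresh_share_cache` and the `DryRunClient` stand-in are hypothetical names added so the example runs without AWS access. The keyword arguments follow boto3's `refresh_cache` operation on the `storagegateway` client, and the share ARN is a placeholder.

```python
def refresh_share_cache(client, file_share_arn, folders=("/",), recursive=True):
    """Ask the gateway to re-list folders whose objects changed directly in S3."""
    response = client.refresh_cache(
        FileShareARN=file_share_arn,
        FolderList=list(folders),
        Recursive=recursive,
    )
    return response.get("NotificationId")


class DryRunClient:
    """Hypothetical stand-in that records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def refresh_cache(self, **kwargs):
        self.calls.append(kwargs)
        return {"FileShareARN": kwargs["FileShareARN"], "NotificationId": "dry-run"}


client = DryRunClient()
share_arn = "arn:aws:storagegateway:us-east-1:111122223333:share/share-EXAMPLE"
print(refresh_share_cache(client, share_arn, folders=["/reports"]))
```

Against a real gateway, `client` would be `boto3.client("storagegateway")`. Limiting `folders` to the paths that actually changed keeps the refresh smaller than re-listing the whole share.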

Q: What happens on a cache miss in Volume Gateway Cached mode?

Data is fetched from S3, resulting in increased latency. It's important to size the cache larger than your frequently accessed data.
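The effect of an undersized cache can be illustrated with a toy simulation. This sketch assumes least-recently-used eviction purely for illustration; it is not a description of how the gateway actually manages its cache, and `hit_ratio` is a hypothetical helper.

```python
from collections import OrderedDict


def hit_ratio(accesses, cache_blocks: int) -> float:
    """Replay block accesses against an LRU cache and return the hit ratio."""
    accesses = list(accesses)
    if not accesses:
        return 0.0
    cache = OrderedDict()
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True            # miss: would be fetched from S3
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / len(accesses)


# A working set of 4 blocks, read in a loop 10 times.
workload = [0, 1, 2, 3] * 10
print(hit_ratio(workload, cache_blocks=4))
print(hit_ratio(workload, cache_blocks=3))
```

In this toy model, a cache that holds the whole working set misses only on the first pass, while a cache one block too small misses on every access.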

Q: How long does it take to restore archived tapes in Tape Gateway?

  • S3 Glacier Flexible Retrieval: typically 3-5 hours
  • S3 Glacier Deep Archive: typically within 12 hours

Q: Which regions support Storage Gateway?

Storage Gateway is available in most AWS regions. The S3 bucket connected to the gateway can be in the same or a different region.

