
AWS Cloud Storage Options – S3, EBS, EFS, FSx Explained

AWS offers over a dozen storage services, each built for a specific workload pattern. Picking the wrong one means overpaying, underperforming, or both. This guide breaks down the five core AWS storage services – S3, EBS, EFS, FSx, and Storage Gateway – with clear comparisons so you can match the right service to your workload without second-guessing.

Original content from computingforgeeks.com - post 28259

Whether you are storing application logs, running a database, sharing files across containers, or connecting on-premises systems to the cloud, AWS has a purpose-built storage option. The key is understanding the access patterns, performance characteristics, and pricing models before you architect anything. We cover all of that below, along with a decision framework and detailed comparison table. For the full catalog of AWS storage offerings, see the AWS Storage services page.

Amazon S3 – Object Storage

Amazon Simple Storage Service (S3) is the foundational object storage service on AWS. It stores data as objects inside buckets, where each object can be up to 5 TB in size. S3 is not a file system – there is no directory hierarchy, no file locking, no append operations. You PUT an object, GET an object, or DELETE an object. That simplicity is exactly what makes it scale to virtually unlimited capacity without you managing any infrastructure.

S3 delivers 99.999999999% (eleven 9s) durability by automatically replicating objects across at least three Availability Zones within a region. For availability, S3 Standard is designed for 99.99% uptime (the SLA commitment is slightly lower, at 99.9%).
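
To put eleven 9s in perspective, a quick back-of-the-envelope calculation helps. This is a sketch based on the published durability design target, which is not the same thing as an SLA:

```python
# Eleven 9s of durability means the annual probability of losing
# a given object is about 1e-11 (a design target, not an SLA).
durability = 0.99999999999
annual_loss_probability = 1 - durability

# Expected number of objects lost per year if you store 10 million objects.
objects_stored = 10_000_000
expected_losses_per_year = objects_stored * annual_loss_probability

print(f"{annual_loss_probability:.0e}")   # 1e-11
print(f"{expected_losses_per_year:.4f}")  # 0.0001 -> roughly one object per 10,000 years
```

In other words, at this scale you are far more likely to lose data through accidental deletion or a bad lifecycle rule than through S3 itself.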

S3 Storage Classes

S3 offers multiple storage classes to optimize cost based on how frequently you access data:

  • S3 Standard – for frequently accessed data. Low latency, high throughput. Best for active application data, content distribution, and analytics
  • S3 Intelligent-Tiering – automatically moves objects between frequent and infrequent access tiers based on usage patterns. Small monthly monitoring fee per object, but no retrieval charges
  • S3 Standard-IA (Infrequent Access) – lower storage cost than Standard, but charges per-GB retrieval fee. Good for backups, disaster recovery copies, and data accessed less than once a month
  • S3 One Zone-IA – same as Standard-IA but stored in a single AZ. 20% cheaper, but data is lost if that AZ is destroyed. Use for reproducible data like thumbnails or transcoded media
  • S3 Glacier Instant Retrieval – archive storage with millisecond retrieval. Ideal for data accessed once a quarter but needs immediate access when requested
  • S3 Glacier Flexible Retrieval – archive storage with retrieval times from minutes to hours. Costs significantly less than Instant Retrieval
  • S3 Glacier Deep Archive – lowest cost storage class in AWS. Retrieval takes 12-48 hours. Designed for compliance archives and data you rarely touch

S3 Lifecycle Policies

Lifecycle rules automate transitions between storage classes and expiration of old objects. You define rules at the bucket or prefix level. A typical policy might keep objects in S3 Standard for 30 days, transition to Standard-IA at day 30, move to Glacier Flexible Retrieval at day 90, and delete at day 365. This is how production teams cut storage costs by 60-80% without manual intervention.

You can manage S3 buckets and lifecycle policies from the command line using the AWS CLI on Linux. Here is an example lifecycle configuration:

{
  "Rules": [
    {
      "ID": "ArchiveOldLogs",
      "Status": "Enabled",
      "Filter": { "Prefix": "logs/" },
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 90, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": 365 }
    }
  ]
}
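
Before applying a policy like the one above, it can help to reason about which storage class an object will occupy at a given age. The function below is a pure-Python sketch of the transition logic, not an AWS API — S3 applies the real rules server-side:

```python
def storage_class_at(age_days, transitions, expire_days=None):
    """Return the storage class an object occupies at a given age under a
    lifecycle rule, or None if the object has been expired (deleted).
    A simplified model of how S3 evaluates transition rules."""
    if expire_days is not None and age_days >= expire_days:
        return None
    current = "STANDARD"
    # The latest transition whose Days threshold has passed wins.
    for t in sorted(transitions, key=lambda t: t["Days"]):
        if age_days >= t["Days"]:
            current = t["StorageClass"]
    return current

# The transitions from the ArchiveOldLogs rule above:
transitions = [
    {"Days": 30, "StorageClass": "STANDARD_IA"},
    {"Days": 90, "StorageClass": "GLACIER"},
]

print(storage_class_at(10, transitions, 365))   # STANDARD
print(storage_class_at(45, transitions, 365))   # STANDARD_IA
print(storage_class_at(200, transitions, 365))  # GLACIER
print(storage_class_at(400, transitions, 365))  # None (expired)
```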

When to Use S3

  • Static website hosting and content delivery (pair with CloudFront)
  • Application log storage and analytics data lakes
  • Backup and disaster recovery targets
  • Machine learning training datasets
  • Media file storage (images, videos, documents)
  • Software artifact and build output storage

Amazon EBS – Block Storage for EC2

Amazon Elastic Block Store (EBS) provides persistent block-level storage volumes for EC2 instances. Think of EBS as a virtual hard drive that attaches to your instance over the network. Unlike instance store (ephemeral) volumes, EBS data persists independently of the instance lifecycle – you can stop, terminate, and reattach volumes without losing data.

Each EBS volume exists within a single Availability Zone and replicates automatically within that AZ for durability. You can extend EBS boot disks on AWS without rebooting when you need more space on running instances.

EBS Volume Types

AWS offers four EBS volume types optimized for different performance requirements:

  • gp3 (General Purpose SSD) – up to 16,000 IOPS. Best for boot volumes, dev/test, and small-to-medium databases. A baseline of 3,000 IOPS is included, and IOPS scale independently of volume size
  • io2 Block Express (Provisioned IOPS SSD) – up to 256,000 IOPS. Best for mission-critical databases (Oracle, SQL Server, SAP HANA). Sub-millisecond latency and 99.999% durability
  • st1 (Throughput Optimized HDD) – up to 500 MiB/s throughput. Best for big data, data warehouses, and log processing with sequential read/write-heavy workloads
  • sc1 (Cold HDD) – up to 250 MiB/s throughput. Best for infrequently accessed large datasets. Lowest-cost HDD option

For most workloads, gp3 is the right choice. It replaced gp2 as the default and decoupled IOPS provisioning from volume size – you can provision a 100 GB volume with 16,000 IOPS if your database needs it, without paying for storage you don’t use. Only reach for io2 Block Express when you need guaranteed sub-millisecond latency or more than 16,000 IOPS per volume.
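
The cost arithmetic behind gp3's IOPS decoupling can be sketched as follows. The per-GB, per-IOPS, and per-MiB/s rates below are illustrative assumptions for the example, not current AWS prices — check the EBS pricing page before making decisions:

```python
def gp3_monthly_cost(size_gb, iops=3000, throughput_mbps=125,
                     gb_price=0.08, iops_price=0.005, tp_price=0.04):
    """Rough gp3 monthly cost model. The first 3,000 IOPS and 125 MiB/s
    are included in the per-GB price; extras are billed separately.
    All prices are illustrative assumptions, not current AWS rates."""
    extra_iops = max(0, iops - 3000)
    extra_tp = max(0, throughput_mbps - 125)
    return size_gb * gb_price + extra_iops * iops_price + extra_tp * tp_price

# A small, IOPS-hungry database volume: 100 GB provisioned at 16,000 IOPS.
print(round(gp3_monthly_cost(100, iops=16000), 2))  # 73.0 under the assumed rates

# Under gp2 (3 IOPS per GB), reaching 16,000 IOPS would have required
# provisioning a ~5.3 TB volume regardless of how much data you store.
```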

EBS Snapshots

EBS snapshots are incremental backups stored in S3 (managed by AWS, not visible in your S3 buckets). The first snapshot copies the full volume, and subsequent snapshots only store changed blocks. This makes snapshots space-efficient even for large volumes.
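
The space savings from incremental snapshots can be illustrated with a simplified model — this is a sketch for intuition, not how AWS meters snapshot storage internally:

```python
def snapshot_chain_storage(volume_blocks, changed_blocks_per_snapshot, snapshots):
    """Model incremental snapshot storage: the first snapshot stores every
    block, and each later snapshot stores only blocks changed since the
    previous one. A simplified model, not an AWS API."""
    if snapshots == 0:
        return 0
    return volume_blocks + (snapshots - 1) * changed_blocks_per_snapshot

# A 1 TiB volume tracked as 1 GiB blocks, with ~5% daily churn,
# snapshotted daily for 30 days:
total_gib = snapshot_chain_storage(1024, int(1024 * 0.05), 30)
print(total_gib)  # 2503 GiB, versus 30 * 1024 = 30720 GiB for 30 full copies
```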

Key snapshot capabilities:

  • Cross-region copy – replicate snapshots to another region for disaster recovery
  • Snapshot lifecycle policies – automate creation and deletion using Amazon Data Lifecycle Manager
  • Fast Snapshot Restore – pre-warm snapshots so volumes created from them deliver full performance immediately
  • Encryption – snapshots of encrypted volumes are automatically encrypted. You can also encrypt an unencrypted volume by creating an encrypted snapshot copy

For a practical comparison of how EBS snapshots stack up against Azure backup approaches, see Azure Backups vs AWS Snapshots.

When to Use EBS

  • Boot volumes for EC2 instances
  • Relational and NoSQL databases (PostgreSQL, MySQL, MongoDB, Cassandra)
  • Enterprise applications requiring consistent low-latency storage
  • Throughput-intensive big data and log processing (st1 volumes)

Amazon EFS – Shared File Storage

Amazon Elastic File System (EFS) is a fully managed NFS file system that multiple EC2 instances, ECS containers, and Lambda functions can mount simultaneously. Unlike EBS, which attaches to a single instance, EFS provides shared access across hundreds or thousands of compute resources at the same time.

EFS scales automatically – you don’t provision capacity. As you add files, the file system grows. As you delete files, it shrinks. Storage is replicated across multiple Availability Zones within a region for durability and high availability.

EFS Performance and Throughput Modes

EFS offers two performance modes and three throughput modes that you select when creating the file system:

Performance modes:

  • General Purpose (default) – lowest latency per operation. Best for web serving, content management, home directories. Supports up to 35,000 read IOPS and 7,000 write IOPS
  • Max I/O – higher aggregate throughput and IOPS, but slightly higher per-operation latency. Use for highly parallelized big data and media processing workloads with dozens of instances

Throughput modes:

  • Elastic (recommended) – automatically scales throughput up or down based on workload. You pay per GiB transferred. Best for unpredictable or spiky workloads
  • Provisioned – specify a fixed throughput level in MiB/s independent of storage size. Use when you know your throughput needs and they exceed what bursting provides
  • Bursting – throughput scales with file system size. Small file systems get burst credits for temporary high throughput. Not recommended for new deployments – Elastic mode is more flexible

EFS Storage Classes

EFS offers several storage classes, with lifecycle management to move files between them:

  • Standard – for frequently accessed files. Multi-AZ redundancy
  • Infrequent Access (IA) – up to 92% lower storage cost, with a per-access retrieval fee. Lifecycle policies automatically move files not accessed for 7, 14, 30, 60, or 90 days
  • One Zone / One Zone-IA – stores data in a single AZ. Lower cost but no cross-AZ redundancy. Good for dev/test or data that can be recreated
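
The savings from tiering to IA can be sketched with simple arithmetic. The per-GB rates below are illustrative assumptions for the example, not current AWS prices:

```python
def efs_monthly_cost(standard_gb, ia_gb, ia_reads_gb=0,
                     std_price=0.30, ia_price=0.025, ia_access_price=0.01):
    """Blended EFS monthly cost with lifecycle tiering: storage in each
    class plus a per-GB fee for reading data back from IA.
    All prices are illustrative assumptions, not current AWS rates."""
    return (standard_gb * std_price
            + ia_gb * ia_price
            + ia_reads_gb * ia_access_price)

# 1 TB of data, 80% tiered to IA after 30 days, 50 GB/month read back from IA:
with_tiering = efs_monthly_cost(200, 800, ia_reads_gb=50)
without_tiering = efs_monthly_cost(1000, 0)
print(round(with_tiering, 2), round(without_tiering, 2))  # 80.5 vs 300.0
```

Even with retrieval fees factored in, infrequently accessed data is dramatically cheaper in IA — which is why enabling lifecycle management is usually the first EFS cost optimization to apply.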

When to Use EFS

  • Shared content repositories across multiple web servers
  • Container persistent storage (ECS, EKS, Fargate)
  • Machine learning training where multiple instances read the same dataset
  • Home directories for development teams
  • CI/CD shared build artifacts

Amazon FSx – Managed File Systems

Amazon FSx provides fully managed, high-performance file systems built on four different technologies. Where EFS gives you managed NFS, FSx gives you purpose-built file systems for workloads that need specific protocols or performance characteristics that NFS cannot deliver.

FSx for Windows File Server

A fully managed Windows-native file system built on Windows Server. Supports SMB protocol, NTFS, Active Directory integration, DFS namespaces, and Windows ACLs. This is the go-to choice for Windows workloads that need shared file storage – .NET applications, SQL Server databases, Windows containers, and enterprise applications that require SMB.

Key specs: up to 2 GB/s throughput, hundreds of thousands of IOPS (SSD), supports Multi-AZ deployment for high availability, and integrates with AWS Managed Microsoft AD or self-managed AD.

FSx for Lustre

A fully managed Lustre parallel file system designed for high-performance computing (HPC), machine learning training, and media processing. Lustre delivers sub-millisecond latencies and hundreds of GB/s throughput – the kind of performance needed for genomics analysis, financial modeling, and video rendering.

FSx for Lustre integrates natively with Amazon S3 – you can link a Lustre file system to an S3 bucket and it lazily loads objects as files when first accessed. Results can be written back to S3 automatically. This makes it ideal for processing large S3 datasets with POSIX-compatible applications.

FSx for NetApp ONTAP

A fully managed NetApp ONTAP file system offering multi-protocol access (NFS, SMB, and iSCSI simultaneously). This is the option for organizations running NetApp on-premises who want the same features in AWS – SnapMirror replication, FlexClone volumes, deduplication, compression, and thin provisioning.

ONTAP volumes support automatic tiering between SSD and capacity pool storage, which can reduce costs by up to 60% for datasets with mixed access patterns.

FSx for OpenZFS

A fully managed OpenZFS file system for Linux workloads that need ZFS features – snapshots, clones, data compression, and point-in-time recovery. Delivers up to 1 million IOPS with sub-millisecond latencies. Best for workloads migrating from on-premises ZFS or any Linux application that benefits from instant snapshots and clones (database dev/test environments, CI/CD pipelines).

When to Use FSx

  • FSx for Windows – Windows application workloads, SQL Server, .NET apps, Active Directory environments
  • FSx for Lustre – HPC, ML training, video rendering, genomics, financial simulations
  • FSx for NetApp ONTAP – multi-protocol access (NFS + SMB + iSCSI), NetApp migration, enterprise workloads needing dedup and compression
  • FSx for OpenZFS – Linux workloads needing ZFS features, database dev/test with instant clones

AWS Storage Gateway – Hybrid Cloud Storage

AWS Storage Gateway bridges on-premises environments with AWS cloud storage. It runs as a VM or hardware appliance in your data center and presents cloud storage as local volumes, file shares, or tape drives to your existing applications.

Storage Gateway comes in three modes:

  • S3 File Gateway – presents an NFS or SMB interface backed by S3. Files written locally are stored as objects in S3. Access them through the gateway on-premises or directly from S3 in the cloud. Ideal for migrating file-based workloads or tiering cold data to S3
  • Volume Gateway – presents iSCSI block storage backed by S3 with EBS snapshots. Two sub-modes: cached (primary data in S3, frequently accessed data cached locally) and stored (primary data on-premises, asynchronously backed up to S3). Good for backup, disaster recovery, and migration
  • Tape Gateway – presents a Virtual Tape Library (VTL) to existing backup software (Veeam, Veritas, Commvault). Virtual tapes are stored in S3 and can be archived to S3 Glacier. Drop-in replacement for physical tape infrastructure

Storage Gateway is particularly valuable during cloud migrations – it lets you move data to AWS gradually while keeping on-premises applications running unchanged. If you are running S3-compatible object storage with MinIO on-premises, Storage Gateway can complement that setup by providing a bridge to native AWS services.

AWS Storage Services Comparison

The comparison below covers all five core AWS storage services across the dimensions that matter most when making architecture decisions:

  • S3 – Object storage. Access: REST API (HTTP/HTTPS). Max size: unlimited (5 TB per object). Performance: high throughput, strong read-after-write consistency. Pricing: per GB stored + requests + data transfer. Use for: data lakes, backups, static assets, archives
  • EBS – Block storage. Access: attached to a single EC2 instance. Max size: 64 TiB per volume. Performance: up to 256K IOPS (io2 Block Express), sub-ms latency. Pricing: per GB provisioned + provisioned IOPS (io2). Use for: databases, boot volumes, enterprise apps
  • EFS – File storage (NFS). Access: NFSv4.1, multi-attach. Max size: no limit (auto-scales). Performance: up to 35K read IOPS, elastic throughput. Pricing: per GB stored (Standard/IA tiers). Use for: shared files, CMS, containers, home directories
  • FSx – File storage (multiple variants). Access: SMB, NFS, Lustre, iSCSI. Max size: varies from 64 TiB to petabytes (Lustre). Performance: up to 1M IOPS (OpenZFS), hundreds of GB/s (Lustre). Pricing: per GB provisioned + throughput. Use for: HPC, Windows apps, NetApp migration, ZFS workloads
  • Storage Gateway – Hybrid bridge. Access: NFS, SMB, iSCSI, VTL. Max size: backed by S3 (unlimited). Performance: depends on local cache and network bandwidth. Pricing: per GB stored in S3 + gateway VM cost. Use for: hybrid cloud, migration, tape replacement

Decision Tree – Choosing the Right AWS Storage Service

Use this decision framework to narrow down the right storage service for your workload:

Start here: What type of access does your application need?

1. API-based access (no mount point needed) – use S3. Your application reads and writes objects over HTTP. This covers data lakes, backup targets, static websites, log storage, and any workload where you access data by key rather than file path.

2. Block device (mount as a disk) – use EBS. Your application needs a formatted file system on a dedicated volume attached to a single EC2 instance. Databases, boot volumes, and any workload that needs consistent low-latency IOPS belong here.

3. Shared file system (multiple instances mount the same path)

  • Linux + NFS only – use EFS. Simplest option for shared Linux file storage. Auto-scales, no capacity planning required
  • Windows + SMB – use FSx for Windows File Server
  • Multi-protocol (NFS + SMB + iSCSI) – use FSx for NetApp ONTAP
  • HPC / extreme throughput – use FSx for Lustre
  • Linux + ZFS features – use FSx for OpenZFS

4. On-premises applications needing cloud storage – use Storage Gateway. Your applications stay on-premises, data flows to AWS transparently.
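
The framework above can be sketched as a small lookup function. This is a deliberate simplification — real architecture decisions also weigh cost, latency, compliance, and operational familiarity:

```python
def pick_storage(access, os=None, protocols=frozenset(), hpc=False, zfs=False):
    """Map the decision tree above to a service name.
    A simplified sketch, not a substitute for a real architecture review."""
    if access == "api":
        return "S3"
    if access == "block":
        return "EBS"
    if access == "on_prem":
        return "Storage Gateway"
    if access == "shared_file":
        if hpc:
            return "FSx for Lustre"
        # Multi-protocol requirements point at ONTAP.
        if {"nfs", "smb"} <= set(protocols) or "iscsi" in protocols:
            return "FSx for NetApp ONTAP"
        if os == "windows":
            return "FSx for Windows File Server"
        if zfs:
            return "FSx for OpenZFS"
        return "EFS"  # plain Linux + NFS: the simplest shared option
    raise ValueError(f"unknown access pattern: {access}")

print(pick_storage("api"))                                    # S3
print(pick_storage("shared_file", os="linux"))                # EFS
print(pick_storage("shared_file", protocols={"nfs", "smb"}))  # FSx for NetApp ONTAP
```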

For organizations evaluating distributed storage alternatives beyond AWS, the comparison of Ceph vs GlusterFS vs MooseFS vs HDFS vs DRBD covers open-source options that can run on-premises or alongside AWS services in hybrid setups.

Cost Optimization Tips

A few practical rules to keep AWS storage costs under control:

  • Use S3 Intelligent-Tiering for data with unpredictable access patterns instead of guessing which tier to use
  • Right-size EBS volumes – gp3 lets you set IOPS independently of volume size, so don’t over-provision storage just to get more IOPS
  • Enable EFS Lifecycle Management – move files to IA storage class after 30 days to save up to 92% on infrequently accessed data
  • Delete unused EBS snapshots – orphaned snapshots from terminated instances accumulate quietly and increase your bill
  • Tag everything – use AWS Cost Explorer with tags to identify which teams or projects drive storage costs
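
The orphaned-snapshot cleanup can be automated. The sketch below shows only the filtering step, on records shaped like `aws ec2 describe-snapshots` output — fetching the snapshot and volume lists from the EC2 API (and actually deleting anything) is deliberately left out:

```python
def orphaned_snapshots(snapshots, live_volume_ids):
    """Return snapshots whose source volume no longer exists.
    `snapshots` mimics the shape of `aws ec2 describe-snapshots` output;
    in practice you would fetch both lists from the EC2 API first."""
    return [s for s in snapshots if s["VolumeId"] not in live_volume_ids]

# Hypothetical example data:
snapshots = [
    {"SnapshotId": "snap-aaa", "VolumeId": "vol-111"},
    {"SnapshotId": "snap-bbb", "VolumeId": "vol-222"},  # volume was deleted
    {"SnapshotId": "snap-ccc", "VolumeId": "vol-111"},
]
live_volumes = {"vol-111"}

for s in orphaned_snapshots(snapshots, live_volumes):
    print(s["SnapshotId"])  # snap-bbb
```

Review the candidates before deleting — a snapshot of a deleted volume may still be the only backup of data you need, or the source of an AMI.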

Conclusion

AWS storage services are purpose-built – S3 for objects, EBS for block devices, EFS for shared NFS, FSx for specialized file systems, and Storage Gateway for hybrid bridging. The right choice depends on your access pattern (API vs mount vs multi-attach), performance requirements (IOPS vs throughput), protocol needs (HTTP vs NFS vs SMB vs Lustre), and cost sensitivity.

For production deployments, combine multiple services – databases on EBS with backups to S3, application shared files on EFS, and archive data in S3 Glacier. Enable lifecycle policies across all services to automate tiering and avoid paying premium rates for data nobody accesses. Monitor storage costs monthly using AWS Cost Explorer and set billing alerts before costs surprise you.

Related Articles

  • Configure and Mount Hetzner Storage Box on Linux using Ansible
  • Run Ceph toolbox for Rook on Kubernetes / OpenShift
  • Install AWS SSM Agent on EC2 Instances
  • Best Practices for Developers on How to Secure Cloud-Based Applications
