Amazon S3 — Object Storage


How S3 stores unlimited objects across storage classes, handles versioning and replication, and secures data at rest.

Tags: aws, s3, object-storage, storage-classes, encryption, versioning, replication

Overview

Amazon S3 (Simple Storage Service) is object storage — a fundamentally different model from block storage (EBS) and file storage (EFS). In object storage, each piece of data is a flat, self-contained object with a key, a value (the data), and metadata. There is no hierarchy, no filesystem, no inodes. The slash in a key like logs/2025/01/app.log is just a character in a string — S3 uses it to simulate folder-like prefixes in the console, but the underlying model has no directories.

S3 stores objects in buckets. Bucket names are globally unique across all AWS accounts worldwide. A bucket lives in a single AWS region. Objects can be up to 5 TB in size. There is no limit on the number of objects or total storage in a bucket.


Durability and Availability

S3 Standard provides 11 nines of durability — 99.999999999%. AWS achieves this by storing each object redundantly across a minimum of three Availability Zones within a region. Losing data from S3 Standard requires simultaneous, independent failure across three physically separated data centres — a scenario AWS designs against with multiple layers of hardware redundancy and continuous integrity checking.

Availability (the ability to retrieve objects) is a separate metric. S3 Standard offers 99.99% availability. Lower-cost storage classes trade availability (and sometimes durability) for cost reduction.


Storage Classes

| Class | Min Storage Duration | Retrieval Fee | AZs | Retrieval Latency | Typical Use Case |
|---|---|---|---|---|---|
| Standard | None | None | 3+ | Milliseconds | Frequently accessed data |
| Intelligent-Tiering | 30 days (per tier) | None (auto-managed) | 3+ | Milliseconds–hours (tier dependent) | Unknown or changing access patterns |
| Standard-IA | 30 days | Per-GB | 3+ | Milliseconds | Infrequent but immediately needed |
| One Zone-IA | 30 days | Per-GB | 1 | Milliseconds | Reproducible infrequent data |
| Glacier Instant Retrieval | 90 days | Per-GB | 3+ | Milliseconds | Archive needing instant access |
| Glacier Flexible Retrieval | 90 days | Per-GB | 3+ | 1–5 min / 3–5 hr / 5–12 hr | Archive, flexible retrieval |
| Glacier Deep Archive | 180 days | Per-GB | 3+ | 12 hr / 48 hr | Compliance archive, rarely accessed |

One Zone-IA is the only class that does not replicate across 3+ AZs. It is appropriate only for data that can be recreated (re-generated thumbnails, replicated backups from another region) — losing the AZ means losing the data.

Glacier Flexible Retrieval offers three retrieval tiers: Expedited (1–5 minutes, higher cost), Standard (3–5 hours), and Bulk (5–12 hours, cheapest). Bulk is the cheapest and suits large, non-urgent restores; Expedited is for small, urgent retrievals.
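A Glacier restore is initiated with a restore request stating how long the temporary restored copy should remain available and which retrieval tier to use. A minimal sketch of the RestoreRequest body (as accepted by the `s3api restore-object` CLI call); the 7-day window is an arbitrary example:

```json
{
  "Days": 7,
  "GlacierJobParameters": { "Tier": "Bulk" }
}
```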


S3 Intelligent-Tiering

Intelligent-Tiering monitors object access patterns and automatically moves objects between tiers with no performance impact and no retrieval fees. AWS charges a small per-object monitoring fee (~$0.0025 per 1,000 objects/month).

Tiers within Intelligent-Tiering:

| Tier | Activation | Description |
|---|---|---|
| Frequent Access | Default | Objects accessed within the last 30 days |
| Infrequent Access | Automatic, after 30 days without access | Lower-cost tier, same latency |
| Archive Instant Access | Automatic, after 90 days without access | Lower-cost tier, same latency |
| Archive Access | Must be opted in; configurable 90–730 days | Same as Glacier Flexible |
| Deep Archive Access | Must be opted in; configurable 180–730 days | Same as Glacier Deep Archive |

For the Archive Access and Deep Archive Access tiers, objects are moved asynchronously and retrieval times match Glacier Flexible and Glacier Deep Archive respectively. These tiers must be explicitly enabled on the Intelligent-Tiering configuration.
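Opting in to the archive tiers is done per bucket with an Intelligent-Tiering configuration. A sketch of such a configuration document (the Id and day thresholds are example values):

```json
{
  "Id": "archive-old-objects",
  "Status": "Enabled",
  "Tierings": [
    { "Days": 90,  "AccessTier": "ARCHIVE_ACCESS" },
    { "Days": 180, "AccessTier": "DEEP_ARCHIVE_ACCESS" }
  ]
}
```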


Lifecycle Policies

Lifecycle policies automate storage class transitions and object expiration. Rules are applied at the bucket or prefix/tag level.

A typical lifecycle rule:

  1. Day 0: Object uploaded to Standard
  2. Day 30: Transition to Standard-IA
  3. Day 90: Transition to Glacier Flexible Retrieval
  4. Day 365: Expire (permanently delete) object

Transitions can also target specific object versions. A common pattern for versioned buckets: expire non-current versions after 30 days and permanently delete expired delete markers.
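The typical rule above maps to a lifecycle configuration document like the following sketch (the logs/ prefix is a hypothetical filter):

```json
{
  "Rules": [
    {
      "ID": "standard-to-archive",
      "Status": "Enabled",
      "Filter": { "Prefix": "logs/" },
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 90, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": 365 },
      "NoncurrentVersionExpiration": { "NoncurrentDays": 30 }
    }
  ]
}
```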


Versioning

When versioning is enabled on a bucket, S3 preserves every version of every object. Uploading a new version of report.pdf does not overwrite the previous version — both exist with different version IDs.

Deleting a versioned object without specifying a version ID places a delete marker — a special zero-byte placeholder — as the current version. The object appears deleted to clients that don’t specify a version ID. All previous versions remain stored and accessible by their version ID. To truly remove the data, you must explicitly delete each version by its version ID.

Once enabled, versioning cannot be fully disabled — it can only be suspended. Suspended buckets stop creating new versions but preserve all existing versions.
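The delete-marker semantics can be illustrated with a small in-memory model (illustration only, not an S3 client; version IDs are simplified to a counter):

```python
import itertools

class VersionedBucket:
    """Tiny in-memory model of S3 versioning semantics (illustration only)."""

    def __init__(self):
        self._versions = {}          # key -> list of (version_id, data), newest last
        self._ids = itertools.count(1)

    def put(self, key, data):
        vid = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((vid, data))
        return vid

    def delete(self, key, version_id=None):
        if version_id is None:
            # Simple DELETE: add a delete marker as the newest (current) version.
            vid = f"v{next(self._ids)}"
            self._versions.setdefault(key, []).append((vid, None))  # None = delete marker
            return vid
        # Versioned DELETE: permanently remove that single version.
        self._versions[key] = [(v, d) for v, d in self._versions[key] if v != version_id]
        return None

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is None:
            if not versions or versions[-1][1] is None:  # current version is a delete marker
                raise KeyError(key)
            return versions[-1][1]
        for v, d in versions:
            if v == version_id:
                return d
        raise KeyError(version_id)

bucket = VersionedBucket()
v1 = bucket.put("report.pdf", b"draft")
v2 = bucket.put("report.pdf", b"final")
bucket.delete("report.pdf")  # places a delete marker; v1 and v2 remain retrievable
```

A plain GET now fails, but both versions are still accessible by version ID, matching the behaviour described above.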

MFA Delete

MFA Delete adds a second authentication factor to two specific operations:

  - Permanently deleting an object version
  - Changing the bucket's versioning state (suspending versioning)

Only the root account can enable MFA Delete. Once enabled, these operations require the root user to provide an MFA token with each API call. This protects against a compromised IAM credential deleting all object versions.


Replication

S3 replication copies objects from a source bucket to one or more destination buckets. Two variants:

  - Cross-Region Replication (CRR): destination in a different region (disaster recovery, lower-latency access for distant users)
  - Same-Region Replication (SRR): destination in the same region (log aggregation, prod/test sync)

Both require versioning enabled on source and destination. Replication is asynchronous. Only new objects uploaded after replication is configured are replicated. Existing objects require S3 Batch Replication — a separate job-based mechanism.

Replication rules can filter by prefix or object tag. Delete marker replication is optional — by default, deletes in the source are not propagated to the destination (to prevent accidental mass deletion).
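A replication configuration sketch with a prefix filter and delete-marker replication left at its disabled default (role and bucket ARNs are hypothetical):

```json
{
  "Role": "arn:aws:iam::111122223333:role/s3-replication-role",
  "Rules": [
    {
      "ID": "replicate-docs",
      "Status": "Enabled",
      "Priority": 1,
      "Filter": { "Prefix": "docs/" },
      "DeleteMarkerReplication": { "Status": "Disabled" },
      "Destination": { "Bucket": "arn:aws:s3:::backup-bucket-eu" }
    }
  ]
}
```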


Encryption

Encryption at Rest

| Method | Key Management | Notes |
|---|---|---|
| SSE-S3 | AWS-managed, per-object AES-256 | Transparent, no extra cost, default since 2023 |
| SSE-KMS | AWS KMS Customer Managed Key (CMK) | CloudTrail audit trail per operation, key rotation, cross-account control |
| SSE-C | Customer-provided per-request | HTTPS required; AWS encrypts/decrypts but never stores the key |
| DSSE-KMS | Two independent KMS encryption layers | For regulatory requirements mandating dual-layer encryption |

SSE-KMS adds meaningful security controls over SSE-S3: you can audit every decrypt operation in CloudTrail, disable the CMK to revoke access to all encrypted objects, set key rotation, and control cross-account access at the key level. The tradeoff is KMS API cost ($0.03 per 10,000 requests) and KMS rate limits.

Bucket default encryption applies to objects uploaded without specifying an encryption method. A bucket policy can enforce encryption by denying s3:PutObject calls that do not include the required x-amz-server-side-encryption header.
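A bucket policy enforcing SSE-KMS on uploads might look like this sketch (bucket name hypothetical; for SSE-S3 the expected header value would be AES256 instead of aws:kms):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyUnencryptedUploads",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::my-bucket/*",
      "Condition": {
        "StringNotEquals": { "s3:x-amz-server-side-encryption": "aws:kms" }
      }
    }
  ]
}
```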


Object Lock

Object Lock enforces WORM (Write Once Read Many) behaviour. Locked objects cannot be overwritten or deleted until their retention period expires.

Retention modes:

  - Governance: users with the s3:BypassGovernanceRetention permission can override the lock or delete the object before the retention period expires.
  - Compliance: no user, including the root account, can delete the object or shorten the retention period until it expires.

Legal Hold: Independent of retention period. Applies WORM protection indefinitely until explicitly removed. Uses the s3:PutObjectLegalHold permission.

Object Lock must be enabled at bucket creation. It automatically enables versioning.
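A default Object Lock configuration applying Compliance-mode retention to all new objects in the bucket, as a sketch (the 365-day period is an example value):

```json
{
  "ObjectLockEnabled": "Enabled",
  "Rule": {
    "DefaultRetention": { "Mode": "COMPLIANCE", "Days": 365 }
  }
}
```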


Access Control

Bucket Policies

Bucket policies are JSON documents attached to a bucket, using the same IAM policy syntax. They can grant or deny:

  - Access for specific IAM principals, other AWS accounts, or all principals (public access)
  - Specific actions (s3:GetObject, s3:PutObject, s3:ListBucket, ...)
  - Access under conditions (source IP range, TLS-only via aws:SecureTransport, required encryption headers)

Bucket policies are evaluated alongside IAM identity policies. For cross-account access, both the bucket policy and the caller’s IAM policy must permit the action.

S3 Block Public Access

Block Public Access settings override bucket policies and ACLs for any public access grant. Four settings, each independently togglable:

  - BlockPublicAcls: reject new ACLs that grant public access
  - IgnorePublicAcls: ignore any existing public ACLs
  - BlockPublicPolicy: reject bucket policies that grant public access
  - RestrictPublicBuckets: limit access to buckets with public policies to the bucket owner's account and AWS service principals

AWS enables all four by default for new buckets. If you need public access (static website, public CDN origin), you must explicitly disable the relevant settings.
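The four settings correspond to a PublicAccessBlockConfiguration document; the secure default for new buckets is all true:

```json
{
  "BlockPublicAcls": true,
  "IgnorePublicAcls": true,
  "BlockPublicPolicy": true,
  "RestrictPublicBuckets": true
}
```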

ACLs

Access Control Lists are a legacy access mechanism predating bucket policies. AWS now recommends disabling ACLs (Object Ownership: Bucket Owner Enforced) so that all access is managed through bucket policies and IAM. New buckets default to ACLs disabled.

Presigned URLs

A presigned URL is a time-limited URL that grants temporary access to a private S3 object. The URL is generated using the signing credentials of an IAM identity. Any holder of the URL can perform the specific operation (GET or PUT) until the URL expires, without needing their own AWS credentials.

Use cases: allow a user to download a private file through an application without exposing the bucket, allow direct upload from a browser to S3 without proxying through a server.
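Under the hood, a presigned URL is a SigV4 query-string signature. The following is a simplified, self-contained sketch of the signing flow with dummy credentials; real applications should use an SDK's built-in presigner rather than hand-rolling this:

```python
import datetime
import hashlib
import hmac
import urllib.parse

def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode(), hashlib.sha256).digest()

def presign_get(bucket, key, region, access_key, secret_key, expires=3600):
    """Build a SigV4 query-string-presigned GET URL (simplified sketch)."""
    now = datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    datestamp = now.strftime("%Y%m%d")
    host = f"{bucket}.s3.{region}.amazonaws.com"
    scope = f"{datestamp}/{region}/s3/aws4_request"
    uri = "/" + urllib.parse.quote(key)

    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    }
    query = "&".join(
        f"{k}={urllib.parse.quote(v, safe='')}" for k, v in sorted(params.items())
    )

    # Canonical request over the method, path, query, and signed headers.
    canonical_request = "\n".join(
        ["GET", uri, query, f"host:{host}", "", "host", "UNSIGNED-PAYLOAD"]
    )
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical_request.encode()).hexdigest(),
    ])

    # Derive the signing key: date -> region -> service -> "aws4_request".
    k = _hmac(("AWS4" + secret_key).encode(), datestamp)
    for part in (region, "s3", "aws4_request"):
        k = _hmac(k, part)
    signature = hmac.new(k, string_to_sign.encode(), hashlib.sha256).hexdigest()

    return f"https://{host}{uri}?{query}&X-Amz-Signature={signature}"

url = presign_get("my-bucket", "reports/q1.pdf", "eu-west-1",
                  "AKIAEXAMPLE", "secretexample")
```

Because the signature covers the expiry, the method, and the key, tampering with any of them invalidates the URL.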


Static Website Hosting

Enable static website hosting on a bucket to serve content via an S3 website endpoint (http://bucket.s3-website-region.amazonaws.com). Configure:

  - An index document (e.g. index.html)
  - An optional error document (e.g. error.html)
  - A bucket policy granting public s3:GetObject (with the relevant Block Public Access settings disabled)

For HTTPS on a custom domain, serve through CloudFront with an Origin Access Control (OAC) — CloudFront provides the TLS certificate while keeping the bucket private. The bucket policy then grants access only to the CloudFront distribution’s OAC.
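The bucket policy for an OAC setup grants s3:GetObject to the CloudFront service principal, scoped to a single distribution. A sketch (bucket name, account ID, and distribution ID are placeholders):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowCloudFrontOAC",
      "Effect": "Allow",
      "Principal": { "Service": "cloudfront.amazonaws.com" },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-site-bucket/*",
      "Condition": {
        "StringEquals": {
          "AWS:SourceArn": "arn:aws:cloudfront::111122223333:distribution/EDFDVBD6EXAMPLE"
        }
      }
    }
  ]
}
```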


S3 Select and Object Lambda

S3 Select executes SQL expressions directly against the content of S3 objects (CSV, JSON, Parquet, compressed with GZIP or BZIP2). Instead of downloading a 10 GB CSV to extract 100 rows, S3 Select evaluates the query server-side and returns only matching data. AWS reports up to 400% faster and 80% cheaper queries compared to full-object retrieval for selective queries.
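An S3 Select expression queries the object through the fixed S3Object alias. A sketch against a hypothetical CSV with order_id, status, and total columns:

```sql
SELECT s.order_id, s.total
FROM S3Object s
WHERE s.status = 'shipped' AND CAST(s.total AS FLOAT) > 100
```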

S3 Object Lambda attaches an AWS Lambda function to an S3 GET request path. When an object is retrieved, Lambda receives the original object, transforms it (redact PII, convert format, add watermark), and returns the modified response — without storing a second copy of the transformed object. The client receives the transformed data; the original object in S3 is unchanged.


Performance

S3 supports 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix. A prefix is any string up to the last / in the key. Objects spread across multiple prefixes multiply total throughput: 10 prefixes yield 55,000 GET requests/second.
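The prefix rule and its effect on aggregate throughput can be sketched in a few lines (the 5,500 GET/s figure is the per-prefix limit quoted above):

```python
from collections import Counter

def prefix_of(key: str) -> str:
    """The request-rate prefix: everything up to and including the last '/'."""
    return key.rsplit("/", 1)[0] + "/" if "/" in key else ""

keys = [
    "logs/2025/01/app.log",
    "logs/2025/01/db.log",
    "logs/2025/02/app.log",
    "images/cat.png",
]
prefixes = Counter(prefix_of(k) for k in keys)

# Each distinct prefix independently supports 5,500 GET/s,
# so throughput scales with the number of prefixes in use.
aggregate_get_rate = len(prefixes) * 5_500
```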

Multipart upload splits large objects into parts uploaded in parallel, then assembled by S3. Required for objects over 5 GB. Recommended for objects over 100 MB. Benefits: parallelism (faster upload), resume on failure (only failed parts need retry), individual part MD5 checksums.
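The part-planning arithmetic can be sketched as follows, honouring S3's limits of at most 10,000 parts and a 5 MiB minimum part size for every part except the last:

```python
MIB = 1024 * 1024

def plan_parts(object_size: int, part_size: int = 100 * MIB):
    """Split an upload into (part_number, offset, length) tuples.

    S3 constraints: at most 10,000 parts; every part except the
    last must be at least 5 MiB.
    """
    if part_size < 5 * MIB:
        raise ValueError("part size must be >= 5 MiB")
    parts = []
    offset, number = 0, 1
    while offset < object_size:
        length = min(part_size, object_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    if len(parts) > 10_000:
        raise ValueError("too many parts; increase part size")
    return parts

# A 5 GiB object in 100 MiB parts: 51 full parts plus a 20 MiB remainder.
parts = plan_parts(5 * 1024 * MIB)
```

Each part is uploaded (in parallel) with its own checksum; only failed parts need to be retried before the final assembly call.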

Transfer Acceleration routes uploads through the nearest CloudFront edge location, then uses AWS’s private network backbone to S3. Useful when uploading from geographically distant clients. The S3 Transfer Acceleration endpoint is bucket.s3-accelerate.amazonaws.com.

Object Lifecycle Walkthrough

  1. An application PUTs an object (multipart if > 100 MB); it lands in the Standard storage class. With versioning enabled, a new version ID is assigned and any previous version is preserved.
  2. Day 30: lifecycle transition to Standard-IA. The object moves with the same key and version ID.
  3. Day 90: transition to Glacier Flexible Retrieval. Retrieval now takes 1–5 min / 3–5 hr / 5–12 hr depending on tier.
  4. Day 365: expiration. A delete marker is created if versioning is enabled.
