Overview
Cloud costs are not fixed. AWS charges based on what you provision and consume. A well-architected environment running the same workload as a poorly architected one might cost 30–60% less — not from negotiation, but from engineering decisions.
Unmanaged cloud spending grows in two ways: proportionally (more usage, more cost) and superlinearly (poor architectural choices amplify cost faster than usage grows). A team that sends all inter-AZ traffic through a NAT Gateway, stores cold data in S3 Standard, and runs oversized On-Demand instances will pay dramatically more than a team that avoids each of these pitfalls.
Cost optimization is not a one-time activity performed before a launch. It is an ongoing engineering discipline — measure, right-size, select the right purchasing model, eliminate waste, repeat.
AWS Pricing Fundamentals
AWS pricing operates across three cost dimensions for almost every service:
| Dimension | Description | Examples |
|---|---|---|
| Compute | Instance hours, vCPU-seconds, request counts | EC2 instance-hours, Lambda GB-seconds, API Gateway requests |
| Storage | GB-month stored, IOPS provisioned, requests made | S3 GB-month, EBS GB-month, EBS provisioned IOPS |
| Data transfer | Bytes moved between services, regions, or the internet | EC2 egress to internet, cross-region replication, cross-AZ traffic |
These three dimensions interact. An architecture that minimizes compute might increase data transfer costs. Understanding all three before making design decisions prevents cost surprises.
Pricing Models
AWS offers five pricing models for compute. Most workloads use a combination:
| Model | Discount vs On-Demand | Commitment | Use Case |
|---|---|---|---|
| On-Demand | 0% — baseline price | None | Unpredictable workloads, short experiments, initial sizing |
| Savings Plans | 20–66% | 1 or 3 year $/hr commitment | Predictable baseline — most common choice |
| Reserved Instances | 40–72% | 1 or 3 year instance commitment | Predictable workloads requiring a specific instance type |
| Spot Instances | Up to 90% | None (interruptible) | Fault-tolerant batch, stateless workers, CI/CD |
| Free Tier | 100% up to limits | None | Exploration, learning, small projects |
AWS Free Tier
Two categories of free tier exist:
12-month free tier (expires 12 months after account creation):
- EC2: 750 hours/month of t2.micro or t3.micro
- S3: 5 GB of storage + 20,000 GET requests + 2,000 PUT requests
- RDS: 750 hours/month of db.t2.micro or db.t3.micro, 20 GB storage
Always-free (never expires, regardless of account age):
- Lambda: 1 million requests/month + 400,000 GB-seconds compute
- DynamoDB: 25 GB storage + 25 WCU + 25 RCU (provisioned)
- CloudFront: 1 TB data transfer out + 10 million HTTP/HTTPS requests
- CloudWatch: 10 custom metrics + 10 alarms + 1 million API requests
Always-free tiers make Lambda, DynamoDB, and CloudFront genuinely free for low-traffic applications — not just during a trial period.
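A quick calculation shows how generous the Lambda always-free tier is for a low-traffic application. The workload numbers below (invocations, duration, memory) are illustrative assumptions, not figures from the text:

```python
# Estimate whether a hypothetical low-traffic app fits inside Lambda's
# always-free tier (1M requests + 400,000 GB-seconds per month).

FREE_REQUESTS = 1_000_000          # requests/month, always-free
FREE_COMPUTE_GB_S = 400_000        # GB-seconds/month, always-free

def lambda_usage_gb_seconds(invocations, avg_duration_ms, memory_mb):
    """Monthly Lambda compute usage in GB-seconds."""
    return invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)

# Assumed workload: 500k invocations/month, 200 ms average, 128 MB memory
usage = lambda_usage_gb_seconds(500_000, 200, 128)
print(f"Compute used: {usage:,.0f} GB-seconds of {FREE_COMPUTE_GB_S:,} free")
print("Within always-free tier:", 500_000 <= FREE_REQUESTS and usage <= FREE_COMPUTE_GB_S)
```

At 128 MB and 200 ms per call, half a million monthly invocations consume only 12,500 GB-seconds — about 3% of the free compute allowance.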
Data Transfer Costs — The Hidden Expense
Data transfer is frequently the most surprising AWS cost. The rules are asymmetric and non-obvious:
| Traffic Path | Cost |
|---|---|
| Inbound (ingress) from internet to AWS | Free |
| Outbound (egress) from EC2/S3 to internet | Charged per GB — first 100 GB/month free, then tiered |
| Between AWS regions | Charged per GB in both directions |
| Between AZs in the same region | Charged per GB ($0.01/GB each direction) |
| Within the same AZ (private IP) | Free |
| CloudFront to internet (egress) | Charged per GB — cheaper rate than EC2 direct egress |
| S3 to CloudFront in same region | Free (CloudFront origin fetch from S3 is free) |
The cross-AZ charge catches many teams off guard. An EC2 instance in us-east-1a querying an RDS instance in us-east-1b pays $0.01/GB in each direction — $0.02/GB round trip. For a high-throughput database application, this adds up quickly. Solutions:
- Deploy EC2 and RDS in the same AZ for latency-sensitive applications (accept the reduced fault tolerance, or replicate to another AZ and use read replicas locally)
- Use a Multi-AZ RDS with the writer in the same AZ as the primary application tier
- Cache aggressively with ElastiCache deployed in the same AZ as the application
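The round-trip math above can be sketched directly. The $0.01/GB each-direction rate is the standard cross-AZ charge cited in the table; the daily traffic volume is a made-up example:

```python
# Rough monthly cost of cross-AZ traffic between an app tier and its database.
# Assumes symmetric traffic in both directions for simplicity.

CROSS_AZ_RATE_PER_GB = 0.01  # USD, charged in each direction

def cross_az_monthly_cost(gb_per_day_each_way, days=30):
    # Round trip pays the rate twice: once outbound, once for the response.
    return gb_per_day_each_way * 2 * CROSS_AZ_RATE_PER_GB * days

# Hypothetical example: 500 GB/day of query traffic each way between EC2 and RDS
print(f"${cross_az_monthly_cost(500):,.2f}/month")
```

A workload moving 500 GB/day each way pays $300/month purely for AZ boundary crossings — cost that disappears entirely if compute and database share an AZ.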
CloudFront reduces egress costs: CloudFront’s per-GB egress rate is lower than EC2 direct egress. For high-traffic content (static assets, APIs, downloads), serving through CloudFront reduces both latency and cost simultaneously. S3 → CloudFront → internet is typically the cheapest path for static content delivery.
EC2 Purchasing Options in Depth
On-Demand
Highest per-hour price. No commitment, no interruption risk. On-Demand is the right choice for:
- Workloads where duration or instance type is unknown
- Short-lived experiments or proof-of-concept
- Burst capacity above committed baseline
- Any workload running less than the break-even period for a Savings Plan commitment (~30–40% of the year)
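The break-even figure in the last bullet follows from simple arithmetic: a commitment bills the discounted rate for every hour of the term, whether or not the instance runs, so the commitment pays off only above a utilization threshold of (1 − discount). A minimal sketch:

```python
# Break-even utilization for a commitment: below this fraction of the year,
# paying On-Demand only for the hours you actually run is cheaper.
# Discount figures are illustrative; check current AWS pricing for real rates.

def breakeven_utilization(discount):
    """Fraction of the year an instance must run for a commitment to pay off.
    The commitment bills (1 - discount) * on_demand_rate for every hour of
    the term, running or not."""
    return 1.0 - discount

for d in (0.54, 0.66):
    print(f"{d:.0%} discount -> break-even at {breakeven_utilization(d):.0%} utilization")
```

Discounts in the 60–70% range give break-even points around 30–40% of the year, which is where the rule of thumb above comes from.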
Savings Plans (Preferred Over Reserved Instances for Most Use Cases)
Savings Plans commit to a consistent dollar-per-hour spend in exchange for a discount. They are more flexible than Reserved Instances because they apply automatically to matching usage — you do not commit to a specific instance type.
Compute Savings Plans:
- Applies to EC2 (any instance family, any size, any region, any OS), AWS Fargate, and Lambda
- Maximum flexibility — a commitment made for x86 instances automatically covers ARM (Graviton) instances
- Discount: approximately 54% off On-Demand for a 3-year no-upfront commitment
EC2 Instance Savings Plans:
- Applies to a specific instance family in a specific region (e.g., m5 in us-east-1)
- Covers all sizes (m5.large through m5.24xlarge) and all OS within that family
- Higher discount: approximately 66% off On-Demand for 3-year no-upfront
- Less flexible than Compute Savings Plans — does not cover Fargate or Lambda
Payment options: No Upfront (monthly payments, smallest commitment, smallest discount), Partial Upfront (some paid upfront, rest monthly, medium discount), All Upfront (full 1 or 3 year commitment paid upfront, maximum discount).
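To compare payment options fairly, amortize any upfront payment across the full term and compute an effective hourly rate. The dollar quotes below are hypothetical, not AWS prices:

```python
# Amortized effective hourly rate for a commitment, combining an upfront
# payment with monthly charges. Dollar figures are hypothetical examples.

HOURS_PER_YEAR = 8760

def effective_hourly(upfront, monthly, term_years):
    total = upfront + monthly * 12 * term_years
    return total / (HOURS_PER_YEAR * term_years)

# Hypothetical quotes for the same 3-year commitment:
print(f"No Upfront:  ${effective_hourly(0, 35.0, 3):.4f}/hr")
print(f"All Upfront: ${effective_hourly(1150.0, 0, 3):.4f}/hr")
```

The All Upfront option's lower effective rate is the compensation for paying the full term in advance; whether that trade is worth it depends on your cost of capital.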
Reserved Instances
Reserved Instances (RIs) commit to a specific instance configuration in exchange for a significant discount. Two types:
| RI Type | Flexibility | Discount | Notes |
|---|---|---|---|
| Standard RI | Locked to family, size, region, OS, tenancy | Up to 72% | Can be listed on the RI Marketplace to sell unused capacity |
| Convertible RI | Can exchange for different family/size/OS/region | Up to 66% | Cannot sell on Marketplace |
RIs are most valuable for:
- Specific instance types that you know will run continuously (e.g., a dedicated database server)
- Situations where you need the ability to sell unused capacity (Standard RI Marketplace)
- Older accounts or tooling that manages RIs specifically
For most greenfield workloads, Savings Plans are preferred because they automatically apply across instance families, regions, and services as your architecture evolves.
Spot Instances
Spot Instances use AWS’s spare EC2 capacity at up to 90% discount. The trade-off: AWS can reclaim a Spot Instance with a 2-minute warning when the capacity is needed for On-Demand customers.
Designing for Spot:
| Design Principle | Implementation |
|---|---|
| Stateless workers | Store state in S3, DynamoDB, or SQS — not on the instance |
| Diversification | Request multiple instance types and AZs via Spot Fleet — if one pool is reclaimed, others continue |
| Interruption handling | Use the 2-minute notice (available via instance metadata) to checkpoint state and drain gracefully |
| Spot Advisor | Use the Spot Instance Advisor to select pools with < 5% interruption frequency |
| Mixed fleets | Combine On-Demand base capacity with Spot for burst — Auto Scaling Group with mixed instance policies |
Spot is ideal for: batch data processing, CI/CD build workers, ML training jobs, rendering farms, large-scale testing.
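The interruption-handling principle above can be sketched as a small handler. The metadata path (`/latest/meta-data/spot/instance-action`) is the real endpoint a Spot instance polls; the fetcher is injected here so the drain logic can be shown and exercised without running on EC2:

```python
# Sketch of handling the 2-minute Spot interruption notice. The fetcher
# abstracts the HTTP call to the instance metadata service at
# http://169.254.169.254/latest/meta-data/spot/instance-action.
import json

def check_interruption(fetch_metadata):
    """Return the pending action ('stop' or 'terminate'), or None.
    fetch_metadata() returns the raw response body, or None when the
    endpoint returns 404 (no interruption pending)."""
    body = fetch_metadata()
    if body is None:
        return None
    notice = json.loads(body)   # e.g. {"action": "terminate", "time": "..."}
    return notice.get("action")

# Simulated notice, as a worker's poll loop would see it:
fake = lambda: '{"action": "terminate", "time": "2024-01-01T12:00:00Z"}'
action = check_interruption(fake)
if action:
    print(f"Interruption notice: {action} -> checkpoint to S3/SQS and drain")
```

A real worker would poll this every few seconds and, on a notice, stop accepting new work, checkpoint in-flight state to S3 or SQS, and exit cleanly within the 2-minute window.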
Right-Sizing
Over-provisioned instances are the single most common source of EC2 waste. CPU utilization averaging 5–10% on a large instance means you are paying for 90–95% of the instance’s capacity without using it.
AWS Compute Optimizer
Compute Optimizer analyzes CloudWatch metrics (CPU, network, disk I/O, and memory if the CloudWatch Agent is installed) over a 14-day window and recommends right-sized alternatives:
- Recommends same or smaller instance types with equivalent or better performance
- Flags over-provisioned and under-provisioned instances
- Covers EC2 instances, Auto Scaling groups, ECS tasks on Fargate, Lambda functions, and EBS volumes
- Available at no additional cost (CloudWatch metrics are the input — standard monitoring costs apply)
Memory data requires the CloudWatch Agent. Without it, Compute Optimizer cannot see memory utilization and may recommend downsizing an instance that looks idle on CPU but is actually memory-bound (common for Java applications).
Graviton Instances
AWS Graviton (ARM-based) processors deliver 20–40% better price/performance than equivalent x86 instances for most general-purpose workloads:
| x86 Instance | Graviton Equivalent | Typical Savings |
|---|---|---|
| m5.large | m6g.large | ~20% |
| c5.xlarge | c6g.xlarge | ~20–30% |
| r5.2xlarge | r6g.2xlarge | ~20% |
Migration requires the application to build for ARM64. AWS-managed runtimes (Lambda, ECS on Fargate, RDS, ElastiCache) support Graviton natively with no code change — selecting a Graviton instance type in the console is sufficient. For custom application code, recompile for ARM64.
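The ~20% figure in the table is straightforward to verify from hourly rates. The prices below are assumed us-east-1 On-Demand list prices and should be checked against current AWS pricing:

```python
# Hourly price comparison of an x86 instance against its Graviton counterpart.
# Rates are assumed us-east-1 On-Demand list prices (verify before relying on them).

m5_large  = 0.096   # USD/hr, x86 (assumed)
m6g_large = 0.077   # USD/hr, Graviton (assumed)

savings = 1 - m6g_large / m5_large
print(f"m6g.large is {savings:.0%} cheaper per hour than m5.large")
```

Note this compares price only; Graviton's per-core performance gains on many workloads push the effective price/performance advantage beyond the raw hourly discount.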
Storage Right-Sizing
EBS volume right-sizing is frequently overlooked:
| Action | Savings |
|---|---|
| Migrate gp2 → gp3 | gp3 is cheaper than gp2 at the same size. gp3 IOPS and throughput are configurable independently — no need to over-size volume to get IOPS. |
| Delete unattached EBS volumes | Unattached volumes continue to accrue storage charges. Snapshot first, then delete. |
| Rightsize provisioned IOPS (io1/io2) | io1 and io2 charge per provisioned IOPS regardless of actual IOPS consumed. Right-size to actual peak IOPS. |
| EBS Snapshot lifecycle policies | Old snapshots accumulate at $0.05/GB/month. Implement Data Lifecycle Manager policies to expire snapshots. |
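The gp2 → gp3 row deserves a worked example, because gp2's coupling of IOPS to size (3 IOPS per GB) is what makes the migration so effective for IOPS-heavy volumes. The per-GB and per-IOPS rates below are assumed us-east-1 list prices:

```python
# gp2 vs gp3 monthly cost for a volume with a high IOPS requirement.
# gp2 ties IOPS to size (3 IOPS/GB), so hitting an IOPS target can force
# an oversized volume. Rates are assumed us-east-1 list prices.

GP2_GB_MONTH = 0.10
GP3_GB_MONTH = 0.08
GP3_FREE_IOPS = 3000
GP3_IOPS_MONTH = 0.005   # per provisioned IOPS above the 3000 baseline

def gp2_cost(data_gb, iops_needed):
    # gp2 must be sized to max(data, iops/3) to deliver the required IOPS.
    size = max(data_gb, iops_needed / 3)
    return size * GP2_GB_MONTH

def gp3_cost(data_gb, iops_needed):
    extra_iops = max(0, iops_needed - GP3_FREE_IOPS)
    return data_gb * GP3_GB_MONTH + extra_iops * GP3_IOPS_MONTH

# 500 GB of data that needs 9000 IOPS:
print(f"gp2: ${gp2_cost(500, 9000):.2f}/mo, gp3: ${gp3_cost(500, 9000):.2f}/mo")
```

In this example gp2 forces a 3 TB volume ($300/month) just to reach 9,000 IOPS, while gp3 stores the actual 500 GB and provisions the extra IOPS directly ($70/month).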
Cost Visibility Tools
AWS Cost Explorer
Cost Explorer is the primary cost analysis tool. It provides:
- Spend by dimension: Break down costs by service, account, region, availability zone, instance type, purchase option, or resource tag
- Time granularity: Daily, monthly, or hourly views (hourly costs available for the last 14 days)
- Forecasting: ML-based cost forecasting for the next 12 months based on historical patterns
- RI/SP coverage reports: What percentage of your EC2 usage is covered by Savings Plans or RIs? Identify uncovered usage that is paying full On-Demand rates.
- RI/SP utilization reports: Are you fully utilizing the commitments you purchased? Low utilization means wasted commitment spend.
Cost Explorer data is available within 24 hours of incurring charges. It is the right tool for trend analysis, cost attribution, and planning purchasing commitments.
AWS Budgets
Budgets set spending thresholds and alert when actual or forecasted costs approach or exceed the threshold:
| Budget Type | What It Tracks |
|---|---|
| Cost budget | Total spend in dollars |
| Usage budget | Quantity consumed (e.g., EC2 hours, S3 GB) |
| RI utilization budget | Alert when RI utilization drops below a threshold |
| Savings Plans utilization budget | Alert when SP utilization drops below a threshold |
| Savings Plans coverage budget | Alert when covered usage drops below a threshold |
Budget Actions: When a budget threshold is crossed, Budgets can automatically apply an IAM policy or SCP to restrict further spending. Example: when a developer sandbox account exceeds $500 for the month, automatically attach an SCP that prevents launching new EC2 instances. This enforces cost accountability without requiring manual intervention.
AWS Cost and Usage Report (CUR)
The CUR is the most detailed billing data source available:
- Line-item granularity: Every usage line item — individual EC2 instance-hours, S3 GET requests, data transfer GB
- Resource-level attribution: Link charges to specific resource IDs (not just service averages)
- Delivery: Delivered to an S3 bucket in Parquet or CSV format, updated multiple times daily
- Analysis: Query with Athena (for ad-hoc SQL) or load into Redshift (for complex reporting). Most third-party cost management platforms ingest the CUR as their primary data source.
CUR is the foundation for serious cost analytics. Cost Explorer is convenient but has limits on granularity and filtering. CUR has none of those limits — you can run any SQL query against the full billing data.
AWS Cost Anomaly Detection
Cost Anomaly Detection uses ML to establish a baseline spending pattern for each service, account, or tag and alerts when actual spending deviates unexpectedly:
- Monitors continuously — not threshold-based like Budgets
- Alerts on unexpected spikes before end-of-month when a manual review would catch them
- Configurable monitors: by service (e.g., monitor EC2 separately from S3), by account, by cost allocation tag (e.g., monitor each team’s tagged resources independently)
- Alert delivery: SNS, email, or Slack via SNS integration
- Example: a misconfigured Auto Scaling group launches 500 instances instead of 5 — Anomaly Detection fires within hours, long before end-of-month billing review
AWS Trusted Advisor
Trusted Advisor checks your AWS environment against best practices across five categories: cost optimization, performance, security, fault tolerance, and service quotas.
Cost optimization checks:
| Check | What It Finds |
|---|---|
| Idle EC2 instances | Instances with < 10% average CPU and low network traffic on at least 4 of the last 14 days |
| Underutilized EBS volumes | Volumes with < 1 IOPS/day for 7 days — likely unused |
| Unassociated Elastic IP addresses | EIPs not attached to any running instance — charged at $0.005/hr |
| RIs expiring soon | Reserved Instances expiring in the next 30 days — renew or allow to revert to On-Demand |
| Low RI utilization | RIs used < 80% over the past 30 days — potential waste |
Security checks (relevant to cost — a compromised account often drives unexpected spend):
| Check | What It Finds |
|---|---|
| Security groups with unrestricted access | Ports open to 0.0.0.0/0 |
| MFA on root account | Root account without MFA |
| S3 bucket permissions | Buckets with public access |
| Public RDS snapshots | Snapshots accessible to any AWS account |
Full Trusted Advisor access (all checks, programmatic access) requires a Business or Enterprise Support plan. The Developer Support plan and free tier provide access to a limited subset of checks.
Architectural Cost Patterns
Architectural decisions have more impact on cost than purchasing model choices. Choosing the wrong architecture and then buying Savings Plans for it still leaves substantial waste.
Use Managed Services (Total Cost of Ownership)
Self-managed EC2 databases appear cheaper per hour but require:
- Manual backup configuration, testing, and monitoring
- OS and database engine patching
- Multi-AZ replication setup and failover testing
- Scaling — add read replicas, resize instances, storage auto-scaling
- Expert operational knowledge
Amazon RDS costs more per hour than an equivalent EC2 instance. But it delivers automated backups, Multi-AZ failover, read replica provisioning, storage auto-scaling, and engine patching. For most organizations, the total cost of ownership including engineering time and operational risk favors RDS.
The same logic applies to ElastiCache vs self-managed Redis/Memcached, MSK vs self-managed Kafka, and OpenSearch Service vs self-managed Elasticsearch.
S3 Storage Lifecycle Management
S3 Standard charges for every byte stored every month. Not all data is accessed every month:
| S3 Storage Class | Use Case | Cost vs Standard |
|---|---|---|
| S3 Standard | Frequently accessed (daily) | Baseline |
| S3 Standard-IA | Infrequent access (monthly) | ~60% cheaper storage, retrieval fee applies |
| S3 Glacier Instant Retrieval | Archive accessed occasionally (quarterly) | ~80% cheaper, millisecond retrieval |
| S3 Glacier Flexible Retrieval | Archive, retrieval in minutes to hours | ~90% cheaper |
| S3 Glacier Deep Archive | Long-term archive, 12-hour retrieval | ~95% cheaper |
S3 Intelligent-Tiering: Automatically moves objects between Standard and Infrequent Access tiers based on access patterns. Small monthly monitoring fee per object. Eliminates the need to predict access patterns — suitable for data lakes and application storage where access patterns are unpredictable.
Lifecycle rules: Explicit time-based transitions without the monitoring fee:
- Day 0: Upload to S3 Standard
- Day 30: Transition to Standard-IA
- Day 90: Transition to Glacier Instant Retrieval
- Day 365: Transition to Glacier Deep Archive
- Day 2555 (7 years): Expire (delete)
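The schedule above can be priced out over an object's full 7-year life. Per-GB rates are assumed us-east-1 list prices, and retrieval and transition request fees are omitted for simplicity:

```python
# Total storage cost of 1 TB over the 7-year lifecycle above, versus
# leaving it in S3 Standard the whole time. Rates are assumed us-east-1
# list prices; retrieval and transition request fees are omitted.

RATES = {                 # USD per GB-month (assumed)
    "standard": 0.023,
    "standard_ia": 0.0125,
    "glacier_ir": 0.004,
    "deep_archive": 0.00099,
}

def lifecycle_cost(gb):
    # (tier, months in tier): days 0-30, 30-90, 90-365, 365-2555 -> 84 months total
    schedule = [("standard", 1), ("standard_ia", 2),
                ("glacier_ir", 9), ("deep_archive", 72)]
    return sum(gb * RATES[tier] * months for tier, months in schedule)

gb = 1024
print(f"Lifecycle policy: ${lifecycle_cost(gb):,.2f} over 7 years")
print(f"Standard only:    ${gb * RATES['standard'] * 84:,.2f} over 7 years")
```

Under these assumed rates, the lifecycle policy cuts 7-year storage cost by roughly an order of magnitude, because the object spends most of its life in Deep Archive.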
Layered Purchasing Model for Auto Scaling Workloads
A common pattern for variable workloads:
Total capacity = Savings Plan baseline + On-Demand burst + Spot for batch
Example: a web application that needs 20 instances at night and 80 instances midday
- Purchase Savings Plans covering 20 instances (always running)
- On-Demand covers midday burst from 20 to 80 instances (no commitment, flexible)
- Spot Fleet runs nightly batch processing jobs (not the web tier — Spot is for fault-tolerant work)
This approach maximizes savings on the predictable portion while retaining flexibility for the variable portion and using Spot for appropriate batch work.
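The 20-baseline / 80-peak example above can be priced out. The hourly rate and discount are hypothetical; the point is the structure of the blend, not the specific dollars:

```python
# Blended monthly cost for the 20-baseline / 80-peak example above.
# OD_RATE and SP_DISCOUNT are hypothetical illustrative figures.

OD_RATE = 0.096         # On-Demand $/hr (hypothetical)
SP_DISCOUNT = 0.40      # Savings Plan discount on the baseline (hypothetical)
HOURS = 730             # hours/month

baseline = 20 * OD_RATE * (1 - SP_DISCOUNT) * HOURS   # always on, committed
burst = 60 * OD_RATE * 8 * 30                          # 60 extra instances, ~8 midday hrs/day
blended = baseline + burst

# Same usage pattern, but paying On-Demand for everything:
all_od = 20 * OD_RATE * HOURS + burst

print(f"Blended (SP baseline + OD burst): ${blended:,.2f}/mo")
print(f"All On-Demand, same usage:        ${all_od:,.2f}/mo")
```

The savings come entirely from discounting the always-on baseline; the burst portion costs the same either way, which is exactly why it stays On-Demand.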
Eliminate Waste
| Waste Type | Detection | Remediation |
|---|---|---|
| Unattached EBS volumes | Trusted Advisor, Config rule | Snapshot + delete |
| Idle EC2 instances | Trusted Advisor, Compute Optimizer | Terminate or stop |
| Unassociated Elastic IPs | Trusted Advisor | Release |
| Old EBS snapshots | Cost Explorer, DLM audit | Expire via DLM lifecycle policy |
| Idle RDS instances | Trusted Advisor, CloudWatch (0 connections) | Stop (temporary) or delete |
| Large EC2 instances doing small work | Compute Optimizer | Right-size |
| Cross-AZ data transfer | VPC Flow Logs analysis | Collocate compute and data in same AZ |
Tagging Strategy for Cost Attribution
Without resource tags, you cannot attribute spending to business units, teams, products, or environments. Cost Explorer can break down by service but not by “which team caused this EC2 spend.”
Mandatory tag strategy:
| Tag Key | Example Values | Purpose |
|---|---|---|
| Environment | prod, staging, dev, sandbox | Separate production vs non-production costs |
| Team | platform, data, frontend, backend | Attribute spending to engineering teams |
| Project | project-x, project-y | Track project-specific costs |
| CostCenter | 1001, 1002, 1003 | Finance attribution for chargeback or showback |
Enable cost allocation tags in the Billing console after applying tags to resources. Tags appear in Cost Explorer breakdowns and in the CUR within 24 hours of activation.
Enforce tagging compliance with AWS Config custom rules — flag any EC2 instance, RDS instance, or S3 bucket missing required tags as NON_COMPLIANT. Combine with Budget Actions to stop non-tagged instances.
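The Config-rule logic described above reduces to a simple check: a resource is NON_COMPLIANT if any required tag key is missing or empty. A minimal sketch of that evaluation, using the tag keys from the table:

```python
# Minimal tag-compliance check mirroring the Config custom rule described
# above: NON_COMPLIANT if any required tag key is missing or empty.

REQUIRED_TAGS = {"Environment", "Team", "Project", "CostCenter"}

def compliance(tags):
    """Evaluate a resource's tag dict against the mandatory tag set."""
    missing = {k for k in REQUIRED_TAGS if not tags.get(k)}
    return "COMPLIANT" if not missing else f"NON_COMPLIANT (missing: {sorted(missing)})"

print(compliance({"Environment": "prod", "Team": "data",
                  "Project": "project-x", "CostCenter": "1001"}))
print(compliance({"Environment": "dev"}))
```

In a real Config custom rule, this function would run inside a Lambda evaluation handler and report the verdict back via `put_evaluations`; the core decision logic is the same.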
Cost Optimization Framework
The AWS Cost Optimization Pillar from the Well-Architected Framework defines five practices:
- Practice Cloud Financial Management: Establish a cloud cost function, define ownership, implement chargeback or showback, hold teams accountable.
- Expenditure and usage awareness: Tag resources, use Cost Explorer and CUR, set Budgets, review regularly.
- Cost-effective resources: Select the right service and instance type. Match compute to workload characteristics. Use Graviton where applicable.
- Manage demand and supply resources: Auto Scaling matches supply to demand. Avoid over-provisioning for peak — scale instead.
- Optimize over time: As AWS releases new services and instance types, evaluate whether migrating reduces cost. Revisit purchasing model commitments annually.
Applied as a cycle:
Measure: CloudWatch → Cost Explorer → CUR → Trusted Advisor
Identify: Over-provisioned resources, unoptimized purchasing, architectural waste
Act: Right-size, purchase Savings Plans, apply lifecycle policies, eliminate unused resources
Verify: Measure again — confirm cost reduction without performance impact
Repeat: Cost optimization never ends; AWS releases better options continuously