Overview
Cloud costs are not fixed. AWS charges based on what you provision and consume. A well-architected environment running the same workload as a poorly architected one might cost 30–60% less — not from negotiation, but from engineering decisions.
Unmanaged cloud spending grows in two ways: proportionally (more usage, more cost) and superlinearly (poor architectural choices amplify cost faster than usage grows). A team that sends all inter-AZ traffic through a NAT Gateway, stores cold data in S3 Standard, and runs oversized On-Demand instances will pay dramatically more than a team that avoids each of these pitfalls.
Cost optimization is not a one-time activity performed before a launch. It is an ongoing engineering discipline — measure, right-size, select the right purchasing model, eliminate waste, repeat.
AWS Pricing Fundamentals
AWS pricing operates across three cost dimensions for almost every service:
| Dimension | Description | Examples |
|---|---|---|
| Compute | Instance hours, vCPU-seconds, request counts | EC2 instance-hours, Lambda GB-seconds, API Gateway requests |
| Storage | GB-month stored, IOPS provisioned, requests made | S3 GB-month, EBS GB-month, EBS provisioned IOPS |
| Data transfer | Bytes moved between services, regions, or the internet | EC2 egress to internet, cross-region replication, cross-AZ traffic |
These three dimensions interact. An architecture that minimizes compute might increase data transfer costs. Understanding all three before making design decisions prevents cost surprises.
Pricing Models
AWS offers five pricing models for compute. Most workloads use a combination:
| Model | Discount vs On-Demand | Commitment | Use Case |
|---|---|---|---|
| On-Demand | 0% — baseline price | None | Unpredictable workloads, short experiments, initial sizing |
| Savings Plans | 20–66% | 1 or 3 year $/hr commitment | Predictable baseline — most common choice |
| Reserved Instances | 40–72% | 1 or 3 year instance commitment | Predictable workloads requiring a specific instance type |
| Spot Instances | Up to 90% | None (interruptible) | Fault-tolerant batch, stateless workers, CI/CD |
| Free Tier | 100% up to limits | None | Exploration, learning, small projects |
AWS Free Tier
Two categories of free tier exist:
12-month free tier (expires 12 months after account creation):
- EC2: 750 hours/month of t2.micro or t3.micro
- S3: 5 GB of storage + 20,000 GET requests + 2,000 PUT requests
- RDS: 750 hours/month of db.t2.micro or db.t3.micro, 20 GB storage
Always-free (never expires, regardless of account age):
- Lambda: 1 million requests/month + 400,000 GB-seconds compute
- DynamoDB: 25 GB storage + 25 WCU + 25 RCU (provisioned)
- CloudFront: 1 TB data transfer out + 10 million HTTP/HTTPS requests
- CloudWatch: 10 custom metrics + 10 alarms + 1 million API requests
Always-free tiers make Lambda, DynamoDB, and CloudFront genuinely free for low-traffic applications — not just during a trial period.
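A quick calculation shows how generous the Lambda always-free tier is for a low-traffic application. The workload numbers below (invocations, duration, memory) are illustrative assumptions, not figures from the text:

```python
# Estimate whether a hypothetical low-traffic app fits inside Lambda's
# always-free tier (1M requests + 400,000 GB-seconds per month).

FREE_REQUESTS = 1_000_000          # requests/month, always-free
FREE_COMPUTE_GB_S = 400_000        # GB-seconds/month, always-free

def lambda_usage_gb_seconds(invocations, avg_duration_ms, memory_mb):
    """Monthly Lambda compute usage in GB-seconds."""
    return invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)

# Assumed workload: 500k invocations/month, 200 ms average, 128 MB memory
usage = lambda_usage_gb_seconds(500_000, 200, 128)
print(f"Compute used: {usage:,.0f} GB-seconds of {FREE_COMPUTE_GB_S:,} free")
print("Within always-free tier:", 500_000 <= FREE_REQUESTS and usage <= FREE_COMPUTE_GB_S)
```

At 128 MB and 200 ms per call, half a million monthly invocations consume only 12,500 GB-seconds — about 3% of the free compute allowance.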
Data Transfer Costs — The Hidden Expense
Data transfer is frequently the most surprising AWS cost. The rules are asymmetric and non-obvious:
| Traffic Path | Cost |
|---|---|
| Inbound (ingress) from internet to AWS | Free |
| Outbound (egress) from EC2/S3 to internet | Charged per GB — first 100 GB/month free, then tiered |
| Between AWS regions | Charged per GB in both directions |
| Between AZs in the same region | Charged per GB ($0.01/GB each direction) |
| Within the same AZ (private IP) | Free |
| CloudFront to internet (egress) | Charged per GB — cheaper rate than EC2 direct egress |
| S3 to CloudFront in same region | Free (CloudFront origin fetch from S3 is free) |
The cross-AZ charge catches many teams off guard. An EC2 instance in us-east-1a querying an RDS instance in us-east-1b pays $0.01/GB in each direction — $0.02/GB round trip. For a high-throughput database application, this adds up quickly. Solutions:
- Deploy EC2 and RDS in the same AZ for latency-sensitive applications (accept the reduced fault tolerance, or replicate to another AZ and use read replicas locally)
- Use a Multi-AZ RDS with the writer in the same AZ as the primary application tier
- Cache aggressively with ElastiCache deployed in the same AZ as the application
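The round-trip math above can be sketched directly. The $0.01/GB each-direction rate is the standard cross-AZ charge cited in the table; the daily traffic volume is a made-up example:

```python
# Rough monthly cost of cross-AZ traffic between an app tier and its database.
# Assumes symmetric traffic in both directions for simplicity.

CROSS_AZ_RATE_PER_GB = 0.01  # USD, charged in each direction

def cross_az_monthly_cost(gb_per_day_each_way, days=30):
    # Round trip pays the rate twice: once outbound, once for the response.
    return gb_per_day_each_way * 2 * CROSS_AZ_RATE_PER_GB * days

# Hypothetical example: 500 GB/day of query traffic each way between EC2 and RDS
print(f"${cross_az_monthly_cost(500):,.2f}/month")
```

A workload moving 500 GB/day each way pays $300/month purely for AZ boundary crossings — cost that disappears entirely if compute and database share an AZ.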
CloudFront reduces egress costs: CloudFront’s per-GB egress rate is lower than EC2 direct egress. For high-traffic content (static assets, APIs, downloads), serving through CloudFront reduces both latency and cost simultaneously. S3 → CloudFront → internet is typically the cheapest path for static content delivery.
EC2 Purchasing Options in Depth
On-Demand
Highest per-hour price. No commitment, no interruption risk. On-Demand is the right choice for:
- Workloads where duration or instance type is unknown
- Short-lived experiments or proof-of-concept
- Burst capacity above committed baseline
- Any workload running less than the break-even period for a Savings Plan commitment (~30–40% of the year)
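The break-even figure in the last bullet follows from simple arithmetic: a commitment bills the discounted rate for every hour of the term, whether or not the instance runs, so the commitment pays off only above a utilization threshold of (1 − discount). A minimal sketch:

```python
# Break-even utilization for a commitment: below this fraction of the year,
# paying On-Demand only for the hours you actually run is cheaper.
# Discount figures are illustrative; check current AWS pricing for real rates.

def breakeven_utilization(discount):
    """Fraction of the year an instance must run for a commitment to pay off.
    The commitment bills (1 - discount) * on_demand_rate for every hour of
    the term, running or not."""
    return 1.0 - discount

for d in (0.54, 0.66):
    print(f"{d:.0%} discount -> break-even at {breakeven_utilization(d):.0%} utilization")
```

Discounts in the 60–70% range give break-even points around 30–40% of the year, which is where the rule of thumb above comes from.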
Savings Plans (Preferred Over Reserved Instances for Most Use Cases)
Savings Plans commit to a consistent dollar-per-hour spend in exchange for a discount. They are more flexible than Reserved Instances because they apply automatically to matching usage — you do not commit to a specific instance type.
Compute Savings Plans:
- Applies to EC2 (any instance family, any size, any region, any OS), AWS Fargate, and Lambda
- Maximum flexibility — a commitment made for x86 instances automatically covers ARM (Graviton) instances
- Discount: approximately 54% off On-Demand for a 3-year no-upfront commitment
EC2 Instance Savings Plans:
- Applies to a specific instance family in a specific region (e.g., m5 in us-east-1)
- Covers all sizes (m5.large through m5.24xlarge) and all OS within that family
- Higher discount: approximately 66% off On-Demand for 3-year no-upfront
- Less flexible than Compute Savings Plans — does not cover Fargate or Lambda
Payment options: No Upfront (monthly payments, smallest commitment, smallest discount), Partial Upfront (some paid upfront, rest monthly, medium discount), All Upfront (full 1 or 3 year commitment paid upfront, maximum discount).
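To compare payment options fairly, amortize any upfront payment across the full term and compute an effective hourly rate. The dollar quotes below are hypothetical, not AWS prices:

```python
# Amortized effective hourly rate for a commitment, combining an upfront
# payment with monthly charges. Dollar figures are hypothetical examples.

HOURS_PER_YEAR = 8760

def effective_hourly(upfront, monthly, term_years):
    total = upfront + monthly * 12 * term_years
    return total / (HOURS_PER_YEAR * term_years)

# Hypothetical quotes for the same 3-year commitment:
print(f"No Upfront:  ${effective_hourly(0, 35.0, 3):.4f}/hr")
print(f"All Upfront: ${effective_hourly(1150.0, 0, 3):.4f}/hr")
```

The All Upfront option's lower effective rate is the compensation for paying the full term in advance; whether that trade is worth it depends on your cost of capital.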
Reserved Instances
Reserved Instances (RIs) commit to a specific instance configuration in exchange for a significant discount. Two types:
| RI Type | Flexibility | Discount | Notes |
|---|---|---|---|
| Standard RI | Locked to family, size, region, OS, tenancy | Up to 72% | Can be listed on the RI Marketplace to sell unused capacity |
| Convertible RI | Can exchange for different family/size/OS/region | Up to 66% | Cannot sell on Marketplace |
RIs are most valuable for:
- Specific instance types that you know will run continuously (e.g., a dedicated database server)
- Situations where you need the ability to sell unused capacity (Standard RI Marketplace)
- Older accounts or tooling that manages RIs specifically
For most greenfield workloads, Savings Plans are preferred because they automatically apply across instance families, regions, and services as your architecture evolves.
Spot Instances
Spot Instances use AWS’s spare EC2 capacity at up to 90% discount. The trade-off: AWS can reclaim a Spot Instance with a 2-minute warning when the capacity is needed for On-Demand customers.
Designing for Spot:
| Design Principle | Implementation |
|---|---|
| Stateless workers | Store state in S3, DynamoDB, or SQS — not on the instance |
| Diversification | Request multiple instance types and AZs via Spot Fleet — if one pool is reclaimed, others continue |
| Interruption handling | Use the 2-minute notice (available via instance metadata) to checkpoint state and drain gracefully |
| Spot Advisor | Use the Spot Instance Advisor to select pools with < 5% interruption frequency |
| Mixed fleets | Combine On-Demand base capacity with Spot for burst — Auto Scaling Group with mixed instance policies |
Spot is ideal for: batch data processing, CI/CD build workers, ML training jobs, rendering farms, large-scale testing.
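The interruption-handling principle above can be sketched as a small handler. The metadata path (`/latest/meta-data/spot/instance-action`) is the real endpoint a Spot instance polls; the fetcher is injected here so the drain logic can be shown and exercised without running on EC2:

```python
# Sketch of handling the 2-minute Spot interruption notice. The fetcher
# abstracts the HTTP call to the instance metadata service at
# http://169.254.169.254/latest/meta-data/spot/instance-action.
import json

def check_interruption(fetch_metadata):
    """Return the pending action ('stop' or 'terminate'), or None.
    fetch_metadata() returns the raw response body, or None when the
    endpoint returns 404 (no interruption pending)."""
    body = fetch_metadata()
    if body is None:
        return None
    notice = json.loads(body)   # e.g. {"action": "terminate", "time": "..."}
    return notice.get("action")

# Simulated notice, as a worker's poll loop would see it:
fake = lambda: '{"action": "terminate", "time": "2024-01-01T12:00:00Z"}'
action = check_interruption(fake)
if action:
    print(f"Interruption notice: {action} -> checkpoint to S3/SQS and drain")
```

A real worker would poll this every few seconds and, on a notice, stop accepting new work, checkpoint in-flight state to S3 or SQS, and exit cleanly within the 2-minute window.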
Right-Sizing
Over-provisioned instances are the single most common source of EC2 waste. CPU utilization averaging 5–10% on a large instance means you are paying for 90–95% of the instance’s capacity without using it.
AWS Compute Optimizer
Compute Optimizer analyzes CloudWatch metrics (CPU, network, disk I/O, and memory if the CloudWatch Agent is installed) over a 14-day window and recommends right-sized alternatives:
- Recommends same or smaller instance types with equivalent or better performance
- Flags over-provisioned and under-provisioned instances
- Covers EC2 instances, Auto Scaling groups, ECS tasks on Fargate, Lambda functions, and EBS volumes
- Available at no additional cost (CloudWatch metrics are the input — standard monitoring costs apply)
Memory data requires the CloudWatch Agent. Without it, Compute Optimizer cannot see memory utilization and may recommend downsizing an instance that looks idle on CPU but is actually memory-bound (common for Java applications).
Graviton Instances
AWS Graviton (ARM-based) processors deliver 20–40% better price/performance than equivalent x86 instances for most general-purpose workloads:
| x86 Instance | Graviton Equivalent | Typical Savings |
|---|---|---|
| m5.large | m6g.large | ~20% |
| c5.xlarge | c6g.xlarge | ~20–30% |
| r5.2xlarge | r6g.2xlarge | ~20% |
Migration requires the application to build for ARM64. AWS-managed runtimes (Lambda, ECS on Fargate, RDS, ElastiCache) support Graviton natively with no code change — selecting a Graviton instance type in the console is sufficient. For custom application code, recompile for ARM64.
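The ~20% figure in the table is straightforward to verify from hourly rates. The prices below are assumed us-east-1 On-Demand list prices and should be checked against current AWS pricing:

```python
# Hourly price comparison of an x86 instance against its Graviton counterpart.
# Rates are assumed us-east-1 On-Demand list prices (verify before relying on them).

m5_large  = 0.096   # USD/hr, x86 (assumed)
m6g_large = 0.077   # USD/hr, Graviton (assumed)

savings = 1 - m6g_large / m5_large
print(f"m6g.large is {savings:.0%} cheaper per hour than m5.large")
```

Note this compares price only; Graviton's per-core performance gains on many workloads push the effective price/performance advantage beyond the raw hourly discount.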
Storage Right-Sizing
EBS volume right-sizing is frequently overlooked:
| Action | Savings |
|---|---|
| Migrate gp2 → gp3 | gp3 is cheaper than gp2 at the same size. gp3 IOPS and throughput are configurable independently — no need to over-size volume to get IOPS. |
| Delete unattached EBS volumes | Unattached volumes continue to accrue storage charges. Snapshot first, then delete. |
| Rightsize provisioned IOPS (io1/io2) | io1 and io2 charge per provisioned IOPS regardless of actual IOPS consumed. Right-size to actual peak IOPS. |
| EBS Snapshot lifecycle policies | Old snapshots accumulate at $0.05/GB/month. Implement Data Lifecycle Manager policies to expire snapshots. |
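The gp2 → gp3 row deserves a worked example, because gp2's coupling of IOPS to size (3 IOPS per GB) is what makes the migration so effective for IOPS-heavy volumes. The per-GB and per-IOPS rates below are assumed us-east-1 list prices:

```python
# gp2 vs gp3 monthly cost for a volume with a high IOPS requirement.
# gp2 ties IOPS to size (3 IOPS/GB), so hitting an IOPS target can force
# an oversized volume. Rates are assumed us-east-1 list prices.

GP2_GB_MONTH = 0.10
GP3_GB_MONTH = 0.08
GP3_FREE_IOPS = 3000
GP3_IOPS_MONTH = 0.005   # per provisioned IOPS above the 3000 baseline

def gp2_cost(data_gb, iops_needed):
    # gp2 must be sized to max(data, iops/3) to deliver the required IOPS.
    size = max(data_gb, iops_needed / 3)
    return size * GP2_GB_MONTH

def gp3_cost(data_gb, iops_needed):
    extra_iops = max(0, iops_needed - GP3_FREE_IOPS)
    return data_gb * GP3_GB_MONTH + extra_iops * GP3_IOPS_MONTH

# 500 GB of data that needs 9000 IOPS:
print(f"gp2: ${gp2_cost(500, 9000):.2f}/mo, gp3: ${gp3_cost(500, 9000):.2f}/mo")
```

In this example gp2 forces a 3 TB volume ($300/month) just to reach 9,000 IOPS, while gp3 stores the actual 500 GB and provisions the extra IOPS directly ($70/month).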
Cost Visibility Tools
AWS Cost Explorer
Cost Explorer is the primary cost analysis tool. It provides:
- Spend by dimension: Break down costs by service, account, region, availability zone, instance type, purchase option, or resource tag
- Time granularity: Daily, monthly, or hourly views (hourly costs available for the last 14 days)
- Forecasting: ML-based cost forecasting for the next 12 months based on historical patterns
- RI/SP coverage reports: What percentage of your EC2 usage is covered by Savings Plans or RIs? Identify uncovered usage that is paying full On-Demand rates.
- RI/SP utilization reports: Are you fully utilizing the commitments you purchased? Low utilization means wasted commitment spend.
Cost Explorer data is available within 24 hours of incurring charges. It is the right tool for trend analysis, cost attribution, and planning purchasing commitments.
AWS Budgets
Budgets set spending thresholds and alert when actual or forecasted costs approach or exceed the threshold:
| Budget Type | What It Tracks |
|---|---|
| Cost budget | Total spend in dollars |
| Usage budget | Quantity consumed (e.g., EC2 hours, S3 GB) |
| RI utilization budget | Alert when RI utilization drops below a threshold |
| Savings Plans utilization budget | Alert when SP utilization drops below a threshold |
| Savings Plans coverage budget | Alert when covered usage drops below a threshold |
Budget Actions: When a budget threshold is crossed, Budgets can automatically apply an IAM policy or SCP to restrict further spending. Example: when a developer sandbox account exceeds $500 for the month, automatically attach an SCP that prevents launching new EC2 instances. This enforces cost accountability without requiring manual intervention.
AWS Cost and Usage Report (CUR)
The CUR is the most detailed billing data source available:
- Line-item granularity: Every usage line item — individual EC2 instance-hours, S3 GET requests, data transfer GB
- Resource-level attribution: Link charges to specific resource IDs (not just service averages)
- Delivery: Delivered to an S3 bucket in Parquet or CSV format, updated multiple times daily
- Analysis: Query with Athena (for ad-hoc SQL) or load into Redshift (for complex reporting). Most third-party cost management platforms ingest the CUR as their primary data source.
CUR is the foundation for serious cost analytics. Cost Explorer is convenient but has limits on granularity and filtering. CUR has none of those limits — you can run any SQL query against the full billing data.
AWS Cost Anomaly Detection
Cost Anomaly Detection uses ML to establish a baseline spending pattern for each service, account, or tag and alerts when actual spending deviates unexpectedly:
- Monitors continuously — not threshold-based like Budgets
- Alerts on unexpected spikes before end-of-month when a manual review would catch them
- Configurable monitors: by service (e.g., monitor EC2 separately from S3), by account, by cost allocation tag (e.g., monitor each team’s tagged resources independently)
- Alert delivery: SNS, email, or Slack via SNS integration
- Example: a misconfigured Auto Scaling group launches 500 instances instead of 5 — Anomaly Detection fires within hours, long before end-of-month billing review
AWS Trusted Advisor
Trusted Advisor checks your AWS environment against best practices across five categories: cost optimization, performance, security, fault tolerance, and service quotas.
Cost optimization checks:
| Check | What It Finds |
|---|---|
| Idle EC2 instances | Instances with < 10% average CPU and low network traffic on at least 4 of the last 14 days |
| Underutilized EBS volumes | Volumes with < 1 IOPS/day for 7 days — likely unused |
| Unassociated Elastic IP addresses | EIPs not attached to any running instance — charged at $0.005/hr |
| RIs expiring soon | Reserved Instances expiring in the next 30 days — renew or allow to revert to On-Demand |
| Low RI utilization | RIs used < 80% over the past 30 days — potential waste |
Security checks (relevant to cost — a compromised account often drives unexpected spend):
| Check | What It Finds |
|---|---|
| Security groups with unrestricted access | Ports open to 0.0.0.0/0 |
| MFA on root account | Root account without MFA |
| S3 bucket permissions | Buckets with public access |
| Public RDS snapshots | Snapshots accessible to any AWS account |
Full Trusted Advisor access (all checks, programmatic access) requires a Business or Enterprise Support plan. The Developer Support plan and free tier provide access to a limited subset of checks.
Architectural Cost Patterns
Architectural decisions have more impact on cost than purchasing model choices. Choosing the wrong architecture and then buying Savings Plans for it still leaves substantial waste.
Use Managed Services (Total Cost of Ownership)
Self-managed EC2 databases appear cheaper per hour but require:
- Manual backup configuration, testing, and monitoring
- OS and database engine patching
- Multi-AZ replication setup and failover testing
- Scaling — add read replicas, resize instances, storage auto-scaling
- Expert operational knowledge
Amazon RDS costs more per hour than an equivalent EC2 instance. But it delivers automated backups, Multi-AZ failover, read replica provisioning, storage auto-scaling, and engine patching. For most organizations, the total cost of ownership including engineering time and operational risk favors RDS.
The same logic applies to ElastiCache vs self-managed Redis/Memcached, MSK vs self-managed Kafka, and OpenSearch Service vs self-managed Elasticsearch.
S3 Storage Lifecycle Management
S3 Standard charges for every byte stored every month. Not all data is accessed every month:
| S3 Storage Class | Use Case | Cost vs Standard |
|---|---|---|
| S3 Standard | Frequently accessed (daily) | Baseline |
| S3 Standard-IA | Infrequent access (monthly) | ~60% cheaper storage, retrieval fee applies |
| S3 Glacier Instant Retrieval | Archive accessed occasionally (quarterly) | ~80% cheaper, millisecond retrieval |
| S3 Glacier Flexible Retrieval | Archive, retrieval in minutes to hours | ~90% cheaper |
| S3 Glacier Deep Archive | Long-term archive, 12-hour retrieval | ~95% cheaper |
S3 Intelligent-Tiering: Automatically moves objects between Standard and Infrequent Access tiers based on access patterns. Small monthly monitoring fee per object. Eliminates the need to predict access patterns — suitable for data lakes and application storage where access patterns are unpredictable.
Lifecycle rules: Explicit time-based transitions without the monitoring fee:
- Day 0: Upload to S3 Standard
- Day 30: Transition to Standard-IA
- Day 90: Transition to Glacier Instant Retrieval
- Day 365: Transition to Glacier Deep Archive
- Day 2555 (7 years): Expire (delete)
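The schedule above can be priced out over an object's full 7-year life. Per-GB rates are assumed us-east-1 list prices, and retrieval and transition request fees are omitted for simplicity:

```python
# Total storage cost of 1 TB over the 7-year lifecycle above, versus
# leaving it in S3 Standard the whole time. Rates are assumed us-east-1
# list prices; retrieval and transition request fees are omitted.

RATES = {                 # USD per GB-month (assumed)
    "standard": 0.023,
    "standard_ia": 0.0125,
    "glacier_ir": 0.004,
    "deep_archive": 0.00099,
}

def lifecycle_cost(gb):
    # (tier, months in tier): days 0-30, 30-90, 90-365, 365-2555 -> 84 months total
    schedule = [("standard", 1), ("standard_ia", 2),
                ("glacier_ir", 9), ("deep_archive", 72)]
    return sum(gb * RATES[tier] * months for tier, months in schedule)

gb = 1024
print(f"Lifecycle policy: ${lifecycle_cost(gb):,.2f} over 7 years")
print(f"Standard only:    ${gb * RATES['standard'] * 84:,.2f} over 7 years")
```

Under these assumed rates, the lifecycle policy cuts 7-year storage cost by roughly an order of magnitude, because the object spends most of its life in Deep Archive.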
Layered Purchasing Model for Auto Scaling Workloads
A common pattern for variable workloads:
Total capacity = Savings Plan baseline + On-Demand burst + Spot for batch
Example: a web application that needs 20 instances at night and 80 instances midday
- Purchase Savings Plans covering 20 instances (always running)
- On-Demand covers midday burst from 20 to 80 instances (no commitment, flexible)
- Spot Fleet runs nightly batch processing jobs (not the web tier — Spot is for fault-tolerant work)
This approach maximizes savings on the predictable portion while retaining flexibility for the variable portion and using Spot for appropriate batch work.
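The 20-baseline / 80-peak example above can be priced out. The hourly rate and discount are hypothetical; the point is the structure of the blend, not the specific dollars:

```python
# Blended monthly cost for the 20-baseline / 80-peak example above.
# OD_RATE and SP_DISCOUNT are hypothetical illustrative figures.

OD_RATE = 0.096         # On-Demand $/hr (hypothetical)
SP_DISCOUNT = 0.40      # Savings Plan discount on the baseline (hypothetical)
HOURS = 730             # hours/month

baseline = 20 * OD_RATE * (1 - SP_DISCOUNT) * HOURS   # always on, committed
burst = 60 * OD_RATE * 8 * 30                          # 60 extra instances, ~8 midday hrs/day
blended = baseline + burst

# Same usage pattern, but paying On-Demand for everything:
all_od = 20 * OD_RATE * HOURS + burst

print(f"Blended (SP baseline + OD burst): ${blended:,.2f}/mo")
print(f"All On-Demand, same usage:        ${all_od:,.2f}/mo")
```

The savings come entirely from discounting the always-on baseline; the burst portion costs the same either way, which is exactly why it stays On-Demand.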
Eliminate Waste
| Waste Type | Detection | Remediation |
|---|---|---|
| Unattached EBS volumes | Trusted Advisor, Config rule | Snapshot + delete |
| Idle EC2 instances | Trusted Advisor, Compute Optimizer | Terminate or stop |
| Unassociated Elastic IPs | Trusted Advisor | Release |
| Old EBS snapshots | Cost Explorer, DLM audit | Expire via DLM lifecycle policy |
| Idle RDS instances | Trusted Advisor, CloudWatch (0 connections) | Stop (temporary) or delete |
| Large EC2 instances doing small work | Compute Optimizer | Right-size |
| Cross-AZ data transfer | VPC Flow Logs analysis | Collocate compute and data in same AZ |
Tagging Strategy for Cost Attribution
Without resource tags, you cannot attribute spending to business units, teams, products, or environments. Cost Explorer can break down by service but not by “which team caused this EC2 spend.”
Mandatory tag strategy:
| Tag Key | Example Values | Purpose |
|---|---|---|
| Environment | prod, staging, dev, sandbox | Separate production vs non-production costs |
| Team | platform, data, frontend, backend | Attribute spending to engineering teams |
| Project | project-x, project-y | Track project-specific costs |
| CostCenter | 1001, 1002, 1003 | Finance attribution for chargeback or showback |
Enable cost allocation tags in the Billing console after applying tags to resources. Tags appear in Cost Explorer breakdowns and in the CUR within 24 hours of activation.
Enforce tagging compliance with AWS Config custom rules — flag any EC2 instance, RDS instance, or S3 bucket missing required tags as NON_COMPLIANT. Combine with Budget Actions to stop non-tagged instances.
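The Config-rule logic described above reduces to a simple check: a resource is NON_COMPLIANT if any required tag key is missing or empty. A minimal sketch of that evaluation, using the tag keys from the table:

```python
# Minimal tag-compliance check mirroring the Config custom rule described
# above: NON_COMPLIANT if any required tag key is missing or empty.

REQUIRED_TAGS = {"Environment", "Team", "Project", "CostCenter"}

def compliance(tags):
    """Evaluate a resource's tag dict against the mandatory tag set."""
    missing = {k for k in REQUIRED_TAGS if not tags.get(k)}
    return "COMPLIANT" if not missing else f"NON_COMPLIANT (missing: {sorted(missing)})"

print(compliance({"Environment": "prod", "Team": "data",
                  "Project": "project-x", "CostCenter": "1001"}))
print(compliance({"Environment": "dev"}))
```

In a real Config custom rule, this function would run inside a Lambda evaluation handler and report the verdict back via `put_evaluations`; the core decision logic is the same.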
Cost Optimization Framework
The AWS Cost Optimization Pillar from the Well-Architected Framework defines five practices:
- Practice Cloud Financial Management: Establish a cloud cost function, define ownership, implement chargeback or showback, hold teams accountable.
- Expenditure and usage awareness: Tag resources, use Cost Explorer and CUR, set Budgets, review regularly.
- Cost-effective resources: Select the right service and instance type. Match compute to workload characteristics. Use Graviton where applicable.
- Manage demand and supply resources: Auto Scaling matches supply to demand. Avoid over-provisioning for peak — scale instead.
- Optimize over time: As AWS releases new services and instance types, evaluate whether migrating reduces cost. Revisit purchasing model commitments annually.
Applied as a cycle:
Measure: CloudWatch → Cost Explorer → CUR → Trusted Advisor
Identify: Over-provisioned resources, unoptimized purchasing, architectural waste
Act: Right-size, purchase Savings Plans, apply lifecycle policies, eliminate unused resources
Verify: Measure again — confirm cost reduction without performance impact
Repeat: Cost optimization never ends; AWS releases better options continuously