AWS Cost Management & Optimization

Understanding and controlling AWS spending — how AWS pricing works, purchasing model trade-offs, visibility tools, and the architectural patterns that reduce waste.

aws, cost, pricing, savings-plans, cost-explorer, optimization

Overview

Cloud costs are not fixed. AWS charges based on what you provision and consume. A well-architected environment running the same workload as a poorly architected one might cost 30–60% less — not from negotiation, but from engineering decisions.

Unmanaged cloud spending grows in two ways: proportionally (more usage, more cost) and superlinearly (poor architectural choices amplify cost faster than usage grows). A team that sends all inter-AZ traffic through a NAT Gateway, stores cold data in S3 Standard, and runs oversized On-Demand instances will pay dramatically more than a team that addresses each of these individually.

Cost optimization is not a one-time activity performed before a launch. It is an ongoing engineering discipline — measure, right-size, select the right purchasing model, eliminate waste, repeat.


AWS Pricing Fundamentals

AWS pricing operates across three cost dimensions for almost every service:

Dimension | Description | Examples
Compute | Instance hours, vCPU-seconds, request counts | EC2 instance-hours, Lambda GB-seconds, API Gateway requests
Storage | GB-month stored, IOPS provisioned, requests made | S3 GB-month, EBS GB-month, EBS provisioned IOPS
Data transfer | Bytes moved between services, regions, or the internet | EC2 egress to internet, cross-region replication, cross-AZ traffic

These three dimensions interact. An architecture that minimizes compute might increase data transfer costs. Understanding all three before making design decisions prevents cost surprises.

Pricing Models

AWS offers five pricing models for compute. Most workloads use a combination:

Model | Discount vs On-Demand | Commitment | Use Case
On-Demand | 0% (baseline price) | None | Unpredictable workloads, short experiments, initial sizing
Savings Plans | 20–66% | 1 or 3 year $/hr commitment | Predictable baseline; most common choice
Reserved Instances | 40–72% | 1 or 3 year instance commitment | Predictable workloads requiring a specific instance type
Spot Instances | Up to 90% | None (interruptible) | Fault-tolerant batch, stateless workers, CI/CD
Free Tier | 100% up to limits | None | Exploration, learning, small projects

AWS Free Tier

Two categories of free tier exist:

12-month free tier (expires 12 months after account creation):

Always-free (never expires, regardless of account age):

Always-free tiers make Lambda, DynamoDB, and CloudFront genuinely free for low-traffic applications — not just during a trial period.


Data Transfer Costs — The Hidden Expense

Data transfer is frequently the most surprising AWS cost. The rules are asymmetric and non-obvious:

Traffic Path | Cost
Inbound (ingress) from internet to AWS | Free
Outbound (egress) from EC2/S3 to internet | Charged per GB; first 100 GB/month free, then tiered
Between AWS regions | Charged per GB in both directions
Between AZs in the same region | Charged per GB ($0.01/GB each direction)
Within the same AZ (private IP) | Free
CloudFront to internet (egress) | Charged per GB at a cheaper rate than EC2 direct egress
S3 to CloudFront in same region | Free (CloudFront origin fetch from S3 is free)

The cross-AZ charge catches many teams off guard. An EC2 instance in us-east-1a querying an RDS instance in us-east-1b pays $0.01/GB in each direction, or $0.02/GB round trip. For a high-throughput database application, this adds up quickly. The usual mitigation is to collocate compute and the data it reads in the same AZ wherever availability requirements allow.

CloudFront reduces egress costs: CloudFront’s per-GB egress rate is lower than EC2 direct egress. For high-traffic content (static assets, APIs, downloads), serving through CloudFront reduces both latency and cost simultaneously. S3 → CloudFront → internet is typically the cheapest path for static content delivery.
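To put numbers on the cross-AZ charge discussed above, here is a minimal sketch of the arithmetic. The $0.01/GB rate is the one quoted in this section; the traffic volumes and the function name are hypothetical, and the rate is left as a parameter so you can substitute whatever your bill actually shows.

```python
CROSS_AZ_RATE_PER_GB = 0.01  # USD per GB, each direction (rate quoted in the text)


def cross_az_monthly_cost(gb_sent: float, gb_received: float,
                          rate_per_gb: float = CROSS_AZ_RATE_PER_GB) -> float:
    """Monthly cross-AZ charge for traffic between two AZs in one region."""
    if gb_sent < 0 or gb_received < 0:
        raise ValueError("traffic volumes must be non-negative")
    return (gb_sent + gb_received) * rate_per_gb


if __name__ == "__main__":
    # Hypothetical workload: 2 TB of queries out, 10 TB of results back.
    cost = cross_az_monthly_cost(gb_sent=2_000, gb_received=10_000)
    print(f"Cross-AZ transfer: ${cost:,.2f} per month")
```

Running the same function with both volumes set to zero models the collocated case, which is why moving a chatty client into the database's AZ removes this line item entirely.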


EC2 Purchasing Options in Depth

On-Demand

Highest per-hour price. No commitment, no interruption risk. On-Demand is the right choice for unpredictable workloads, short experiments, and initial sizing before you commit to anything.

Savings Plans (Preferred Over Reserved Instances for Most Use Cases)

Savings Plans commit to a consistent dollar-per-hour spend in exchange for a discount. They are more flexible than Reserved Instances because they apply automatically to matching usage — you do not commit to a specific instance type.

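Compute Savings Plans: the most flexible option. The discount applies to EC2 usage regardless of instance family, size, region, OS, or tenancy, and also covers Fargate and Lambda.

EC2 Instance Savings Plans: a deeper discount in exchange for committing to one instance family in one region. Size, OS, and tenancy within that family remain flexible.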

Payment options: No Upfront (monthly payments, smallest commitment, smallest discount), Partial Upfront (some paid upfront, rest monthly, medium discount), All Upfront (full 1 or 3 year commitment paid upfront, maximum discount).
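A simplified model can help show how a dollar-per-hour commitment behaves. The sketch below assumes a single blended discount rate across all eligible usage (real Savings Plans rates vary by instance type and region), and the function name and figures are illustrative only.

```python
def hourly_bill(on_demand_usage: float, commitment: float, discount: float) -> float:
    """Approximate hourly charge under a Savings Plan.

    on_demand_usage: eligible usage this hour, priced at On-Demand rates (USD)
    commitment:      Savings Plan commitment in USD/hour, paid whether used or not
    discount:        blended fractional discount vs On-Demand (assumption), e.g. 0.30
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    # On-Demand-equivalent usage that the commitment absorbs at discounted rates.
    covered_capacity = commitment / (1 - discount)
    # Anything beyond that is billed at plain On-Demand rates.
    overflow = max(0.0, on_demand_usage - covered_capacity)
    return commitment + overflow


if __name__ == "__main__":
    for usage in (4.0, 10.0, 20.0):
        print(usage, hourly_bill(usage, commitment=5.0, discount=0.5))
```

The first case in the loop is the one to watch: when usage falls below what the commitment covers, the full commitment is still charged, which is why sizing the commitment to the always-on baseline matters.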

Reserved Instances

Reserved Instances (RIs) commit to a specific instance configuration in exchange for a significant discount. Two types:

RI Type | Flexibility | Discount | Notes
Standard RI | Locked to family, size, region, OS, tenancy | Up to 72% | Can be listed on the RI Marketplace to sell unused capacity
Convertible RI | Can exchange for different family/size/OS/tenancy within the same region | Up to 54% | Cannot sell on Marketplace

RIs are most valuable for predictable workloads that will keep running on a specific instance type for the full term.

For most greenfield workloads, Savings Plans are preferred because they automatically apply across instance families, regions, and services as your architecture evolves.

Spot Instances

Spot Instances use AWS’s spare EC2 capacity at up to 90% discount. The trade-off: AWS can reclaim a Spot Instance with a 2-minute warning when the capacity is needed for On-Demand customers.

Designing for Spot:

Design Principle | Implementation
Stateless workers | Store state in S3, DynamoDB, or SQS, not on the instance
Diversification | Request multiple instance types and AZs via Spot Fleet; if one pool is reclaimed, others continue
Interruption handling | Use the 2-minute notice (available via instance metadata) to checkpoint state and drain gracefully
Spot Advisor | Use the Spot Instance Advisor to select pools with < 5% interruption frequency
Mixed fleets | Combine On-Demand base capacity with Spot for burst, using an Auto Scaling Group with a mixed instances policy
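The interruption-handling row can be sketched as a small poller. This is a minimal sketch using only the standard library: it reads the `spot/instance-action` metadata path with an IMDSv2 session token, and it only works when run on an EC2 instance. What a worker does with the remaining seconds (checkpointing, draining) is left out, and the function names are hypothetical.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timezone
from typing import Optional

IMDS = "http://169.254.169.254/latest"


def fetch_instance_action(timeout: float = 1.0) -> Optional[str]:
    """Return the raw interruption notice, or None if none is pending (HTTP 404)."""
    token_req = urllib.request.Request(
        f"{IMDS}/api/token", method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"})
    with urllib.request.urlopen(token_req, timeout=timeout) as resp:
        token = resp.read().decode()
    req = urllib.request.Request(
        f"{IMDS}/meta-data/spot/instance-action",
        headers={"X-aws-ec2-metadata-token": token})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as err:
        if err.code == 404:  # no interruption scheduled
            return None
        raise


def seconds_until_action(notice: str, now: datetime) -> float:
    """Seconds between `now` and the action time carried in the notice."""
    doc = json.loads(notice)
    when = datetime.strptime(doc["time"], "%Y-%m-%dT%H:%M:%SZ")
    return (when.replace(tzinfo=timezone.utc) - now).total_seconds()
```

A worker would typically call `fetch_instance_action()` every few seconds and, once it returns a notice, stop taking new work and write its checkpoint to S3 before the deadline.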

Spot is ideal for: batch data processing, CI/CD build workers, ML training jobs, rendering farms, large-scale testing.


Right-Sizing

Over-provisioned instances are the single most common source of EC2 waste. CPU utilization averaging 5–10% on a large instance means you are paying for 90–95% of the instance’s capacity without using it.

AWS Compute Optimizer

Compute Optimizer analyzes CloudWatch metrics (CPU, network, disk I/O, and memory if the CloudWatch Agent is installed) over a 14-day window and recommends right-sized alternatives:

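Memory data requires the CloudWatch Agent. Without it, Compute Optimizer cannot see memory utilization, so its recommendations can be misleading for instances where CPU is low but memory is high (common for Java applications): such an instance looks over-provisioned on CPU alone even though memory is the real constraint.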
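The memory caveat can be illustrated with a toy triage rule. This is not how Compute Optimizer works internally; the thresholds and labels are assumptions chosen only to show why a missing memory metric should block a downsizing decision.

```python
from typing import Optional


def rightsizing_hint(avg_cpu_pct: float, avg_mem_pct: Optional[float],
                     cpu_low: float = 10.0, mem_low: float = 40.0) -> str:
    """Toy triage rule; cpu_low and mem_low are illustrative thresholds."""
    if avg_cpu_pct >= cpu_low:
        return "keep"
    if avg_mem_pct is None:
        # CPU is idle, but without the CloudWatch Agent we cannot rule out
        # a memory-bound workload.
        return "collect-memory-metrics"
    if avg_mem_pct >= mem_low:
        return "memory-bound"
    return "downsize-candidate"


if __name__ == "__main__":
    print(rightsizing_hint(6.0, None))   # idle CPU, memory unknown
    print(rightsizing_hint(6.0, 85.0))   # idle CPU, busy heap
    print(rightsizing_hint(6.0, 12.0))   # idle on both axes
```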

Graviton Instances

AWS Graviton (ARM-based) processors deliver 20–40% better price/performance than equivalent x86 instances for most general-purpose workloads:

x86 Instance | Graviton Equivalent | Typical Savings
m5.large | m6g.large | ~20%
c5.xlarge | c6g.xlarge | ~20–30%
r5.2xlarge | r6g.2xlarge | ~20%

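Migration requires the application to run on ARM64. Managed database and cache services (RDS, ElastiCache) need no code change: selecting a Graviton instance type is sufficient. Lambda and ECS on Fargate also support Graviton, but the function package or container image must be built for ARM64; interpreted code with no native dependencies usually runs unchanged. For custom compiled application code, recompile for ARM64.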

Storage Right-Sizing

EBS volume right-sizing is frequently overlooked:

Action | Savings
Migrate gp2 → gp3 | gp3 is cheaper than gp2 at the same size. gp3 IOPS and throughput are configurable independently, so there is no need to over-size a volume to get IOPS.
Delete unattached EBS volumes | Unattached volumes continue to accrue storage charges. Snapshot first, then delete.
Right-size provisioned IOPS (io1/io2) | io1 and io2 charge per provisioned IOPS regardless of actual IOPS consumed. Right-size to actual peak IOPS.
EBS Snapshot lifecycle policies | Old snapshots accumulate at $0.05/GB/month. Implement Data Lifecycle Manager policies to expire snapshots.
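The gp2 to gp3 row is easiest to appreciate with a worked comparison. The sketch below relies on gp2's baseline of 3 IOPS per GB and gp3's included 3,000 IOPS; all prices are example values passed as parameters and should be treated as assumptions, not current list prices. gp2 burst credits and gp3 throughput charges are ignored.

```python
import math


def gp2_size_for_iops(iops: int) -> int:
    """gp2 baseline is 3 IOPS per GB, so IOPS-driven sizing needs iops/3 GB."""
    return math.ceil(iops / 3)


def monthly_cost_gp2(data_gb: int, iops: int, price_gb: float = 0.10) -> float:
    """gp2 volume sized for whichever is larger: the data or the IOPS target."""
    return max(data_gb, gp2_size_for_iops(iops)) * price_gb


def monthly_cost_gp3(data_gb: int, iops: int, price_gb: float = 0.08,
                     price_iops: float = 0.005, included_iops: int = 3000) -> float:
    """gp3 volume sized for the data, with IOPS purchased separately."""
    extra_iops = max(0, iops - included_iops)
    return data_gb * price_gb + extra_iops * price_iops


if __name__ == "__main__":
    # Hypothetical volume: 500 GB of data that needs 6,000 sustained IOPS.
    print(f"gp2: ${monthly_cost_gp2(500, 6000):.2f}/month")
    print(f"gp3: ${monthly_cost_gp3(500, 6000):.2f}/month")
```

Under these assumed prices the gp2 volume has to be provisioned at four times the data size just to reach the IOPS target, which is where most of the difference comes from.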

Cost Visibility Tools

AWS Cost Explorer

Cost Explorer is the primary cost analysis tool. It provides:

Cost Explorer data is available within 24 hours of incurring charges. It is the right tool for trend analysis, cost attribution, and planning purchasing commitments.
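Cost Explorer is also available as an API. The following sketch builds the arguments for boto3's `get_cost_and_usage` call and totals the response by service; the helper names are hypothetical, pagination via `NextPageToken` is not handled, and the actual API call is shown only in a comment because it needs credentials.

```python
from collections import defaultdict


def build_request(start: str, end: str) -> dict:
    """Keyword arguments for Cost Explorer's get_cost_and_usage call."""
    return {
        "TimePeriod": {"Start": start, "End": end},  # End date is exclusive
        "Granularity": "MONTHLY",
        "Metrics": ["UnblendedCost"],
        "GroupBy": [{"Type": "DIMENSION", "Key": "SERVICE"}],
    }


def cost_by_service(response: dict) -> dict:
    """Sum UnblendedCost per service across all returned time periods."""
    totals = defaultdict(float)
    for period in response["ResultsByTime"]:
        for group in period["Groups"]:
            service = group["Keys"][0]
            totals[service] += float(group["Metrics"]["UnblendedCost"]["Amount"])
    return dict(totals)


# Usage (requires boto3 and AWS credentials; not executed here):
#   import boto3
#   ce = boto3.client("ce")
#   resp = ce.get_cost_and_usage(**build_request("2024-01-01", "2024-03-01"))
#   for service, usd in sorted(cost_by_service(resp).items(), key=lambda kv: -kv[1]):
#       print(f"{service}: ${usd:,.2f}")
```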

AWS Budgets

Budgets set spending thresholds and alert when actual or forecasted costs approach or exceed the threshold:

Budget Type | What It Tracks
Cost budget | Total spend in dollars
Usage budget | Quantity consumed (e.g., EC2 hours, S3 GB)
RI utilization budget | Alert when RI utilization drops below a threshold
Savings Plans utilization budget | Alert when SP utilization drops below a threshold
Savings Plans coverage budget | Alert when covered usage drops below a threshold

Budget Actions: When a budget threshold is crossed, Budgets can automatically apply an IAM policy or SCP to restrict further spending. Example: when a developer sandbox account exceeds $500 for the month, automatically attach an SCP that prevents launching new EC2 instances. This enforces cost accountability without requiring manual intervention.
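The alerting half of the sandbox example might look like the sketch below, which builds the arguments for boto3's `budgets.create_budget` call. The budget name, account ID, and email address are placeholders, and attaching the SCP itself (a Budget Action) is not shown.

```python
def sandbox_budget_request(account_id: str, limit_usd: int, email: str) -> dict:
    """Keyword arguments for budgets.create_budget: a monthly cost budget
    that notifies on both actual and forecasted spend reaching the limit."""
    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": "sandbox-monthly",  # hypothetical name
            "BudgetLimit": {"Amount": str(limit_usd), "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        "NotificationsWithSubscribers": [
            {
                "Notification": {
                    "NotificationType": kind,
                    "ComparisonOperator": "GREATER_THAN",
                    "Threshold": 100.0,  # percent of the budget limit
                    "ThresholdType": "PERCENTAGE",
                },
                "Subscribers": [{"SubscriptionType": "EMAIL", "Address": email}],
            }
            for kind in ("ACTUAL", "FORECASTED")
        ],
    }


# Usage (requires boto3 and AWS credentials; not executed here):
#   import boto3
#   boto3.client("budgets").create_budget(
#       **sandbox_budget_request("111122223333", 500, "team@example.com"))
```

Including the forecasted notification gives the team a warning before the limit is actually crossed, rather than after.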

AWS Cost and Usage Report (CUR)

The CUR is the most detailed billing data source available:

CUR is the foundation for serious cost analytics. Cost Explorer is convenient but has limits on granularity and filtering. CUR has none of those limits — you can run any SQL query against the full billing data.
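As a stand-in for the kind of SQL group-by one would run against CUR data, here is the same aggregation in plain Python over a CSV export. The column names follow the CUR naming used for Athena (`line_item_product_code`, `line_item_unblended_cost`); the tag column depends on your own tag keys and is an assumption, as is the tiny inline sample.

```python
import csv
import io
from collections import defaultdict

# Tiny hypothetical extract; a real CUR file has many more columns and rows.
SAMPLE = """line_item_product_code,resource_tags_user_team,line_item_unblended_cost
AmazonEC2,platform,12.50
AmazonEC2,data,7.25
AmazonS3,data,1.75
"""


def spend_by(rows, key_column: str,
             cost_column: str = "line_item_unblended_cost") -> dict:
    """Total unblended cost grouped by any CUR column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key_column]] += float(row[cost_column] or 0)
    return dict(totals)


if __name__ == "__main__":
    rows = list(csv.DictReader(io.StringIO(SAMPLE)))
    print(spend_by(rows, "line_item_product_code"))
    print(spend_by(rows, "resource_tags_user_team"))
```

Grouping by a tag column is the step Cost Explorer cannot do without activated cost allocation tags, which is one reason teams reach for the CUR.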

AWS Cost Anomaly Detection

Cost Anomaly Detection uses ML to establish a baseline spending pattern for each service, account, or tag and alerts when actual spending deviates unexpectedly:


AWS Trusted Advisor

Trusted Advisor checks your AWS environment against best practices across five categories: cost optimization, performance, security, fault tolerance, and service quotas.

Cost optimization checks:

Check | What It Finds
Idle EC2 instances | Instances with < 10% average CPU for 4 days and low network traffic
Underutilized EBS volumes | Volumes with < 1 IOPS/day for 7 days, so likely unused
Unassociated Elastic IP addresses | EIPs not attached to any running instance, charged at $0.005/hr
RIs expiring soon | Reserved Instances expiring in the next 30 days; renew or allow to revert to On-Demand
Low RI utilization | RIs used < 80% over the past 30 days, indicating potential waste

Security checks (relevant to cost — a compromised account often drives unexpected spend):

Check | What It Finds
Security groups with unrestricted access | Ports open to 0.0.0.0/0
MFA on root account | Root account without MFA
S3 bucket permissions | Buckets with public access
Public RDS snapshots | Snapshots accessible to any AWS account

Full Trusted Advisor access (all checks, programmatic access) requires a Business or Enterprise Support plan. The Developer and Basic Support plans provide access to a limited subset of checks.


Architectural Cost Patterns

Architectural decisions have more impact on cost than purchasing model choices. Choosing the wrong architecture and then buying Savings Plans for it still leaves substantial waste.

Use Managed Services (Total Cost of Ownership)

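Self-managed EC2 databases appear cheaper per hour but require your team to build and operate backups, failover, replication, storage scaling, and patching themselves.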

Amazon RDS costs more per hour than an equivalent EC2 instance. But it delivers automated backups, Multi-AZ failover, read replica provisioning, storage auto-scaling, and engine patching. For most organizations, the total cost of ownership including engineering time and operational risk favors RDS.

The same logic applies to ElastiCache vs self-managed Redis/Memcached, MSK vs self-managed Kafka, and OpenSearch Service vs self-managed Elasticsearch.

S3 Storage Lifecycle Management

S3 Standard charges for every byte stored every month. Not all data is accessed every month:

S3 Storage Class | Use Case | Cost vs Standard
S3 Standard | Frequently accessed (daily) | Baseline
S3 Standard-IA | Infrequent access (monthly) | ~45% cheaper storage, retrieval fee applies
S3 Glacier Instant Retrieval | Archive accessed occasionally (quarterly) | ~80% cheaper, millisecond retrieval
S3 Glacier Flexible Retrieval | Archive, retrieval in minutes to hours | ~85% cheaper
S3 Glacier Deep Archive | Long-term archive, retrieval within 12 hours | ~95% cheaper

S3 Intelligent-Tiering: Automatically moves objects between Standard and Infrequent Access tiers based on access patterns. Small monthly monitoring fee per object. Eliminates the need to predict access patterns — suitable for data lakes and application storage where access patterns are unpredictable.

Lifecycle rules: Explicit time-based transitions without the monitoring fee:
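A time-based rule of this kind could be expressed as the configuration below, in the shape boto3's `put_bucket_lifecycle_configuration` expects. The prefix, rule ID, day counts, and expiration period are illustrative assumptions; choose them from your own access patterns and retention requirements.

```python
def lifecycle_configuration(prefix: str = "logs/") -> dict:
    """Lifecycle rules: Standard -> Standard-IA -> Glacier IR -> Deep Archive."""
    return {
        "Rules": [
            {
                "ID": "tier-down-then-expire",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER_IR"},
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
                ],
                # Delete objects after roughly seven years (assumed retention).
                "Expiration": {"Days": 2555},
            }
        ]
    }


# Usage (requires boto3 and AWS credentials; not executed here):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket",
#       LifecycleConfiguration=lifecycle_configuration("logs/"))
```

Transitions only move objects toward colder classes, so the day counts must increase from one step to the next.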

Layered Purchasing Model for Auto Scaling Workloads

A common pattern for variable workloads:

Total capacity = Savings Plan baseline + On-Demand burst + Spot for batch

Example: a web application that needs 20 instances at night and 80 instances midday
- Purchase Savings Plans covering 20 instances (always running)
- On-Demand covers midday burst from 20 to 80 instances (no commitment, flexible)
- Spot Fleet runs nightly batch processing jobs (not the web tier — Spot is for fault-tolerant work)

This approach maximizes savings on the predictable portion while retaining flexibility for the variable portion and using Spot for appropriate batch work.
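The 20-to-80 instance example can be costed with a few lines of arithmetic. The hourly rate, the discount, and the length of the midday burst are hypothetical inputs; only the instance counts come from the example above.

```python
def daily_cost(baseline: int, peak: int, burst_hours: float,
               on_demand_rate: float, sp_discount: float) -> float:
    """Daily web-tier cost: baseline on Savings Plans, burst on On-Demand."""
    sp_rate = on_demand_rate * (1 - sp_discount)
    baseline_cost = baseline * 24 * sp_rate
    burst_cost = (peak - baseline) * burst_hours * on_demand_rate
    return baseline_cost + burst_cost


if __name__ == "__main__":
    # Assumed: $0.10/hr On-Demand, 30% Savings Plan discount, 8-hour midday burst.
    layered = daily_cost(20, 80, burst_hours=8, on_demand_rate=0.10, sp_discount=0.30)
    all_on_demand = daily_cost(20, 80, burst_hours=8, on_demand_rate=0.10, sp_discount=0.0)
    print(f"layered: ${layered:.2f}/day, all On-Demand: ${all_on_demand:.2f}/day")
```

Setting `baseline` higher than the true overnight floor would model an over-committed Savings Plan, since the commitment is charged for all 24 hours whether or not the instances run.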

Eliminate Waste

Waste Type | Detection | Remediation
Unattached EBS volumes | Trusted Advisor, Config rule | Snapshot + delete
Idle EC2 instances | Trusted Advisor, Compute Optimizer | Terminate or stop
Unassociated Elastic IPs | Trusted Advisor | Release
Old EBS snapshots | Cost Explorer, DLM audit | Expire via DLM lifecycle policy
Idle RDS instances | Trusted Advisor, CloudWatch (0 connections) | Stop (temporary) or delete
Large EC2 instances doing small work | Compute Optimizer | Right-size
Cross-AZ data transfer | VPC Flow Logs analysis | Collocate compute and data in same AZ

Tagging Strategy for Cost Attribution

Without resource tags, you cannot attribute spending to business units, teams, products, or environments. Cost Explorer can break down by service but not by “which team caused this EC2 spend.”

Mandatory tag strategy:

Tag Key | Example Values | Purpose
Environment | prod, staging, dev, sandbox | Separate production vs non-production costs
Team | platform, data, frontend, backend | Attribute spending to engineering teams
Project | project-x, project-y | Track project-specific costs
CostCenter | 1001, 1002, 1003 | Finance attribution for chargeback or showback

Enable cost allocation tags in the Billing console after applying tags to resources. Tags appear in Cost Explorer breakdowns and in the CUR within 24 hours of activation.

Enforce tagging compliance with AWS Config custom rules — flag any EC2 instance, RDS instance, or S3 bucket missing required tags as NON_COMPLIANT. Combine with Budget Actions to stop non-tagged instances.
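The core of such a tagging rule is a small check like the one sketched here. It is only the evaluation logic, assuming tags arrive as a plain key/value dictionary; wiring it into an AWS Config custom rule (the Lambda handler and the call that reports evaluations) is not shown.

```python
REQUIRED_TAGS = ("Environment", "Team", "Project", "CostCenter")


def missing_tags(tags: dict, required=REQUIRED_TAGS) -> list:
    """Required tag keys that are absent or have an empty value."""
    return [key for key in required if not str(tags.get(key, "")).strip()]


def compliance(tags: dict) -> str:
    """Compliance label in the form AWS Config uses for rule evaluations."""
    return "NON_COMPLIANT" if missing_tags(tags) else "COMPLIANT"


if __name__ == "__main__":
    instance_tags = {"Environment": "dev", "Team": "data", "Project": ""}
    print(compliance(instance_tags), missing_tags(instance_tags))
```

Treating an empty value as missing is a deliberate choice here: a `Project` tag with no value is no more useful for cost attribution than no tag at all.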


Cost Optimization Framework

The AWS Cost Optimization Pillar from the Well-Architected Framework defines five practices:

  1. Practice Cloud Financial Management: Establish a cloud cost function, define ownership, implement chargeback or showback, hold teams accountable.
  2. Expenditure and usage awareness: Tag resources, use Cost Explorer and CUR, set Budgets, review regularly.
  3. Cost-effective resources: Select the right service and instance type. Match compute to workload characteristics. Use Graviton where applicable.
  4. Manage demand and supply resources: Auto Scaling matches supply to demand. Avoid over-provisioning for peak — scale instead.
  5. Optimize over time: As AWS releases new services and instance types, evaluate whether migrating reduces cost. Revisit purchasing model commitments annually.

Applied as a cycle:

Measure: CloudWatch → Cost Explorer → CUR → Trusted Advisor
Identify: Over-provisioned resources, unoptimized purchasing, architectural waste
Act: Right-size, purchase Savings Plans, apply lifecycle policies, eliminate unused resources
Verify: Measure again — confirm cost reduction without performance impact
Repeat: Cost optimization never ends; AWS releases better options continuously

Cost Management Flow

Participants: Finance, Cost Explorer

  1. Monthly cost review (EC2 over-spend identified): EC2 costs up 40% vs prior month, with no corresponding traffic increase.
  2. Right-sizing report (12 instances over-provisioned): Compute Optimizer recommends m5.4xlarge → m5.xlarge (avg CPU 6%).
  3. Apply right-sizing recommendations: Resize 12 instances; migrate 8 x86 instances to Graviton equivalents.
  4. Purchase Compute Savings Plans for new baseline: Commit to $8/hr (covers ~40 m6g.xlarge equivalent) at a 54% discount.
  5. Deploy Spot Fleet for nightly batch jobs: Batch processing at a 90% discount; stateless workers, S3 checkpointing.
  6. Budget alert configured at 110% of forecast: SNS alert if monthly EC2 spend exceeds forecast, to catch anomalies early.
