Cloud Cost Optimization: Beyond Reserved Instances

April 8, 2019

Cloud costs have a way of surprising organizations. The flexibility that makes cloud attractive also makes it easy to overspend. What starts as affordable experimentation becomes a significant expense as services scale.

Here’s how to optimize cloud spending systematically.

Understanding Your Bill

Cost Breakdown

Before optimizing, understand where money goes:

Typical breakdown:
├── Compute (EC2, GCE): 40-60%
├── Storage (S3, EBS): 15-25%
├── Data Transfer: 10-20%
├── Database Services: 10-20%
└── Other Services: 5-15%

Use Cost Explorer (AWS), Billing Reports (GCP), or Cost Analysis (Azure).
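To compare your own bill against the typical ranges above, it helps to reduce exported line items to a percentage per category. The following is a minimal sketch; the `bill` figures and category names are hypothetical, and a real export from any of the tools above would need its own parsing.

```python
from collections import defaultdict

def cost_breakdown(line_items):
    """Return {category: percent of total spend} from (category, cost) pairs."""
    totals = defaultdict(float)
    for category, cost in line_items:
        totals[category] += cost
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    # Largest categories first, rounded to one decimal place.
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return {category: round(100 * cost / grand_total, 1) for category, cost in ranked}

# Hypothetical monthly line items (USD), not real billing data.
bill = [
    ("Compute", 5200.0),
    ("Storage", 1900.0),
    ("Data Transfer", 1400.0),
    ("Database", 1100.0),
    ("Other", 400.0),
]

for category, pct in cost_breakdown(bill).items():
    print(f"{category}: {pct}%")
```

Categories that fall well outside the typical ranges are usually the first place to look.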

Tagging for Attribution

Without tags, cost attribution is guesswork:

# Required tags
Environment: production | staging | development
Team: platform | orders | payments
Service: api | worker | web
CostCenter: CC-12345

Enforce tagging:
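One way to enforce the required tags is a check that runs in CI or as a scheduled audit and reports resources with missing or invalid tags. This is a minimal sketch, assuming tags arrive as a plain dictionary; the function name and the rule that only `Environment` has a fixed value set are illustrative assumptions.

```python
# Required tag keys; a set restricts allowed values, None accepts any non-empty value.
REQUIRED_TAGS = {
    "Environment": {"production", "staging", "development"},
    "Team": None,
    "Service": None,
    "CostCenter": None,
}

def tag_violations(tags):
    """Return a list of human-readable problems with a resource's tags."""
    problems = []
    for key, allowed in REQUIRED_TAGS.items():
        value = tags.get(key, "").strip()
        if not value:
            problems.append(f"missing tag: {key}")
        elif allowed is not None and value not in allowed:
            problems.append(f"invalid value for {key}: {value}")
    return problems

# A compliant resource produces no violations.
print(tag_violations({
    "Environment": "production",
    "Team": "orders",
    "Service": "api",
    "CostCenter": "CC-12345",
}))
```

An empty list means the resource passes; anything else can fail the pipeline or feed a report to the owning team.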

Cost Anomaly Detection

Set up alerts for unusual spending:

# AWS Cost Anomaly Detection: one monitor that tracks spend per AWS service
aws ce create-anomaly-monitor \
    --anomaly-monitor \
    'MonitorName=SpendAnomaly,MonitorType=DIMENSIONAL,MonitorDimension=SERVICE'

Compute Optimization

Right-Sizing

Most instances are oversized:

Before: m5.2xlarge (8 vCPU, 32 GB) - $0.384/hour
Actual usage: 15% CPU, 8 GB memory
After: m5.large (2 vCPU, 8 GB) - $0.096/hour
Savings: 75%

Tools:

Process:

  1. Monitor actual utilization
  2. Compare against instance specs
  3. Right-size during maintenance windows
  4. Monitor for performance impact
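Steps 1 and 2 can be sketched as a lookup: take observed peak usage, add headroom, and pick the cheapest size that still fits. The m5.large and m5.2xlarge figures come from the example above; the m5.xlarge row and the `headroom` parameter are additions for illustration, and the prices are on-demand Linux rates that may differ by region and over time.

```python
# (name, vCPUs, memory in GB, on-demand USD/hour), cheapest first.
M5_SIZES = [
    ("m5.large", 2, 8, 0.096),
    ("m5.xlarge", 4, 16, 0.192),   # added for illustration
    ("m5.2xlarge", 8, 32, 0.384),
]

def recommend(peak_vcpus, peak_mem_gb, headroom=0.0):
    """Return (name, hourly price) of the cheapest size that fits, or None."""
    need_cpu = peak_vcpus * (1 + headroom)
    need_mem = peak_mem_gb * (1 + headroom)
    for name, vcpus, mem_gb, price in M5_SIZES:
        if vcpus >= need_cpu and mem_gb >= need_mem:
            return name, price
    return None

def savings_pct(old_price, new_price):
    """Percentage saved by moving from old_price to new_price."""
    return round(100 * (1 - new_price / old_price))

# 15% of 8 vCPUs is 1.2 vCPUs; memory usage is 8 GB.
tight = recommend(1.2, 8)                 # no headroom
safer = recommend(1.2, 8, headroom=0.2)   # 20% headroom on CPU and memory
print(tight, savings_pct(0.384, tight[1]))
print(safer, savings_pct(0.384, safer[1]))
```

With no headroom the 8 GB workload exactly fills an m5.large, which is why step 4 matters: with 20% headroom the sketch picks m5.xlarge and the saving drops to 50%.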

Spot/Preemptible Instances

60-90% savings for interruptible workloads:

# Good for spot:
- CI/CD workers
- Batch processing
- Dev/test environments
- Stateless services with multiple replicas

# Bad for spot:
- Single-instance databases
- Stateful services
- Latency-sensitive applications

Strategies:

Reserved Instances / Savings Plans

Commit for predictable workloads:

Option               Discount  Flexibility
On-Demand            0%        Maximum
Savings Plans (1yr)  25-40%    Instance family
Reserved (1yr)       30-40%    Specific instance
Reserved (3yr)       50-60%    Specific instance

When to commit:

Calculate carefully:

Break-even: ~7-9 months for 1-year
If utilization < 65%, may lose money
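The break-even figures follow from simple arithmetic: a commitment costs the discounted rate for the whole term whether or not the capacity is used. The sketch below assumes a flat discount and treats utilization as the fraction of the term the capacity would otherwise have run on-demand; both function names are illustrative.

```python
def break_even_months(discount, term_months=12):
    """Months of on-demand-equivalent usage at which the commitment pays off."""
    return term_months * (1 - discount)

def commitment_savings(on_demand_monthly, discount, utilization, term_months=12):
    """Net savings over the term versus on-demand; negative means a loss."""
    committed_cost = on_demand_monthly * (1 - discount) * term_months
    on_demand_cost = on_demand_monthly * utilization * term_months
    return on_demand_cost - committed_cost

# A 35% discount breaks even at 65% utilization, i.e. 7.8 months of a 1-year term.
print(break_even_months(0.35))
print(commitment_savings(1000, 0.35, utilization=0.5))   # below break-even: loss
print(commitment_savings(1000, 0.35, utilization=0.8))   # above break-even: gain
```

Discounts of 25% to 40% give break-even points between 9 and 7.2 months, which matches the range quoted above.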

Scheduled Scaling

Not everything runs 24/7:

# Scale down dev/staging at night
schedule:
  - cron: "0 20 * * MON-FRI"
    action: scale-to-zero
  - cron: "0 8 * * MON-FRI"
    action: scale-to-normal

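The schedule above runs 60 hours/week (12 hours x 5 days) instead of 168, about 64% fewer instance-hours. A strict 40-hour week would save about 76%.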
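The saving depends on how many hours the schedule actually leaves running, so it is worth computing rather than estimating. A small sketch, assuming the same start and stop hour on every scheduled day and that cost scales linearly with runtime:

```python
HOURS_PER_WEEK = 7 * 24  # 168

def weekly_runtime_hours(start_hour, stop_hour, days_per_week):
    """Hours per week an environment runs under a daily start/stop schedule."""
    return (stop_hour - start_hour) * days_per_week

def savings_pct(running_hours):
    """Percent of instance-hours saved versus running 24/7."""
    return round(100 * (1 - running_hours / HOURS_PER_WEEK), 1)

# 08:00 to 20:00, Monday to Friday.
print(savings_pct(weekly_runtime_hours(8, 20, 5)))   # 60 hours/week
# 09:00 to 17:00, Monday to Friday.
print(savings_pct(weekly_runtime_hours(9, 17, 5)))   # 40 hours/week
```

Storage attached to stopped instances still bills, so the real saving on the total bill is somewhat lower than the compute figure.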

Storage Optimization

Storage Classes

Use appropriate tiers:

Class                 Use Case         Price (AWS, per GB-month)
Standard              Frequent access  $0.023
Infrequent Access     Monthly access   $0.0125
Glacier               Archival         $0.004
Glacier Deep Archive  Rare access      $0.00099
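To see what tiering is worth for a given data set, multiply the volume in each class by its rate. This sketch uses the per-GB-month prices listed above and deliberately ignores retrieval fees, request charges, and minimum storage durations, all of which apply to the colder tiers; the data volumes are hypothetical.

```python
# USD per GB-month, taken from the table above.
PRICE_PER_GB_MONTH = {
    "STANDARD": 0.023,
    "STANDARD_IA": 0.0125,
    "GLACIER": 0.004,
    "DEEP_ARCHIVE": 0.00099,
}

def monthly_storage_cost(gb_by_class):
    """Monthly storage cost in USD for {storage_class: gigabytes}."""
    total = sum(PRICE_PER_GB_MONTH[cls] * gb for cls, gb in gb_by_class.items())
    return round(total, 2)

# Hypothetical 10,000 GB data set: everything in Standard versus tiered by age.
all_standard = monthly_storage_cost({"STANDARD": 10_000})
tiered = monthly_storage_cost({"STANDARD": 1_000, "STANDARD_IA": 2_000, "GLACIER": 7_000})
print(all_standard, tiered)
```

In this example tiering cuts the storage line by roughly two thirds, before any retrieval costs.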

Lifecycle Policies

Automate tier transitions:

{
  "Rules": [{
    "ID": "Archive old data",
    "Status": "Enabled",
    "Filter": {"Prefix": ""},
    "Transitions": [
      {"Days": 30, "StorageClass": "STANDARD_IA"},
      {"Days": 90, "StorageClass": "GLACIER"},
      {"Days": 365, "StorageClass": "DEEP_ARCHIVE"}
    ],
    "Expiration": {"Days": 2555}
  }]
}

Delete Unused Resources

Common orphaned resources:

Cleanup script:

# Find unattached volumes
aws ec2 describe-volumes \
    --filters "Name=status,Values=available" \
    --query 'Volumes[*].VolumeId'

# List snapshots older than ~90 days (started before the cutoff date); review before deleting
aws ec2 describe-snapshots \
    --owner-ids self \
    --query 'Snapshots[?StartTime<`2019-01-01`].SnapshotId'
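The snapshot query above hardcodes its cutoff date, which goes stale quickly. A small sketch of computing the cutoff and building the same `--query` expression for any run date; the function names are illustrative, and the output is only the query string, not a call to AWS.

```python
from datetime import date, timedelta

def snapshot_cutoff(today, max_age_days=90):
    """ISO date string; snapshots started before it exceed max_age_days."""
    return (today - timedelta(days=max_age_days)).isoformat()

def snapshot_query(today, max_age_days=90):
    """Build the JMESPath expression used with `aws ec2 describe-snapshots --query`."""
    cutoff = snapshot_cutoff(today, max_age_days)
    return f"Snapshots[?StartTime<`{cutoff}`].SnapshotId"

# For a run on the date of this post:
print(snapshot_cutoff(date(2019, 4, 8)))
print(snapshot_query(date(2019, 4, 8)))
```

Pass `date.today()` in a real job so the 90-day window moves with the calendar.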

S3 Intelligent-Tiering

Automatic tier optimization:

# Enable for buckets with unknown access patterns
aws s3api put-bucket-intelligent-tiering-configuration \
    --bucket my-bucket \
    --id tier-config \
    --intelligent-tiering-configuration \
    '{"Id": "tier-config", "Status": "Enabled", "Tierings": [...]}'

Data Transfer Optimization

Understand Transfer Costs

Free:
- Inbound data
- Same-AZ within AWS (over private IPs)

Costs money:
- Cross-AZ: $0.01/GB in each direction
- Cross-region: $0.02/GB
- Outbound to internet: $0.09/GB
- NAT Gateway: $0.045/GB processed, plus an hourly charge
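A rough estimator makes it easier to see which path dominates the transfer bill. The rates below are the per-GB figures from the list above; the path names and the traffic volumes are hypothetical, and actual rates vary by region and volume tier.

```python
# USD per GB, from the list above.
RATE_PER_GB = {
    "cross_az": 0.01,
    "cross_region": 0.02,
    "internet_egress": 0.09,
    "nat_gateway": 0.045,
}

def transfer_cost(gb_by_path):
    """Monthly transfer cost in USD for {path: gigabytes}."""
    total = sum(RATE_PER_GB[path] * gb for path, gb in gb_by_path.items())
    return round(total, 2)

# Hypothetical month: 10 TB cross-region replication plus 5 TB through a NAT Gateway.
print(transfer_cost({"cross_region": 10_000}))
print(transfer_cost({"cross_region": 10_000, "nat_gateway": 5_000}))
```

Note that traffic leaving through a NAT Gateway to the internet pays both the processing fee and the egress rate.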

VPC Endpoints

Avoid NAT Gateway for AWS services:

# S3 Gateway Endpoint (free)
aws ec2 create-vpc-endpoint \
    --vpc-id vpc-123 \
    --service-name com.amazonaws.us-east-1.s3 \
    --route-table-ids rtb-456

S3 traffic routed through the gateway endpoint bypasses the NAT Gateway, so it avoids the per-GB data processing fee.

CloudFront for Egress

CDN egress is cheaper than direct:

Direct S3 egress: $0.09/GB
CloudFront egress: $0.085/GB (first 10TB)

Plus caching reduces origin requests.

Keep Data in Region

Cross-region transfer adds up:

10 TB/month cross-region = $200/month
Keep data local when possible

Database Optimization

Right-Size RDS

Database instances are often oversized:

Aurora Serverless for Variable Workloads

Pay per ACU-second:

# Good for:
- Development databases
- Infrequently used applications
- Variable workloads with idle periods

Read Replicas vs. Scaling Up

Reads scale out cheaper than scaling up:

Option A: db.r5.4xlarge ($2.30/hr)
Option B: db.r5.large + 3 read replicas ($1.15/hr)

Application must support read/write splitting.
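Read/write splitting can be as simple as choosing an endpoint per statement. The sketch below is one naive way to do it, not a production router: the class name and hostnames are hypothetical, it only inspects the first keyword, and it ignores replication lag, transactions, and statements such as `SELECT ... FOR UPDATE` that must go to the primary.

```python
import itertools

class ReadWriteRouter:
    """Send writes to the primary and spread reads across replicas round-robin."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas) if replicas else None

    def endpoint_for(self, sql):
        """Return the endpoint that should execute this SQL statement."""
        words = sql.split()
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.primary

# Hypothetical endpoints for one writer and three readers.
router = ReadWriteRouter(
    "primary.db.example.internal",
    ["replica-1.db.example.internal",
     "replica-2.db.example.internal",
     "replica-3.db.example.internal"],
)
print(router.endpoint_for("SELECT * FROM orders WHERE id = 1"))
print(router.endpoint_for("UPDATE orders SET status = 'paid' WHERE id = 1"))
```

Many drivers and proxies offer this routing already, so check those before writing your own.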

FinOps Practices

Monthly Cost Reviews

Regular review cadence:

Weekly: Anomaly review
Monthly: Detailed cost analysis
Quarterly: Architecture review for optimization

Cost Ownership

Teams own their costs:

Showback/Chargeback

Make costs visible:

Team A:
- Compute: $5,000
- Storage: $800
- Data Transfer: $400
Total: $6,200

Trend: +12% from last month
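A report like the one above is easy to generate once costs are attributed per team. This is a minimal sketch; the function name is illustrative and the previous month's total of $5,536 is a hypothetical value chosen to reproduce the +12% trend.

```python
def showback(team, costs, previous_total):
    """Format a per-team cost summary with a month-over-month trend."""
    total = sum(costs.values())
    trend = 100 * (total - previous_total) / previous_total
    lines = [f"{team}:"]
    lines += [f"- {name}: ${amount:,.0f}" for name, amount in costs.items()]
    lines.append(f"Total: ${total:,.0f}")
    lines.append("")
    lines.append(f"Trend: {trend:+.0f}% from last month")
    return "\n".join(lines)

report = showback(
    "Team A",
    {"Compute": 5000, "Storage": 800, "Data Transfer": 400},
    previous_total=5536,  # hypothetical prior-month total
)
print(report)
```

Sending this to each team on a fixed schedule is often enough to start the ownership conversation, even before any chargeback is in place.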

Key Takeaways

Cloud cost optimization is ongoing. Measure, optimize, measure again.