FinOps

FinOps in 30 Days: How We Cut a Fintech Startup's AWS Bill by 45%

2026-04-20 · 8 min read

A Berlin-based fintech startup reached out to me last year with a straightforward problem: their AWS bill had tripled in 18 months and was growing faster than their revenue. They were spending $33,000 per month — $400,000 annualized — and the CTO suspected at least a third of it was waste, but nobody on the team had the time or expertise to investigate.

This is a story about what we found and fixed in 30 days. The result: a 45% reduction in AWS spend, from $400K to $220K annualized, saving $180K per year. Every action we took is reproducible, and the process we established continues to prevent cost drift months later.

Week 1: Discovery and Quick Wins ($60K in Waste Identified)

The first week was entirely diagnostic. I requested read-only access to the AWS account (a properly scoped IAM role — never ask for admin access during discovery) and spent two days building a complete picture of the environment.

What I found in Cost Explorer:

The top five services by spend:

  1. EC2 — $14,200/month (43% of total)
  2. RDS — $6,100/month (18%)
  3. NAT Gateway — $3,800/month (12%)
  4. S3 — $2,900/month (9%)
  5. ElastiCache — $2,100/month (6%)

The remaining 12% was spread across Lambda, CloudWatch, ECS, and dozens of smaller services.

Immediate red flags:

  • Three non-production environments (dev, staging, QA) running 24/7 with production-sized infrastructure. Dev and QA were idle 16 hours per day on weekdays and entirely idle on weekends.
  • 47 unattached EBS volumes totaling 8.2 TB — leftovers from terminated instances. Cost: $656/month doing absolutely nothing.
  • An entire RDS Multi-AZ cluster for a reporting database that was queried once per day by a single cron job. Cost: $1,400/month.
  • 22 Elastic IPs not attached to any running instance. Cost: $80/month. Small individually, but indicative of a broader cleanup problem.
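The unattached-volume figure is easy to sanity-check. Assuming a gp3 rate of roughly $0.08 per GB-month (an illustrative us-east-1 list price; your region and volume type will differ), 8.2 TB of orphaned storage lands right at the observed cost:

```python
# Back-of-the-envelope cost of unattached EBS volumes.
# $0.08/GB-month is an assumed gp3 rate; exact pricing varies by region/type.
PRICE_PER_GB_MONTH = 0.08

def idle_volume_cost(total_gb: float) -> float:
    """Monthly cost of storage that is provisioned but attached to nothing."""
    return total_gb * PRICE_PER_GB_MONTH

print(round(idle_volume_cost(8_200)))  # 8.2 TB of leftovers -> 656 ($/month)
```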

I presented these findings on Friday of the first week with a prioritized list of quick wins. The team approved all of them, and we deleted the unattached volumes and released the unused EIPs that same afternoon. That ten-minute cleanup saved $736/month ($8,832/year).

Week 2: Right-Sizing and NAT Gateway Optimization

EC2 Fleet Right-Sizing

The production environment ran 14 EC2 instances across three auto-scaling groups. I enrolled the account in AWS Compute Optimizer and analyzed two weeks of CloudWatch CPU and memory metrics.

The findings:

| Instance | Current Type | Avg CPU | Avg Memory | Recommended |
|---|---|---|---|---|
| API servers (6x) | m5.2xlarge | 12% | 31% | m6i.xlarge |
| Worker nodes (4x) | c5.2xlarge | 8% | 22% | c6i.large |
| Admin/internal (4x) | m5.xlarge | 4% | 15% | m6i.large |

The API servers were running m5.2xlarge instances (8 vCPU, 32 GB RAM) at 12% average CPU utilization. We moved them to m6i.xlarge (4 vCPU, 16 GB RAM) — half the size, newer generation (better price-performance), and still running comfortably at 24% average CPU after the change.

Total EC2 savings from right-sizing: $5,100/month.
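For intuition, here is the arithmetic behind one of those swaps. The hourly rates are approximate on-demand list prices used purely as an illustration (assumptions, not the client's actual negotiated or regional pricing), and the $5,100 total covers all the resized groups:

```python
# Monthly saving from moving a group of instances to a smaller, newer type.
# Hourly rates below are approximate on-demand list prices (assumptions).
HOURS_PER_MONTH = 730  # AWS's standard monthly-hours convention

def group_savings(count: int, old_hourly: float, new_hourly: float) -> float:
    return count * (old_hourly - new_hourly) * HOURS_PER_MONTH

# Six API servers, m5.2xlarge (~$0.384/h) -> m6i.xlarge (~$0.192/h):
print(round(group_savings(6, 0.384, 0.192)))  # -> 841 ($/month, this group alone)
```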

Non-Production Scheduling

For the three non-production environments, we implemented an Instance Scheduler using AWS Systems Manager and EventBridge rules. Dev and QA environments now run from 7 AM to 8 PM on weekdays only. Staging runs 24/7 because the team uses it for overnight integration tests, but we right-sized those instances too.

Running dev and QA only 65 hours per week instead of 168 hours saved $2,800/month across EC2, RDS, and ElastiCache resources in those environments.
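The scheduling arithmetic is simple but worth making explicit, because the reduction applies to every hourly-billed resource in those environments:

```python
# Billed-hours reduction from running dev/QA 7 AM-8 PM, weekdays only.
WEEK_HOURS = 24 * 7            # 168 hours in a week
on_hours = (20 - 7) * 5        # 13 hours/day x 5 days = 65
reduction = 1 - on_hours / WEEK_HOURS

print(on_hours, f"{reduction:.0%}")  # -> 65 61%
```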

NAT Gateway Optimization

The $3,800/month NAT Gateway bill was the third-highest line item. The architecture had three NAT Gateways (one per AZ), which is a solid high-availability pattern, but the data processing charges were enormous because every outbound request from every container in every private subnet traversed the NAT Gateway.

We deployed VPC Interface Endpoints for:

  • Amazon ECR (api and dkr) — container image pulls were the biggest NAT Gateway consumer
  • Amazon S3 (Gateway Endpoint — free)
  • CloudWatch Logs
  • AWS STS
  • AWS Secrets Manager

After deploying the endpoints: NAT Gateway bill dropped from $3,800/month to $620/month. Net savings after endpoint costs: $2,940/month.
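The economics hinge on the per-GB rates. Using approximate us-east-1 list prices as assumptions ($0.045/GB for NAT data processing versus $0.01/GB plus roughly $0.01 per AZ-hour for an interface endpoint; the S3 Gateway Endpoint has no per-GB charge at all), the endpoints win decisively at this kind of traffic volume:

```python
# NAT Gateway vs. interface endpoints for the same private-subnet traffic.
# Rates are approximate us-east-1 list prices (assumptions).
HOURS_PER_MONTH = 730

def nat_data_cost(gb: float) -> float:
    return gb * 0.045                # NAT data-processing charge per GB

def endpoint_cost(gb: float, endpoints: int = 5, azs: int = 3) -> float:
    hourly = endpoints * azs * 0.01 * HOURS_PER_MONTH
    return gb * 0.01 + hourly        # per-GB charge plus per-AZ-hour charges

gb = 50_000  # hypothetical monthly GB pulled from ECR, CloudWatch Logs, etc.
print(round(nat_data_cost(gb)), round(endpoint_cost(gb)))  # -> 2250 610
```

At low traffic the fixed AZ-hour charges dominate, so interface endpoints are not automatically cheaper; they paid off here because container image pulls generated heavy, steady traffic.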

Reporting Database

The RDS Multi-AZ db.r5.xlarge instance used solely for daily reporting was first replaced with a Single-AZ db.t4g.medium: the reporting workload ran for 20 minutes per day and needed neither the compute power nor the high availability of the original setup. We then went a step further and pointed the daily report at a read replica of the production database, which let us eliminate the dedicated reporting instance entirely once we had confirmed the replica handled the query load without impacting production reads.

Savings: $1,400/month.

Week 3: Commitment Discounts

By this point, we had eliminated waste and right-sized the fleet. The monthly spend had dropped from $33,000 to approximately $20,100. Now it was time to lock in savings on the stable baseline.

I analyzed 60 days of usage data using the Savings Plans purchase recommendations API. The baseline was clear: 70% of the remaining compute spend was steady-state, running 24/7 with minimal variation.

We purchased:

  • 1-Year Compute Savings Plan (No Upfront) covering the steady-state compute baseline — this applied across EC2, Fargate, and Lambda usage
  • 1-Year RDS Reserved Instance for the production database cluster, which had been running the same instance type for two years and was not going to change

The Compute Savings Plan provided a 20% discount on covered usage, and the RDS Reserved Instance provided a 35% discount. Combined commitment discount savings: $3,200/month.

I recommended against 3-Year commitments at this stage. The company was growing and their architecture was likely to evolve. A 1-year commitment provides meaningful savings with lower risk. We would revisit the commitment strategy at renewal time.
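The commitment math itself is straightforward. A sketch with hypothetical covered-spend figures (these are illustrative numbers, not the client's exact mix):

```python
# Commitment-discount sketch. Covered-spend figures are hypothetical.
def discount_savings(covered_monthly: float, discount: float) -> float:
    return covered_monthly * discount

compute = discount_savings(12_000, 0.20)  # Compute Savings Plan at ~20%
rds = discount_savings(2_300, 0.35)       # RDS Reserved Instance at ~35%
print(round(compute + rds))  # -> 3205 ($/month at these assumed figures)
```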

Week 4: Process, Tagging, and Alerts

Cost optimization is not a project with an end date — it is a practice. Without ongoing processes, costs drift back up within 3-6 months. Week 4 was about building the organizational muscle to maintain the savings.

Tagging Strategy

We implemented a mandatory tagging policy with four required tags:

  • Environment — production, staging, dev, qa
  • Team — the team responsible for the resource
  • Service — the application or microservice the resource belongs to
  • CostCenter — maps to the internal budget owner

We used AWS Organizations Tag Policies to enforce these tags and created an SCP (Service Control Policy) that prevented the creation of EC2 instances, RDS instances, and ECS services without the required tags.
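Tag Policies and the SCP handle enforcement at creation time, but resources created before the policy still drift. A trivial check like the following (a sketch; a real version would iterate resources via the Resource Groups Tagging API) is enough to surface untagged resources in a weekly report:

```python
# Find which of the four mandatory tags a resource is missing.
REQUIRED_TAGS = {"Environment", "Team", "Service", "CostCenter"}

def missing_tags(tags: dict) -> list:
    """Return the required tag keys absent from a resource's tag set."""
    return sorted(REQUIRED_TAGS - tags.keys())

print(missing_tags({"Environment": "production", "Team": "payments"}))
# -> ['CostCenter', 'Service']
```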

Cost Alerts and Anomaly Detection

We configured:

  1. AWS Budgets — monthly budget of $22,000 with alerts at 80%, 90%, and 100%
  2. Cost Anomaly Detection — monitors for unexpected spend spikes by service, with a $500 daily threshold
  3. Weekly Cost Report — a Lambda function that queries Cost Explorer every Monday morning and posts a summary to the team's Slack channel with week-over-week comparison
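The formatting half of that weekly Lambda is the easy part. A sketch of the summary line (the Cost Explorer query and Slack webhook call are omitted, and the dollar figures are made up):

```python
# Week-over-week summary line for the Slack cost report.
# A real version would feed this from Cost Explorer's get_cost_and_usage.
def weekly_summary(this_week: float, last_week: float) -> str:
    change = (this_week - last_week) / last_week
    direction = "up" if change > 0 else "down"
    return f"AWS spend: ${this_week:,.0f} this week, {direction} {abs(change):.1%} WoW"

print(weekly_summary(4_575, 4_890))
# -> AWS spend: $4,575 this week, down 6.4% WoW
```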

FinOps Review Cadence

We established a bi-weekly 30-minute FinOps review meeting with the engineering leads. The agenda:

  1. Review total spend vs. budget
  2. Review any cost anomaly alerts from the past two weeks
  3. Review Compute Optimizer recommendations for new instances
  4. Check Savings Plan coverage and utilization
  5. Review untagged resources and assign owners

The Results

| Metric | Before | After | Change |
|---|---|---|---|
| Monthly AWS spend | $33,000 | $18,300 | -45% |
| Annualized spend | $400,000 | $220,000 | -$180,000 |
| EC2 instance count (prod) | 14 | 14 | Unchanged |
| Average CPU utilization | 10% | 25% | Better utilized |
| Savings Plan coverage | 0% | 72% | Committed baseline |
| Tagged resources | 34% | 96% | Enforced by policy |
| Non-prod runtime | 24/7 | Business hours | Scheduled |

The 45% reduction broke down as follows:

  • Right-sizing and scheduling: $8,300/month (50% of savings)
  • NAT Gateway optimization: $2,940/month (18%)
  • Commitment discounts: $3,200/month (19%)
  • Resource cleanup: $2,160/month (13%)

What Made This Work

Three factors made this engagement successful:

Executive sponsorship. The CTO attended the kickoff and the final review, and communicated to the team that cost optimization was a priority. Without this, right-sizing recommendations get stuck in review queues and scheduling changes get delayed indefinitely.

Data-driven decisions. Every change was backed by CloudWatch metrics, Compute Optimizer recommendations, or Cost Explorer data. We did not guess — we measured, changed, and measured again.

Process over projects. The tagging policy, cost alerts, and bi-weekly reviews are what prevent cost drift. Six months later, the client's monthly spend has stayed within 5% of the optimized baseline, even as their user base grew 30%.

Is Your AWS Bill Higher Than It Should Be?

If your monthly AWS spend exceeds $15K and you have not done a structured cost optimization exercise in the past 12 months, you are almost certainly overspending by 25-40%. The patterns I described above repeat themselves in nearly every account I audit.

I offer a free 30-minute consultation where we walk through your Cost Explorer data together and I identify the top opportunities for savings.

Book a free consultation — no commitment, no sales pitch, just an honest look at the numbers.
