DevOps

RDS Storage Full: Why Your Database Ran Out of Space and How to Fix It

2026-04-22 · 9 min read

At 3 AM on a Tuesday, your application starts returning 500 errors. The database is not accepting writes. You check your RDS instance in the console: storage is at 100% utilization. Every INSERT, UPDATE, and even some SELECT queries are failing because the database engine cannot write temporary files, transaction logs, or data pages.

An RDS instance that runs out of storage is one of the most severe failures you can experience because it is immediate, total, and surprisingly difficult to recover from quickly. Unlike CPU or memory pressure, where the database slows down gradually, a full storage volume causes hard failures with no graceful degradation.

Here is how to diagnose what consumed your storage, fix it immediately, and make sure it never happens again.

The Error Messages You Will See

Depending on your database engine, the error messages differ:

PostgreSQL:

ERROR: could not extend file "base/16384/12345": No space left on device
HINT: Check free disk space.
PANIC: could not write to file "pg_wal/000000010000000000000042": No space left on device

MySQL:

ERROR 1114 (HY000): The table 'my_table' is full
ERROR 3 (HY000): Error writing file '/rdsdbdata/tmp/MY1234aB' (Errcode: 28 - No space left on device)

Step 1: Confirm Storage Is Full

Start by checking the current storage allocation and usage:

# Get instance storage details
aws rds describe-db-instances \
  --db-instance-identifier my-database \
  --query 'DBInstances[0].{
    AllocatedStorage: AllocatedStorage,
    MaxAllocatedStorage: MaxAllocatedStorage,
    StorageType: StorageType,
    Engine: Engine,
    EngineVersion: EngineVersion,
    DBInstanceStatus: DBInstanceStatus
  }'

Then check CloudWatch for the FreeStorageSpace metric:

aws cloudwatch get-metric-statistics \
  --namespace AWS/RDS \
  --metric-name FreeStorageSpace \
  --dimensions Name=DBInstanceIdentifier,Value=my-database \
  --start-time 2026-04-20T00:00:00Z \
  --end-time 2026-04-22T12:00:00Z \
  --period 3600 \
  --statistics Minimum \
  --output table

If FreeStorageSpace has dropped to zero or near zero, you have confirmed the issue. Look at the trend over the past few days — did it decline gradually or drop suddenly? A gradual decline suggests data growth or log accumulation. A sudden drop suggests a runaway query, a bulk data load, or transaction log bloat from a long-running transaction.
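The gradual-versus-sudden distinction can be automated. A minimal sketch, assuming you have already pulled daily minimum FreeStorageSpace readings (in bytes) from the CloudWatch call above; `classify_trend` and the 30% step threshold are illustrative choices, not an AWS-defined rule:

```python
# Sketch: classify a FreeStorageSpace trend as a gradual decline or a sudden drop.
def classify_trend(readings, sudden_fraction=0.3):
    """Flag 'sudden' if any single step loses more than sudden_fraction
    of the previous reading; otherwise 'gradual'."""
    for prev, curr in zip(readings, readings[1:]):
        if prev > 0 and (prev - curr) / prev > sudden_fraction:
            return "sudden"
    return "gradual"

GIB = 1024 ** 3
steady_leak = [50 * GIB, 45 * GIB, 40 * GIB, 35 * GIB]  # slow data growth
runaway     = [50 * GIB, 48 * GIB, 12 * GIB, 1 * GIB]   # e.g. a bulk load
```

A gradual result points you toward data growth or log retention; a sudden one toward the runaway-query and transaction-log causes below.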

Root Cause 1: Storage Auto-Scaling Not Enabled

This is the most common reason RDS instances run out of space: auto-scaling was never turned on. When you create an RDS instance, storage auto-scaling is not enabled by default. You specify an initial allocation (say 100GB) and that is all you get unless you manually increase it or enable auto-scaling.

Check if auto-scaling is enabled:

aws rds describe-db-instances \
  --db-instance-identifier my-database \
  --query 'DBInstances[0].MaxAllocatedStorage'

If this returns null or the same value as AllocatedStorage, auto-scaling is not enabled.

Enable it immediately:

aws rds modify-db-instance \
  --db-instance-identifier my-database \
  --max-allocated-storage 500 \
  --apply-immediately

This tells RDS it can automatically grow storage up to 500GB. RDS scales storage when free space drops below 10% of allocated storage, the low-storage condition has persisted for at least 5 minutes, and at least 6 hours have passed since the last storage modification. The scaling increment is the greatest of 10GiB, 10% of currently allocated storage, or the storage growth predicted for the next 7 hours.

Important: storage auto-scaling only increases storage. It never shrinks. Once your instance grows to 300GB, it stays at 300GB even if you delete most of the data.
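Per the AWS documentation, the increment is the greatest of 10 GiB, 10% of allocated storage, or predicted growth over the next 7 hours. A sketch of the first two terms (the predicted-growth term depends on recent metric history and is omitted here):

```python
def autoscale_increment_gib(allocated_gib):
    """Approximate RDS storage auto-scaling increment: the greater of
    10 GiB or 10% of currently allocated storage. The predicted-growth
    term from the AWS docs is intentionally omitted in this sketch."""
    return max(10, allocated_gib / 10)  # 10% expressed as /10 for exact division

# A 100 GiB instance grows by 10 GiB; a 300 GiB instance by 30 GiB.
```

This is why sizing --max-allocated-storage generously matters: small instances grow in small steps, and each step starts a new 6-hour cooldown.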

Root Cause 2: Transaction Log Bloat

Long-running transactions are a silent storage killer. While a transaction is open, the database must retain all the log records needed to roll it back. On a busy database, a single transaction that runs for hours can cause gigabytes of log accumulation.

PostgreSQL WAL Accumulation

PostgreSQL uses Write-Ahead Logging (WAL). WAL files accumulate when:

  • A long-running transaction prevents WAL recycling
  • Replication slots are falling behind (the slot retains WAL until the replica catches up)
  • wal_keep_size is set too high

Check for long-running transactions:

SELECT pid, now() - xact_start AS duration, query, state
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND state != 'idle'
ORDER BY duration DESC
LIMIT 10;

Check replication slot lag:

SELECT slot_name, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS slot_lag
FROM pg_replication_slots;

If a replication slot shows gigabytes of lag, that slot is preventing WAL cleanup. Either fix the replica that is using the slot or drop the slot if the replica is no longer needed:

SELECT pg_drop_replication_slot('my_old_slot');
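To interpret slot_lag numbers yourself: pg_wal_lsn_diff subtracts two log sequence numbers, where an LSN like 1/5A000000 encodes the high and low 32 bits of a byte position in hex. A minimal sketch of the same arithmetic, useful when you have LSNs in monitoring output rather than a live session:

```python
def lsn_to_bytes(lsn):
    """Convert a PostgreSQL LSN string like '1/5A000000' to an absolute
    byte position: the two hex halves are the high and low 32 bits."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)

def slot_lag_bytes(current_lsn, restart_lsn):
    """The same subtraction pg_wal_lsn_diff() performs on the server:
    bytes of WAL the slot is still forcing PostgreSQL to retain."""
    return lsn_to_bytes(current_lsn) - lsn_to_bytes(restart_lsn)
```

A lag of a few megabytes is normal; gigabytes of lag on an abandoned slot is WAL that can never be recycled until the slot is dropped.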

MySQL Binary Log Retention

MySQL RDS instances retain binary logs for replication and point-in-time recovery. Retention is controlled by the binlog retention hours setting; when it is NULL, RDS purges binary logs as soon as possible, but read replicas and change-data-capture tools often require a longer window.

Check current retention:

CALL mysql.rds_show_configuration;

Check binary log disk usage:

SHOW BINARY LOGS;

If binary logs are consuming significant space, reduce the retention period:

CALL mysql.rds_set_configuration('binlog retention hours', 24);

Lowering retention from a high setting (the maximum is 168 hours) to 24 hours can free up substantial storage, provided downstream replicas and CDC tools can tolerate the shorter window.

Root Cause 3: Temporary Tablespace Growth

Both PostgreSQL and MySQL use temporary storage for sorting, hash joins, and other operations that exceed working memory. A poorly optimized query that processes millions of rows can generate gigabytes of temporary files.

PostgreSQL — check temporary file usage:

SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_size
FROM pg_stat_database
WHERE temp_bytes > 0
ORDER BY temp_bytes DESC;

MySQL — check temporary table usage:

SHOW GLOBAL STATUS LIKE 'Created_tmp_disk_tables';
SHOW GLOBAL STATUS LIKE 'Created_tmp_tables';

If Created_tmp_disk_tables is a high percentage of Created_tmp_tables, queries are writing large temporary tables to disk. Identify the offending queries:

SELECT DIGEST_TEXT AS query,
  SUM_CREATED_TMP_DISK_TABLES AS tmp_disk_tables,
  SUM_CREATED_TMP_TABLES AS tmp_tables
FROM performance_schema.events_statements_summary_by_digest
WHERE SUM_CREATED_TMP_DISK_TABLES > 0
ORDER BY SUM_CREATED_TMP_DISK_TABLES DESC
LIMIT 10;
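The on-disk percentage mentioned above is a simple ratio of the two SHOW GLOBAL STATUS counters. A sketch (the ~25% investigation threshold is a rule of thumb, not a MySQL-documented limit):

```python
def disk_tmp_ratio(created_tmp_disk_tables, created_tmp_tables):
    """Fraction of temporary tables that spilled to disk, computed from
    the Created_tmp_disk_tables and Created_tmp_tables counters."""
    if created_tmp_tables == 0:
        return 0.0
    return created_tmp_disk_tables / created_tmp_tables

# Rule of thumb (an assumption, not an official threshold): investigate
# queries when more than ~25% of temporary tables are hitting disk.
```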

Root Cause 4: Table Bloat and Dead Tuples (PostgreSQL)

PostgreSQL's MVCC implementation creates dead tuples when rows are updated or deleted. The VACUUM process reclaims this space, but if VACUUM falls behind (or is blocked by long-running transactions), table bloat can consume significant storage.

Check for table bloat:

SELECT schemaname, relname,
  pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
  n_dead_tup,
  n_live_tup,
  ROUND(n_dead_tup::numeric / NULLIF(n_live_tup, 0) * 100, 2) AS dead_pct,
  last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC
LIMIT 20;

If dead_pct is above 20% for large tables, VACUUM is not keeping up. Check if autovacuum is running:

SELECT * FROM pg_stat_progress_vacuum;

You can trigger a manual VACUUM on the most bloated tables:

VACUUM (VERBOSE) my_large_table;

For extreme bloat, consider VACUUM FULL, but be aware this locks the table and rewrites it entirely — not suitable for production during peak hours.
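If you want to apply the same 20% rule programmatically, for example against pg_stat_user_tables rows pulled by a monitoring script, a minimal sketch mirroring the SQL expression above (`needs_vacuum` and its default threshold are illustrative names, not PostgreSQL settings):

```python
def dead_pct(n_dead_tup, n_live_tup):
    """Mirror of the SQL expression above: dead tuples as a percentage
    of live tuples, None when the table has no live rows (matching the
    NULLIF behavior in the query)."""
    if n_live_tup == 0:
        return None
    return round(n_dead_tup / n_live_tup * 100, 2)

def needs_vacuum(n_dead_tup, n_live_tup, threshold_pct=20):
    """Flag a table for manual VACUUM attention past the threshold."""
    pct = dead_pct(n_dead_tup, n_live_tup)
    return pct is not None and pct > threshold_pct
```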

Root Cause 5: Snapshot Storage Confusion

A common misconception: RDS snapshots do not consume your instance's allocated storage. Snapshot storage is separate and billed differently. However, there is a related gotcha: when you restore from a snapshot, the restored instance's allocated storage matches the snapshot, not any auto-scaling that occurred after the snapshot was taken.

If your original instance auto-scaled from 100GB to 300GB, but your snapshot was taken when it was at 100GB, the restored instance will have only 100GB of allocated storage. If the snapshot contains 95GB of data, you will immediately be at 95% utilization on the restored instance.

# Check snapshot size before restoring
aws rds describe-db-snapshots \
  --db-snapshot-identifier my-snapshot \
  --query 'DBSnapshots[0].{AllocatedStorage: AllocatedStorage, SnapshotCreateTime: SnapshotCreateTime}'
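Before restoring, it is worth computing what utilization the restored instance will start at, and what --allocated-storage to request so you have headroom. A sketch, with the 70% target utilization as an assumption rather than an AWS recommendation:

```python
import math

def restore_utilization_pct(data_gib, snapshot_allocated_gib):
    """Starting utilization of a restored instance: the restore inherits
    the snapshot's AllocatedStorage, not any later auto-scaling."""
    return round(data_gib / snapshot_allocated_gib * 100, 1)

def allocated_for_headroom(data_gib, target_utilization=0.7):
    """Allocated storage (GiB) needed so the data lands at the target
    utilization, e.g. 70% full on day one."""
    return math.ceil(data_gib / target_utilization)

# The scenario above: a 95 GiB dataset restored onto 100 GiB of storage
# starts at 95% utilization; ~136 GiB would bring it down to ~70%.
```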

Immediate Recovery: Increasing Storage

If your instance is currently out of space, the fastest fix is to increase the allocated storage:

aws rds modify-db-instance \
  --db-instance-identifier my-database \
  --allocated-storage 200 \
  --apply-immediately

Important caveats:

  • Storage modification can take 10-30 minutes to complete. During this time, the instance remains available but may have degraded IOPS performance.
  • You can only increase storage, never decrease it.
  • After a storage modification, you must wait at least 6 hours before modifying storage again. Plan your increase generously.
  • The instance status will show modifying, then storage-optimization, while the change completes.

Monitor the modification progress:

aws rds describe-db-instances \
  --db-instance-identifier my-database \
  --query 'DBInstances[0].{
    Status: DBInstanceStatus,
    PendingStorage: PendingModifiedValues.AllocatedStorage,
    CurrentStorage: AllocatedStorage
  }'

IOPS Impact When Storage Is Full

When an RDS volume is full, the impact goes beyond write failures. On gp2 volumes, your baseline IOPS is directly tied to storage size (3 IOPS per GB). A 100GB gp2 volume provides only 300 baseline IOPS. When the volume is full and the database is trying to write transaction logs, those 300 IOPS are consumed by retries and internal operations, leaving nothing for actual queries.

On gp3 volumes, IOPS are configurable independently of storage, but a full volume still causes write failures regardless of available IOPS.

# Check current IOPS configuration
aws rds describe-db-instances \
  --db-instance-identifier my-database \
  --query 'DBInstances[0].{
    StorageType: StorageType,
    Iops: Iops,
    AllocatedStorage: AllocatedStorage
  }'

If you are on gp2 with a small volume, migrating to gp3 gives you 3000 baseline IOPS regardless of volume size and allows you to scale IOPS independently.
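The gp2 and gp3 baselines described above can be sketched as functions, handy when deciding whether a storage increase alone will also solve an IOPS problem (the gp2 floor of 100 and ceiling of 16,000 IOPS follow the EBS documentation):

```python
def gp2_baseline_iops(size_gib):
    """gp2 baseline: 3 IOPS per GiB, floored at 100 and capped at 16,000."""
    return max(100, min(16_000, 3 * size_gib))

def gp3_baseline_iops(size_gib):
    """gp3 baseline: a flat 3,000 IOPS regardless of volume size;
    additional IOPS are provisioned independently of storage."""
    return 3_000

# A 100 GiB volume: 300 baseline IOPS on gp2, 3,000 on gp3.
```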

Prevention: Monitoring and Alarms

After recovering from a storage full event, implement these safeguards:

CloudWatch Alarm for Low Storage

aws cloudwatch put-metric-alarm \
  --alarm-name "rds-my-database-low-storage" \
  --namespace AWS/RDS \
  --metric-name FreeStorageSpace \
  --dimensions Name=DBInstanceIdentifier,Value=my-database \
  --statistic Minimum \
  --period 300 \
  --evaluation-periods 3 \
  --threshold 5368709120 \
  --comparison-operator LessThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789:alerts \
  --alarm-description "Alert when RDS free storage drops below 5GB"

Set Multiple Thresholds

Create alarms at 20%, 10%, and 5% of allocated storage. The 20% alarm gives you time to investigate. The 10% alarm means you need to act. The 5% alarm means you are about to have an outage.
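FreeStorageSpace is reported in bytes, so each percentage threshold has to be converted before it goes into --threshold. A small sketch of that conversion:

```python
GIB = 1024 ** 3

def storage_alarm_thresholds(allocated_gib, pcts=(20, 10, 5)):
    """FreeStorageSpace alarm thresholds in bytes (the unit CloudWatch
    reports this metric in) at each percentage of allocated storage."""
    return {p: allocated_gib * GIB * p // 100 for p in pcts}

# For a 100 GiB instance, the 5% threshold is 5 GiB = 5368709120 bytes,
# the same value used in the put-metric-alarm command above.
```

Recompute and update the alarms after any storage increase, manual or auto-scaled, or the percentage thresholds silently drift.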

Enable Enhanced Monitoring

aws rds modify-db-instance \
  --db-instance-identifier my-database \
  --monitoring-interval 60 \
  --monitoring-role-arn arn:aws:iam::123456789:role/rds-monitoring-role \
  --apply-immediately

Enhanced Monitoring shows OS-level metrics including disk I/O, which helps you understand whether storage pressure is coming from data growth, temporary files, or transaction logs.

Regular Maintenance

  • Schedule weekly reviews of table sizes and bloat metrics
  • Set up automated cleanup of old data using database-level partitioning and partition drops
  • Monitor replication slot lag if you use logical replication
  • Review binary log retention settings quarterly

When Storage Issues Signal Deeper Problems

Repeated storage pressure often indicates architectural issues: tables that grow without bound because there is no data lifecycle policy, queries that generate excessive temporary files because they lack proper indexes, or replication setups that silently fall behind.

We help teams design database architectures that scale predictably, implement proper monitoring, and build data lifecycle policies that keep storage growth manageable. Contact us for a free AWS consultation and let us review your database setup before the next storage emergency.

Need help with your AWS infrastructure?

Book a free 30-minute consultation to discuss your challenges.