PostgreSQL – Diagnosing low disk space error from PostgreSQL despite low disk usage

amazon-ec2, amazon-rds, postgresql, postgresql-9.3

I have a Django web application using a PostgreSQL 9.3 database, which occasionally throws the error:

File "/usr/local/my_site/.env/lib/python2.7/site-packages/django/db/utils.py", line 94, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/usr/local/my_site/.env/lib/python2.7/site-packages/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
OperationalError: could not write block 2432320 of temporary file: No space left on device
HINT:  Perhaps out of disk space?

It's running on EC2/RDS and I can't find anything that's low on disk space. The EC2 instance has a 9GB drive that's only 38% full. The RDS PostgreSQL database has 20GB of storage that's barely 1% used. I thought this might be a low-inode issue on the EC2 instance, but df -i shows inode usage is only 33%.

What would be causing this error?

Best Answer

You are getting this error because PostgreSQL is running out of space to write a temporary file. You have at least one query that occasionally causes the database to write out a temporary file (for example, a sort or hash spilling to disk once it exceeds work_mem) that is too large for the available space.

By default, PostgreSQL uses an empty string for the temp_tablespaces setting, which means temporary files are written into the default tablespace (i.e. your data directory). Since you are using RDS PostgreSQL, you can check what that setting says with:

select * from pg_settings where name='temp_tablespaces';
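On a stock build, including RDS, the block size should be 8192 bytes, which matters for the arithmetic below; you can confirm it yourself, since block_size is exposed as a read-only setting:

SHOW block_size;  -- typically 8192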

Given the block number in the error message and the RDS block size of 8192 bytes, it looks like you were writing out nearly 20GB of temporary files, which, not coincidentally, is the amount of storage you have for that database instance.
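You can reproduce that estimate directly in the database; pg_size_pretty just formats a byte count (note the bigint cast, since the product overflows a plain integer):

-- 2432320 blocks * 8192 bytes/block ≈ 19.9GB (about 18.6GiB)
SELECT pg_size_pretty(2432320::bigint * 8192);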

This suggests you have a pathological query that builds a temporary result that is some multiple of your database's contents. Try logging all the queries going to your database (see the AWS docs for an example) and look for the place where you are accidentally doing a cartesian join of two tables and then filtering the output (or whatever form your bad query has taken).
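Two server settings make the offending statements easy to spot in the logs; on RDS you change them through the instance's DB parameter group rather than postgresql.conf. In particular, setting log_temp_files to 0 logs every temporary file created, along with the statement that created it:

-- Check the current values (change them via the RDS parameter group)
SELECT name, setting
FROM pg_settings
WHERE name IN ('log_temp_files', 'log_min_duration_statement');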

You probably also want to set temp_file_limit to a sensible value (I would go with 4GB), but that will only make the underlying issue more visible, since you'll hit the limit sooner.
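For reference, temp_file_limit is expressed in kilobytes and defaults to -1 (no limit), so 4GB works out to 4194304; on RDS it, too, is set in the parameter group:

SHOW temp_file_limit;  -- -1 means unlimited; 4GB = 4194304 (kB)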

The real solution is to find and isolate the query that is consuming all of that temporary space. The easiest way is to capture it as SQL and then work out why the Django ORM is producing it (printing str(queryset.query) on a suspect queryset is one quick way to see the generated SQL).
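If the query runs long enough to catch in the act, you can also watch for it from the database side while the temporary files pile up (a sketch using the pg_stat_activity column names from 9.2 and later):

-- Show the longest-running active statements
SELECT pid, now() - query_start AS runtime, query
FROM pg_stat_activity
WHERE state = 'active'
ORDER BY runtime DESC;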