PostgreSQL – How to make pg_dump less resource-greedy

database-backup, performance, postgresql

I have configured cron to invoke pg_dump daily using the following rule:

# xyz database backups:
00 01 * * * root umask 077 && pg_dump --user=xyz_system xyz | gzip > /var/xyz/backup/db/xyz/`date -u +\%Y\%m\%dT\%H\%M\%S`.gz

Basically, it works. The database grows relatively fast, roughly exponentially (though the exponent isn't very big). Currently the gzipped dump is about 160 MB. While the database is being dumped, the system starts to crawl: the load averages I saw in top were about 200, 200, 180, and the server is barely responsive.

The first question is how to determine where the bottleneck is. Is the poor performance caused by heavy I/O operations? Is it caused by table locking issues? Maybe it is a memory issue? The output of the pg_dump command is piped to the gzip command. Is the processing sequential, i.e. is the entire dump placed in memory (a swapping problem?) and then compressed, or is it concurrent (i.e. gzip compresses what it gets and waits for more)? Could it be caused by some other factor?

The second question is how to make the dumping operation less intrusive to the main functions of the system. As far as I understand, the dump can't take too long without affecting database integrity: there are table write locks, etc. What can I do to limit the problems (or delay them, considering database growth)?

The third question: is it already time to learn about more advanced database configurations? The system works fine when database backups are not running, but maybe the dumping issue is the first symptom of problems to come?

Best Answer

Wow. That's an amazing number of questions. I will try to address some of them, but this answer is not complete yet.

how to determine where the bottleneck is.

Use top first to see what is going on during the dump. Look at each process's CPU usage and status: a status of D means "uninterruptible sleep", which almost always means waiting for I/O.
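If top points at I/O, iostat (part of the sysstat package on most Linux distributions) gives a more direct view. A rough sketch of what to run while the dump is in progress:

# Extended device statistics every 5 seconds; %util close to 100
# and large await values indicate a saturated disk:
iostat -x 5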

Is the poor performance caused by heavy I/O operations?

Yes, most probably.

Is it caused by table locking issues?

Maybe. You could query the pg_stat_activity system view to see what is going on in Postgres during the dump.
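A sketch of what that might look like (the column names assume PostgreSQL 9.2 or later; older releases used procpid and current_query instead of pid, state and query):

# Show what every session is doing while the dump runs:
psql -U xyz_system -d xyz -c "SELECT pid, usename, state, query FROM pg_stat_activity;"

# Sessions actually blocked on a lock show up in pg_locks with granted = false:
psql -U xyz_system -d xyz -c "SELECT locktype, relation::regclass AS relation, mode, granted FROM pg_locks WHERE NOT granted;"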

Maybe it is a memory issue?

Very unlikely.
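If you want to rule memory out anyway, watch the swap columns while the dump runs:

# Nonzero si/so (swap-in/swap-out per second) would mean the box is
# actually swapping; for a dump workload they should stay near zero:
vmstat 5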

The output of the pg_dump command is piped to the gzip command. Is the processing sequential, i.e. is the entire dump placed in memory (a swapping problem?)

No. gzip is a block compressor working in stream mode; it does not keep the whole input in memory.

and then compressed, or is it concurrent (i.e. gzip compresses what it gets and waits for more)?

Yes, it compresses the data block by block, writes the compressed output, and waits for more input.
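You can verify the streaming behaviour yourself; gzip's memory footprint stays small and constant regardless of how much data flows through it:

# Push 1 GB through gzip from a pipe and count the output bytes;
# watch gzip's RSS in top while this runs: it never grows.
dd if=/dev/zero bs=1M count=1024 2>/dev/null | gzip | wc -c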

Could it be caused by some other factor?

Yes.

As far as I understand, the dump can't take too long without affecting database integrity: there are table write locks, etc. What can I do to limit the problems (or delay them, considering database growth)?

The dump duration has no effect on dump integrity. Integrity is ensured because pg_dump runs the whole dump inside a single transaction at the repeatable read isolation level. There are NO table write locks for regular INSERT/UPDATE traffic; only some DDL operations (DROP TABLE, ALTER TABLE, etc.) can be blocked for the duration of the backup.
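So the load you are seeing is a resource problem, not a locking problem. On Linux you can at least keep the nightly dump from starving everything else by lowering its CPU and I/O priority. A sketch of a modified cron line (ionice -c3 selects the "idle" I/O scheduling class and only has an effect with the CFQ scheduler; gzip --fast trades some compression ratio for much less CPU; note this only deprioritizes the client side, since the postgres backend serving the dump is a separate process):

# xyz database backups, run at low CPU and I/O priority:
00 01 * * * root umask 077 && nice -n 19 ionice -c3 pg_dump --user=xyz_system xyz | nice -n 19 gzip --fast > /var/xyz/backup/db/xyz/`date -u +\%Y\%m\%dT\%H\%M\%S`.gz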

Is it already time to learn about more advanced database configurations? The system works fine when database backups are not running, but maybe the dumping issue is the first symptom of problems to come?

It's never too late. Start with http://wiki.postgresql.org/wiki/Performance_Optimization.