PostgreSQL – downloading a big database (postgres) to a local copy

database, database-backup, postgresql, rds

I have a pretty big Postgres DB on Amazon RDS (about 9 GB zipped), and sometimes we need to copy it to our local machines to run some tests on it.

Doing a DB dump (pg_dump) and downloading it is simply too slow, and honestly it has just been getting stuck the last few times we tried.

Is there a smart way to get just parts of the DB? For example, get only the changes from the last 10 days and merge them into the local DB we have, or maybe get the DB in chunks, etc.?

I'm sure I'm not the first one with this need, but I couldn't find a decent method or tutorial explaining the best way to do it.

Thanks!

Best Answer

A 9 GB compressed dump isn't really that large. You just need to do it right (see the example commands after this list):

  • dump from an EC2 instance in the same AWS account and region as the RDS database, not over the internet;
  • use directory dump format (--format=directory or -Fd) — it's automatically compressed;
  • use parallel dump (for example --jobs=16 or -j16) and parallel restore;
  • use sslmode=disable in the connection string, or the environment variable (PGSSLMODE=disable pg_dump …), to disable SSL; some versions of AWS RDS have a 64 GB limit on SSL data over a single connection;
  • remember that an RDS database restored from a snapshot is terribly slow until all the data has been read at least once; the first read is over 10 times slower than normal because the data is pulled from archive storage (S3, maybe) on demand, so don't run this dump against a DB recently restored from a snapshot;
  • move it out of AWS only after a successful dump, in its compressed state.
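
Putting those points together, here is a rough sketch of the dump. The hostname, database name, user, password, and job count below are placeholders I made up for illustration, not values from the question; adjust them to your own setup.

    # Run this on an EC2 instance in the same AWS account and region as the RDS instance.
    # Host, user, database and job count are placeholders; substitute your own values.

    export PGSSLMODE=disable      # avoid the per-connection SSL data limit mentioned above
    export PGPASSWORD='secret'    # or keep credentials in ~/.pgpass instead

    # Parallel dump in directory format (compressed per-table files under ./mydb_dump)
    pg_dump \
      --host=mydb.xxxxxxxx.us-east-1.rds.amazonaws.com \
      --username=myuser \
      --format=directory \
      --jobs=16 \
      --file=mydb_dump \
      mydb

    # Pack the dump directory only after pg_dump has finished, then download it out of AWS
    tar czf mydb_dump.tar.gz mydb_dump

On the local machine, unpack it and restore in parallel (again, names are placeholders):

    tar xzf mydb_dump.tar.gz
    createdb mydb_local
    pg_restore --jobs=16 --dbname=mydb_local mydb_dump

Note that parallel dump (--jobs) only works with the directory format, which is why the two options go together.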