Postgresql – How to force a merge of all WAL files in pg_xlog back into the base “data” directory

bufferpostgresqltransaction-log

Question:

Is there a way to tell Postgres (9.2) to "merge all WAL files in pg_xlog back into the non-WAL data files, and then delete all WAL files successfully merged?"

I would like to be able to "force" this operation; i.e. checkpoint_segments or archiving settings should be ignored. The filesystem WAL buffer (pg_xlog) directory should be emptied, or nearly emptied. It's fine if some or all of the space consumed by the pg_xlog directory is then consumed by the data directory; our DBA has asked for file (i.e. data directory, rather than sql) database backups without any backlogged WALs, but space consumption is not a concern.

Having near-zero WAL activity during this operation is a fine constraint. I can ensure that the database server is either shut down or not connectible (zero user-generated transaction load) during this process.

Essentially, I'd like Postgres to ignore archiving/checkpoint retention policies temporarily, and flush all WAL activity to the core database files, leaving pg_xlog in the same state as if the database were recently created–with very few WAL files.

What I've Tried:

I know that the pg_basebackup utility performs something like this (it generates an almost-all-WALs-merged copy of a Postgres instance's data directory), but we aren't ready to use it on all our systems yet, as we are still testing replication settings; I'm hoping for a more short-term solution.

I've tried issuing CHECKPOINT commands, but they just recycle one WAL file and replace it with another (that is, if they do anything at all; if I issue them during database idle time, they do nothing). pg_switch_xlog() similarly just forces a switch to the next log segment; it doesn't flush all queued/buffered segments.

I've also played with the pg_resetxlog utility. That utility sort of does what I want, but all of its usage docs seem to indicate that it destroys (rather than flushing out of the transaction log and into the main data files) some or all of the WAL data. Is that impression accurate? If not, can I use pg_resetxlog during a zero-WAL-activity period to force a flush of all queued WAL data to non-WAL data? If the answer to that is negative, how can I achieve this goal?

Thanks!

Best Answer

. . . something tells me your DBA isn't a Postgres guy? :-)

Based on your comments it sounds like the closest thing to the solution you're looking for is starting up the database (using your base backup) and issuing a CHECKPOINT, then shutting down that DB and backing it up. This will flush the WAL data in the "catch-up" logs to the primary DB files and leave you with an "empty" WAL (though you'll still have a few segments hanging around that are needed to actually start the server & verify consistency).

The only other way to ensure that the backup you're grabbing has all data flushed to the main DB files is to shut down the database to make the backup.


I wouldn't advise doing either of these for static backups, which is what it sounds like you're doing. Just hang on to the backup created per the Postgres manual, and if you need to activate it start a server using it as you normally would per the manual.

I honestly can't think of a valid reason for what your DBA is requesting -- The brief startup delay while Postgres replays the log files you collected after the pg_stop_backup() command is not worth doing something strange and different rather than following the tried-and-true procedures in the manual, and the amount of testing you'd need to do to verify that any new procedure you come up with is as robust as the standard procedures makes this an unattractive option IMHO.


Obviously the procedure for slaves/streaming/hot-standby are a little different, again per the manual.
If your DBA really wants a minimum number of WAL segments I'd suggest the solution I use:

  • A slave is designated as the backup host.
  • When backup time comes we shut the slave down and take the filesystem backup
  • The slave is started up when we're done with the backup & usually catches up within 15 minutes.

Recovery from this backup is essentially the same as activating a slave -- the slave is started up and the recovery trigger file created.

There are a few tricks to setting this up - nothing that isn't covered in the manual, but obviously you want to test thoroughly.

Related Topic