Managing point-in-time recovery

The pg_dump utility is more of a traditional form of creating a backup. It makes sense for a small amount of data, but it tends to have its limitations as soon as the amount of data grows beyond a certain limit. Don't get me wrong; pg_dump works perfectly even with terabytes of data. However, let's assume you've got a dump of a 10 TB beast! Does it really make sense to replay a 10 TB database from a dump? Just consider all the indexes that have to be built, and consider the insane amount of time it will take to do that. It definitely makes sense to use a different method. This method is called point-in-time recovery (PITR), or simply xlog archiving. In this section, you will learn about PITR in detail.

How PITR works

The idea behind PITR is to take a snapshot of the data and archive the transaction log created by PostgreSQL from then on. In case of a crash, it is then possible to restore any given point in time after the initial backup (base backup) has finished. The beauty of this concept is that end users won't lose too much data in case of a crash.

In case the system crashes, administrators can return to their last base backup and replay as much data as necessary.

Preparing PostgreSQL for PITR

To configure PostgreSQL for point-in-time recovery, only three settings, which can be found in postgresql.conf, are important: wal_level, archive_mode, and archive_command. The wal_level setting tells the system to produce enough of a transaction log for recovery based on xlog. By default, PostgreSQL only produces enough xlog to repair itself in case of a crash. This is not enough.

The archive_mode setting tells the system to archive its transaction log in a safe place. The archive_command setting will tell the system how to do that.

For getting started with transaction log archiving, the first thing to do is to actually create an archive (ideally on a remote box). For the sake of simplicity, xlog will be archived to /archive in this example:

mkdir /archive

Then postgresql.conf can be changed:

wal_level = archive
archive_mode = on 
archive_command = 'cp %p /archive/%f '

Now the database can be restarted to activate those settings.

Once those settings are active, the transaction log will gradually accumulate in /archive directly. Whenever a 16 MB segment has been filled up by the system, it will be sent to the archive using archive_command. As changes come in, our /archive directory might begin to look like this:

$ ls -l
total 131072
-rw-------  1 hs  wheel  16777216 Dec  8 12:24 000000010000000000000001
-rw-------  1 hs  wheel  16777216 Dec  8 12:26 000000010000000000000002
-rw-------  1 hs  wheel  16777216 Dec  8 12:26 000000010000000000000003
-rw-------  1 hs  wheel  16777216 Dec  8 12:26 000000010000000000000004

By now, the transaction log is already archived properly.

Taking base backups

Once those xlog files are sent to the archive, it is possible to take a simple base backup. There are basically two ways of doing this:

  • SELECT pg_start_backup / pg_stop_backup: The traditional way
  • pg_basebackup: The modern way

In this chapter, both methods will be described. In the case of PITR, the traditional way will be used.

Here is how it works. First of all a simple SQL command has to be called as the superuser:

test=# SELECT pg_start_backup('some label'),
 pg_start_backup 
-----------------
 0/7000028
(1 row)

If this takes a while for you, don't worry. Behind the scenes, the system will wait for a checkpoint to happen and then return a transaction log position. Actually, nothing fancy happens behind the scenes. All it does is to tell the system where to restart replaying xlog.

Once this function has completed, it is possible to copy the entire database instance to a safe place. Keep in mind that this can be done during production, and it is not necessary to shut down the system during this operation.

Also, don't worry too much that files might vanish or appear during the backup process. This is a natural thing. Keep in mind that no change in xlog ever happens that cannot be found.

Consider the following example:

mkdir /base
cd $PGDATA
cp -Rv * /base/

You can use a simple copy, rsync, ftp, or any other means. Just ensure that the entire directory has been backed up nicely.

Once this is done, the backup process can be stopped again:

test=# SELECT pg_stop_backup();
NOTICE:  pg_stop_backup complete, all required WAL
         segments have been archived
 pg_stop_backup 
----------------
 0/70000B8
(1 row)

Remember that all of those operations can happen while the system is still active. No downtime is needed.

After the backup, a .backup file will show up in the /archive directory. It contains some important information:

$ cat 000000010000000000000007.00000028.backup
START WAL LOCATION: 0/7000028 
  (file 000000010000000000000007)
STOP WAL LOCATION: 0/70000B8 
  (file 000000010000000000000007)
CHECKPOINT LOCATION: 0/7000028
BACKUP METHOD: pg_start_backup
BACKUP FROM: master
START TIME: 2014-12-08 12:33:50 CET
LABEL: some label
STOP TIME: 2014-12-08 12:39:46 CET

The .backup file shows when the backup has been created. It will also tell us the oldest xlog file needed to perform the replay process. In this case, it is the 000000010000000000000007 file. Everything older than this file can be safely deleted from the server.

Replaying xlog

If the system is actively used, more transaction log will accumulate over time in /archive:

-rw-------  1 hs  wheel       294 Dec  8 12:39 000000010000000000000007.00000028.backup
-rw-------  1 hs  wheel  16777216 Dec  8 12:45 000000010000000000000008
-rw-------  1 hs  wheel  16777216 Dec  8 12:45 000000010000000000000009
-rw-------  1 hs  wheel  16777216 Dec  8 12:45 00000001000000000000000A

Let's assume now that the server we want to protect has died and the base backup needs to be recovered using PITR.

To make this work, the first thing to do is to come up with a file called recovery.conf:

$ cat recovery.conf 
restore_command = 'cp /archive/%f %p'
recovery_target_time = '2025-04-05 14:32'

Basically, just one command is needed here. The restore_command function will copy the desired files from the archive to the location needed by PostgreSQL (%p). PostgreSQL will call restore_command for one file after the other until it runs out of xlog or until the desired recovery_target_time is reached. Actually, the recovery_target_time command is redundant. If it is not there, PostgreSQL will recover as far as possible, and not stop at the desired point in time.

Once the recovery.conf file is in place, administrators have to ensure that the directory containing the base backup is set to 700:

chmod 700 /base

Then the server can be fired up (in this example, the log is configured to show up on the screen):

$ pg_ctl -D /tmp/base/ start
server starting
$ LOG:  database system was interrupted; last known up
   at 2014-12-08 12:33:50 CET
LOG:  starting point-in-time recovery to 
  2025-04-05 14:32:00+02

The server mentions that it was interrupted and will tell us the point in time the system is intended to roll forward to.

Then one file after the other is copied from the archive and replayed:

LOG:  restored log file "000000010000000000000007" from archive
LOG:  redo starts at 0/7000090
LOG:  consistent recovery state reached at 0/70000B8
LOG:  restored log file "000000010000000000000008" from archive
LOG:  restored log file "000000010000000000000009" from archive
LOG:  restored log file "00000001000000000000000A" from archive
LOG:  restored log file "00000001000000000000000B" from archive

At some point, PostgreSQL will throw an error because it cannot copy the next file. This makes sense because this chapter has been written in late 2014, and therefore, 2025 cannot be reached easily with the existing xlog. But don't worry! PostgreSQL will figure out that there is no more xlog and just go live. If a date between the base backup and the last xlog file is selected, PostgreSQL will go live without issuing any error:

cp: /tmp/archive/00000001000000000000000C: No such file or directory
LOG:  redo done at 0/BD3C218
LOG:  last completed transaction was at log time 2014-12-08 12:46:11.755618+01
LOG:  restored log file "00000001000000000000000B" from archive
cp: /tmp/archive/00000002.history: No such file or directory
LOG:  selected new timeline ID: 2
cp: /tmp/archive/00000001.history: No such file or directory
LOG:  archive recovery complete
LOG:  autovacuum launcher started
LOG:  database system is ready to accept connections

The recovery process has finished nicely, and the database instance is ready for action. Note that at the end of the process, recovery.conf is renamed to recovery.done.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset