Configuring for archiving

The first thing we want to do in this chapter is create a configuration to perform standard point-in-time recovery (PITR). There are a couple of advantages of using PITR over ordinary dumps:

We will lose less data because we can restore the data to a certain point in time and not just to the fixed backup point.
Restoring will be faster because indexes don't have to be created from scratch. They are just copied over and are ready to use.

Configuring for PITR is easy. Just a handful of changes have to be made in the postgresql.conf file:

wal_level = replica    # used to be "hot_standby" in older versions 
max_wal_senders = 10   # at least 2, better at least 2

The wal_level variable says that the server is supposed to produce enough transaction logs to allow for PITR. If the wal_level variable is set to minimal (which is the default value up to PostgreSQL 9.6), the transaction log will only contain enough information to recover a single node setup—it is not rich enough to handle replication. In PostgreSQL 10.0, the default value is already correct and there is no longer a need to change most settings.

The max_wal_senders variable will allow us to stream WAL from the server. It will allow us to use pg_basebackup to create an initial backup instead of traditional file-based copying. The advantage here is that pg_basebackup is a lot easier to use. Again, the default value in 10.0 has been changed in a way that, for 90% of all setups, no changes are needed.

The idea behind WAL streaming is that the transaction log that's created is copied to a safe place for storage. Basically, there are two means of transporting WAL:

Using pg_receivewal (up to 9.6, this is known as pg_receivexlog)
Using the filesystem as a means to archive

In this section, we will look at how to set up the second option. During normal operations, PostgreSQL keeps writing to those WAL files. When archive_mode = on in the postgresql.conf file, PostgreSQL will call the archive_command variable for every single file.

A configuration might look as follows. First, a directory storing those transaction log files can be created:

mkdir /archive
chown postgres.postgres archive

The following entries can be changed in the postgresql.conf file:

archive_mode = on
archive_command = 'cp %p /archive/%f'

A restart will enable archiving, but let's configure the pg_hba.conf file first to reduce downtime to an absolute minimum.

Note that we can put any command into the archive_command variable.

Many people use rsync, scp, and others to transport their WAL files to a safe location. If our script returns 0, PostgreSQL will assume that the file has been archived. If anything else is returned, PostgreSQL will try to archive the file again. This is necessary because the database engine has to ensure that no files are lost. To perform the recovery process, we have to have every file available; not a single file is allowed to go missing.

Table of Contents for Configuring for archiving

Create new playlist

Sign In

Sign Up

Table of Contents for
Configuring for archiving