Finding the right timestamp

So far, we have assumed that we know the timestamp we want to recover to, or that we simply want to replay the whole transaction log to minimize data loss. However, what if we don't want to replay everything? What if we don't know which point in time to recover to? In everyday life, this is actually a very common scenario. One of our developers loses some data in the morning and we are supposed to put things right again. The trouble is this: at what time in the morning? Recovery cannot easily be restarted once it has ended: when recovery completes, the system is promoted, and once it has been promoted, we cannot continue to replay WAL.

However, what we can do is pause recovery without promotion, check what is inside the database, and continue.

Doing that is easy. The first thing we have to make sure of is that the hot_standby variable is set to on in the postgresql.conf file. This will make sure that the database is readable while it is still in recovery mode. Then, we have to adapt the recovery.conf file before starting the replay process:

recovery_target_action = 'pause'

There are various recovery_target_action settings. If we use pause, PostgreSQL will pause at the desired time and let us check what has already been replayed. We can adjust the time we want, restart, and try again. Alternatively, we can set the value to promote or shutdown.
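
Once replay has paused at the target, a quick look from psql shows whether we stopped in the right place. The following is a minimal sketch, assuming hot_standby is on; the table t_orders and its created_at column are hypothetical stand-ins for whatever data went missing.

```sql
-- Confirm the server is still in recovery and that replay is paused.
SELECT pg_is_in_recovery()       AS in_recovery,
       pg_is_wal_replay_paused() AS replay_paused;

-- Commit time of the last transaction that has been replayed so far.
SELECT pg_last_xact_replay_timestamp();

-- Hypothetical check: is the lost data present at this point in time?
SELECT count(*), max(created_at) FROM t_orders;
```

If the target turns out to be too early, a later one can be set and the server restarted, as described above.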

There is a second way to pause transaction log replay. It can be used during PITR as well, but in most cases it is used with streaming replication. The following functions are available during WAL replay:

postgres=# \x
Expanded display is on.
postgres=# \df *pause*
List of functions
-[ RECORD 1 ]-------+-------------------------
Schema              | pg_catalog
Name                | pg_is_wal_replay_paused
Result data type    | boolean
Argument data types |
Type                | normal
-[ RECORD 2 ]-------+-------------------------
Schema              | pg_catalog
Name                | pg_wal_replay_pause
Result data type    | void
Argument data types |
Type                | normal

postgres=# \df *resume*
List of functions
-[ RECORD 1 ]-------+----------------------
Schema              | pg_catalog
Name                | pg_wal_replay_resume
Result data type    | void
Argument data types |
Type                | normal

We can call the SELECT pg_wal_replay_pause(); command to halt WAL replay until we call the SELECT pg_wal_replay_resume(); command.

The idea is to figure out how much WAL has already been replayed and to continue as necessary. However, keep this in mind: once a server has been promoted, we cannot just continue to replay WAL without further precautions.
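
The pause and resume cycle can be sketched as follows. This is only an illustration, assuming a server that is in recovery with hot_standby enabled and a superuser connection; besides the functions listed above, it uses pg_last_wal_replay_lsn(), which reports the current replay position.

```sql
-- Stop applying WAL; read-only queries keep working on the frozen state.
SELECT pg_wal_replay_pause();

-- How far has replay progressed?
SELECT pg_is_wal_replay_paused() AS replay_paused,
       pg_last_wal_replay_lsn()  AS replayed_up_to;

-- ... inspect the data here ...

-- Continue applying WAL from where replay stopped.
SELECT pg_wal_replay_resume();
```

One caveat, as far as the PostgreSQL documentation describes it: if replay paused because a recovery target was reached with recovery_target_action = 'pause', calling pg_wal_replay_resume() ends recovery rather than replaying further WAL.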

As we have already seen, it can be pretty tricky to figure out how far we need to recover. Therefore, PostgreSQL provides us with some help. Consider the following real-world example: at midnight, we are running a nightly process that ends at some point that is usually not known. The goal is to recover exactly to the end of the nightly process. The trouble is this: how do we know when the process has ended? In most cases, this is hard to figure out. So, why not add a marker to the transaction log, as follows:

postgres=# SELECT pg_create_restore_point('my_daily_process_ended');
pg_create_restore_point
-------------------------
1F/E574A7B8
(1 row)

If our process calls this SQL statement as soon as it ends, it will be possible to use this label in the transaction log to recover exactly to this point in time by adding the following directive to the recovery.conf file:

recovery_target_name = 'my_daily_process_ended'

Using this setting instead of recovery_target_time, the replay process will beam us exactly to the end of the nightly process.
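
A nightly job might set its marker as in the following sketch. The tables t_sales and t_sales_summary, their columns, and the label format are hypothetical. The point being illustrated is that pg_create_restore_point() is called after the final COMMIT, so that the marker lands in the WAL behind the job's last change.

```sql
-- Hypothetical nightly job: rebuild a summary table.
BEGIN;
TRUNCATE t_sales_summary;
INSERT INTO t_sales_summary (sales_day, total)
    SELECT sold_at::date, sum(amount)
    FROM t_sales
    GROUP BY sold_at::date;
COMMIT;

-- Mark the end of the job in the WAL. A dated label keeps the
-- markers of different nights apart.
SELECT pg_create_restore_point(
    'nightly_done_' || to_char(now(), 'YYYYMMDD'));
```

The label produced here is what would go into recovery_target_name. Note that pg_create_restore_point() is restricted to superusers by default, so the job needs a suitably privileged role.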

Of course, we can also replay up to a certain transaction ID. However, in real life, this has proven to be difficult as the exact transaction ID is rarely ever known to the administrator, and therefore, there is not much practical value in this.
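
For completeness, the following sketch shows how a risky change could record its own transaction ID so that it is known afterwards. The table t_customer and its last_login column are hypothetical. Keep in mind that txid_current() returns a 64-bit value that includes an epoch counter, so on a long-running cluster it may differ from the 32-bit ID that recovery_target_xid expects; treat the mapping between the two as an assumption to verify.

```sql
BEGIN;
-- Note the ID of this transaction before making the change.
SELECT txid_current();

-- Hypothetical risky change.
DELETE FROM t_customer
WHERE last_login < now() - interval '5 years';
COMMIT;
```

If the change later turns out to be a mistake, the recorded value could be used as recovery_target_xid, together with recovery_target_inclusive = false, to stop replay just before that transaction.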
