Facing clog corruption

PostgreSQL has a thing called the commit log (now called pg_xact; it was formerly known as pg_clog). It tracks the state of every transaction on the system and helps PostgreSQL determine whether a row can be seen or not. In general, a transaction can be in one of four states:

#define TRANSACTION_STATUS_IN_PROGRESS      0x00
#define TRANSACTION_STATUS_COMMITTED        0x01
#define TRANSACTION_STATUS_ABORTED          0x02
#define TRANSACTION_STATUS_SUB_COMMITTED    0x03

The clog is stored in its own directory (pg_xact) inside the PostgreSQL data directory.
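
On disk, the commit log consists of fixed-size segment files with hexadecimal names. A quick way to look at them, assuming a PostgreSQL 10 or later instance with its data directory in $PGDATA:

# Each segment file covers a contiguous range of transaction IDs and
# grows to at most 256 KB before the next segment is started.
ls -l "$PGDATA/pg_xact"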

In the past, people have reported something called clog corruption, which can be caused by faulty disks or by bugs in PostgreSQL that have been fixed over the years. A corrupted commit log is a pretty nasty thing to have, because all of our data is still there, but PostgreSQL no longer knows whether it is valid. Corruption in this area is nothing short of a total disaster.

How does the administrator figure out that the commit log is broken? Here is what we normally see:

ERROR: could not access status of transaction 118831

If PostgreSQL cannot access the status of a transaction, it cannot decide whether the affected rows are visible, and queries touching them will fail with errors like the one above. The main question is: how can this be fixed? To put it straight, there is no way to really fix the problem; we can only try to rescue as much data as possible.

As we've stated already, the commit log keeps two bits per transaction. This means that we have four transactions per byte, leaving us with 32,768 transactions per 8 KB block and 1,048,576 transactions per 256 KB segment file. Once we have figured out which segment file it is (a sample calculation follows the command below), we can fake the commit log:

dd if=/dev/zero of=<data directory location>/pg_clog/0001 bs=256K count=1
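
Which segment file, and which page inside it, holds the status of a given transaction ID can be worked out from the constants above. The following shell sketch is not part of the original recipe; it assumes the default 8 KB block size and 32 pages per segment file, and uses the transaction from the error message as an example:

# Locate the status of transaction 118831 inside the commit log.
xid=118831
xacts_per_page=$((8192 * 4))                 # 4 transactions per byte, 8 KB pages
xacts_per_segment=$((xacts_per_page * 32))   # 32 pages per segment file

printf 'segment file : %04X\n' $((xid / xacts_per_segment))
echo "page in file : $((xid / xacts_per_page % 32))"
echo "byte in page : $((xid % xacts_per_page / 4))"
echo "slot in byte : $((xid % 4))"

For transaction 118831, this prints segment 0000, page 3, byte 5131, slot 3.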

We can use dd to fake the commit log and set the transaction status to the desired value. Note that the command above writes zeros, which marks every transaction in that segment as in progress, and that in PostgreSQL 10 and later the directory is called pg_xact rather than pg_clog. The core question is really: which transaction state should be used? The answer is that any state is actually wrong, because we really don't know how those transactions ended.

However, it is usually a good idea to just set them to committed in order to lose less data. Which choice is less disruptive really depends on our workload and our data.
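
One possible way to mark a whole segment as committed, shown here as a sketch rather than as part of the original recipe, is to fill it with the byte 0x55 (binary 01010101, octal 125), which sets all four two-bit status slots in every byte to TRANSACTION_STATUS_COMMITTED:

# Write a 256 KB segment consisting entirely of 0x55 bytes, marking every
# transaction covered by it as committed; adjust the path and file name.
dd if=/dev/zero bs=256K count=1 | tr '\0' '\125' > <data directory location>/pg_clog/0001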

When we have to use this technique, we should fake as little of the clog as necessary. Remember, we are essentially faking the commit status, which is not a nice thing to do to a database engine.
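
If only a single page of a segment is affected, a more surgical variant (again just a sketch, assuming the default 8 KB block size) is to overwrite that one page in place and leave the rest of the file alone; the page number 3 here matches the sample calculation above:

# Zero out only page 3 of the segment (8 KB starting at byte 3 * 8192),
# keeping the remainder of the file intact.
dd if=/dev/zero of=<data directory location>/pg_clog/0000 bs=8192 seek=3 count=1 conv=notrunc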

Once we are done faking the clog, we should create a backup as fast as we can and recreate the database instance from scratch. The system we are working with is not very trustworthy anymore, so we should extract the data as quickly as possible. Keep this in mind: the data we are about to extract could be contradictory and wrong, so we should impose some quality checks on whatever we are able to rescue from our database server.
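
A minimal sketch of that rescue step, assuming the instance still starts and accepts connections (the file name rescue.sql is just a placeholder):

# Dump whatever is still readable, then restore it into a freshly
# initialized instance and run sanity checks on the result there.
pg_dumpall > rescue.sql
psql -d postgres -f rescue.sql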
