Inspecting locks

In many cases, locks are a major issue for most system administrators. If a transaction blocks some other operations, it is usually a problem. End users will complain that "something hangs." However, you can rely on the fact that PostgreSQL never hangs without a reason. Usually, it is simple and all about locking.

In this section you will learn how to detect locks and how to figure out who is locking which transaction.

Let's do a simple example considering the following data:

test=# CREATE TABLE t_test AS SELECT * 
   FROM generate_series(1, 10) AS id;
SELECT 10

The goal is to have two concurrent SELECT FOR UPDATE operations reading slightly different data. Those two operations will lock each other, and then the goal will be to figure out which operation is blocking which transaction.

The first SELECT FOR UPDATE command selects everything larger than 9:

test=# BEGIN;
BEGIN
test=# SELECT * FROM t_test WHERE id > 9 FOR UPDATE;
 id 
----
 10
(1 rows)

The second transaction locks everything larger than 8:

test=# BEGIN;
BEGIN
test=# SELECT * FROM t_test WHERE id > 8 FOR UPDATE;

Logically, this transaction has to wait for the first transaction. In this case, it is obvious what is going on, but how can it be checked if you don't know?

The first thing to check is always pg_stat_activity. It will reveal what is going on and which operations are active:

test=# x
Expanded display is on.
test=# SELECT pid, query, waiting 
  FROM pg_stat_activity;[ RECORD1 ]--------------------------------------------
pid     | 4635
query   | SELECT * FROM t_test WHERE id > 9 FOR UPDATE;
waiting | f
[ RECORD2 ]--------------------------------------------
pid     | 4637
query   | SELECT * FROM t_test WHERE id > 8 FOR UPDATE;
waiting | t
[ RECORD3 ]--------------------------------------------
pid     | 4654
query   | SELECT pid, query, waiting 
    FROM pg_stat_activity;
waiting | f

In this example, pid 4637 indicates that the transaction is waiting on something. The waiting=true is a reliable indicator of potential trouble. It means that a transaction is stuck. The next thing to do is to check an additional system view called pg_locks view. This view tells us which locks have been granted and which locks haven't been.

The following query detects locks that have not been granted:

test=# SELECT locktype, granted, transactionid, 
    mode 
  FROM pg_locks WHERE granted = 'f';
   locktype    | granted | transactionid |   mode    
---------------+---------+---------------+-----------
 transactionid | f       |          1812 | ShareLock
(1 row)

It seems that somebody is waiting on transactionid 1812. This is an important observation. In the next step, information about transaction 1812 can be retrieved:

test=# SELECT granted, transactionid, mode, pid 
  FROM   pg_locks 
  WHERE   transactionid = 1812;
 granted | transactionid |     mode      | pid  
---------+---------------+---------------+------
 t       |          1812 | ExclusiveLock | 4635
 f       |          1812 | ShareLock     | 4637
(2 rows)

There are two processes that have something to do with transactionid 1812. One of them has the lock and the other does not. Obviously, pid 4635 is blocking pid 4637. To solve this problem, the database connection that is causing issues can be terminated:

SELECT pg_terminate_backend(4635);

But don't do that yet. There is more to be found out.

Whenever a lock is held on a conflicting row, there is an entry made in pg_locks:

test=# SELECT pid, relation, page, tuple 
    FROM   pg_locks 
    WHERE   page IS NOT NULL;
 pid  | relation | page | tuple 
------+----------+------+-------
 4637 |    16385 |    0 |    10
(1 row)

Obviously, there is a lock on the tenth tuple in the first (0) page in relation 16385. The pid indicates that the waiting transaction is waiting on this row.

In many cases, it is even possible to see the row that is causing issues. The so-called ctid is an identifier pointing to a row in the storage system. By querying the row, we have good chances (no guarantee) of seeing the row that is causing issues:

test=# SELECT ctid, * FROM t_test 
  WHERE   ctid = '(0, 10)';
  ctid  | id 
--------+----
 (0,10) | 10
(1 row)

The problem is caused by row number 10, which is the very row locked by our first SELECT FOR UPDATE command. Inspecting the row that is causing issues can give you vital insights into what is really going on in the system. Before killing a database connection, it is always wise to check out which kind of data is really causing trouble. Sometimes, this can reveal problems in the underlying application.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset