Partitioning

Data is constantly or periodically loaded into a data warehouse. The database can grow very big. The bigger it gets, the slower it works. The size of the database is limited by the capacity of the disk storage, so the needs to be deleted from time to time. Deletion from a very big table can also be quite slow.

The data that is newer is usually queried more often. Business users could check the reports of the last day every morning, of the last week every Monday, and of the month at the beginning of the next month. It is common to compare results of a time period with a corresponding previous period. For example, the current month as compared to the previous month, or to the respective month one year ago. It is unlikely that somebody would query data that is ten years old.

It would be nice to keep the newer, more queried data in one table that is relatively small and the old data in a different table or tables, and query only the table that has the data for a report that is required. On the other hand, it would be quite complex to implement a reporting tool that would query different tables depending on the parameters of the query.

PostgreSQL provides a way of keeping data in different tables and using a common name when querying them. This is called partitioning. Partitioning is implemented through the mechanism of table inheritance. This was mentioned in Chapter 03, PostgreSQL Basic Building Blocks. When a table inherits another table or tables, it is called a child table. The table that is inherited is a parent table. When a parent table is queried, the data from all the child tables is returned. In the context of partitioning, child tables are called partitions.

To define partitions for a table, it is necessary to choose a partition key. The partition key is a field or expression (or list of them) whose value will define to which partition a record belongs. In PostgreSQL, it is possible to create two different partitioning schemes: range and list. Range means that all the values within a certain range will belong to a partition. List means that only a specific value or values will belong to a partition.

Let's now define a partitioned data structure to store the data for the HTTP access log we imported in the previous section. We will only use a subset of fields for simplicity.

Firstly, create the parent table, as follows:

CREATE TABLE dwh.access_log_partitioned (ts timestamptz, url text, status_code int) 
PARTITION BY RANGE (ts);

This creates a table and defines the field ts as the partition key for a range partitioning scheme.

Now create partitions for ranges of values of the ts field for July, August, and September 2017, as follows:

CREATE TABLE dwh.access_log_2017_07 PARTITION OF dwh.access_log_partitioned 
FOR VALUES FROM ('2017-07-01') TO ('2017-08-01');

CREATE TABLE dwh.access_log_2017_08 PARTITION OF dwh.access_log_partitioned 
FOR VALUES FROM ('2017-08-01') TO ('2017-09-01');

CREATE TABLE dwh.access_log_2017_09 PARTITION OF dwh.access_log_partitioned 
FOR VALUES FROM ('2017-09-01') TO ('2017-10-01');

The lower bound for a range partition is inclusive and the upper bound is exclusive. In the preceding example, the partition access_log_2017_07 will contain the records that satisfy the predicate ts >= '2017-07-01' AND ts < '2017-08-01'.

When such a structure is defined, it is possible to insert data into the parent table access_log_partitioned. Records will be automatically distributed into the correct partitions depending on the values of the partition key.

What if somebody tries to insert a record where the value of the partition key is outside any of the existing partition ranges? PostgreSQL will return an error, as follows:

car_portal=> INSERT INTO dwh.access_log_partitioned values ('2017-02-01', '/test', 404);
 ERROR: no partition of relation "access_log_partitioned" found for row
 DETAIL: Partition key of the failing row contains (ts) = (2017-02-01 00:00:00+00).

To make it possible to insert such rows, one would need to create another partition for February. Alternatively, it is possible to create a partition for all rows over some value or below some value. To do this, the keywords MAXVALUE and MINVALUE are used. This command will create a partition for all the records before July 1, 2017:

CREATE TABLE dwh.access_log_min PARTITION OF dwh.access_log_partitioned 
FOR VALUES FROM (MINVALUE) TO ('2017-07-01');

The partition scheme by list can be used with fields that do not have too many distinct values that are known in advance. For example, for the HTTP status code, it is possible to combine those partitioning schemes by creating sub-partitions for existing partitions, as follows:

CREATE TABLE dwh.access_log_2017_10 PARTITION OF dwh.access_log_partitioned 
FOR VALUES FROM ('2017-10-01') TO ('2017-11-01')
PARTITION BY LIST (status_code);

CREATE TABLE dwh.access_log_2017_10_200 PARTITION OF dwh.access_log_2017_10 FOR VALUES IN (200);

CREATE TABLE dwh.access_log_2017_10_400 PARTITION OF dwh.access_log_2017_10 FOR VALUES IN (400);

We have now a partition for October that has two sub-partitions for the HTTP status codes 200 and 400. Note that it will not be possible to insert other status codes into the table with such a configuration. It is however possible to give a list of values of the partition key when creating a partition.

Alternatively, it is possible to define the partitioning scheme not just on a field, but on an expression. Here, partitions are created based on the first digit of the status code:

CREATE TABLE dwh.access_log_2017_11 PARTITION OF dwh.access_log_partitioned 
FOR VALUES FROM ('2017-11-01') TO ('2017-12-01')
PARTITION BY LIST (left(status_code::text, 1));

CREATE TABLE dwh.access_log_2017_11_2XX PARTITION OF dwh.access_log_2017_11 FOR VALUES IN ('2');

CREATE TABLE dwh.access_log_2017_11_4XX PARTITION OF dwh.access_log_2017_11 FOR VALUES IN ('4');

More complicated partitioning strategies are still possible using the traditional table inheritance feature. This will require some manual effort to create tables, define an inheritance relationships, create check constraints to make PostgreSQL aware of which data is stored in which partition when necessary, and set up triggers or rules to distribute the records into proper partitions when inserted into the parent table.

It's possible to delete a partition by simply deleting the table. It is also possible to detach a partition from the parent table so that it is not a partition anymore but just a standalone table:

car_portal=> ALTER TABLE dwh.access_log_partitioned DETACH PARTITION dwh.access_log_2017_11;
ALTER TABLE

It is also possible to make an existing table a partition of some other table:

car_portal=> ALTER TABLE dwh.access_log_partitioned ATTACH PARTITION dwh.access_log_2017_11 
FOR VALUES FROM ('2017-11-01') TO ('2017-12-01');
ALTER TABLE

To illustrate the benefits of partitioning, let's see an example. First, create a non-partitioned table to compare the results, with the same structure as the partitioned one:

car_portal=> CREATE TABLE dwh.access_log_not_partitioned (LIKE dwh.access_log_partitioned);
CREATE TABLE

Insert the data into both the partitioned and non-partitioned tables, duplicating the contents of the table dwh.access_log 1,000 times, as follows:

car_portal=> INSERT INTO dwh.access_log_not_partitioned SELECT ts, url, status_code FROM dwh.access_log, generate_series(1, 1000);
INSERT 0 15456000
car_portal=> INSERT INTO dwh.access_log_partitioned SELECT ts, url, status_code FROM dwh.access_log, generate_series(1, 1000);
INSERT 0 15456000

Now let's count the number of records for the last 10 days of August, measuring the time the query takes, as follows:

car_portal=> 	iming
Timing is on.

car_portal=> SELECT count(*) FROM dwh.access_log_not_partitioned 
  WHERE ts >= '2017-08-22' AND ts < '2017-09-01';
 count 
---------
 1712000
(1 row)
Time: 921.122 ms

car_portal=> SELECT count(*) FROM dwh.access_log_partitioned 
  WHERE ts >= '2017-08-22' AND ts < '2017-09-01';
 count 
---------
 1712000
(1 row)
Time: 336.221 ms

Querying the partitioned table was about three times quicker. That is because PostgreSQL knows which partition has the data for August and only scans that one. This is also visible in the execution plan:

car_portal=> EXPLAIN SELECT count(*) FROM dwh.access_log_partitioned 
  WHERE ts >= '2017-08-22' AND ts < '2017-09-01';
 QUERY PLAN 
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate (cost=82867.96..82867.97 rows=1 width=8)
  -> Gather (cost=82867.75..82867.96 rows=2 width=8)
   Workers Planned: 2
    -> Partial Aggregate (cost=81867.75..81867.76 rows=1 width=8)
      -> Append (cost=0.00..80149.89 rows=687141 width=0)
        -> Parallel Seq Scan on access_log_2017_08 (cost=0.00..80149.89 rows=687141 width=0)
           Filter: ((ts >= '2017-08-22 00:00:00+00'::timestamp with time zone) AND (ts < '2017-09-01     
           00:00:00+00'::timestamp with time zone))
(7 rows)

The sequential scan is performed on the partition table access_log_2017_08.

Partitioning is described in detail in the PostgreSQL documentation at https://www.postgresql.org/docs/current/static/ddl-partitioning.html.

Table of Contents for Partitioning

Create new playlist

Sign In

Sign Up

Table of Contents for
Partitioning