Handling various formats

So far, we have seen that pg_dump can be used to create text files. The problem here is that a text file can only be replayed completely. If we have saved an entire database, we can only replay the entire thing. In most cases, this is not what we want. Therefore, PostgreSQL has additional formats that offer more functionality.

At this point, four formats are supported:

-F, --format=c|d|t|p  output file  format (custom, directory, tar, plain  text  (default))  

We have already seen plaintext, which is just normal text. On top of that, we can use a custom format. The idea behind a custom format is to have a compressed dump, including a table of contents. Here are two ways to create a custom format dump:

[hs@linuxpc ~]$ pg_dump -Fc test > /tmp/dump.fc
[hs@linuxpc ~]$ pg_dump -Fc test -f /tmp/dump.fc 

In addition to the table of contents, the compressed dump has one more advantage: it is a lot smaller. The rule of thumb is that a custom format dump is around 90% smaller than the database instance you are about to back up. Of course, this is highly dependent on the number of indexes, but for many database applications, this rough estimation will hold true.

Once the backup is created, we can inspect the backup file:

[hs@linuxpc ~]$ pg_restore --list /tmp/dump.fc
;
; Archive created at 2018-11-04 15:44:56 CET
;   dbname: test
;   TOC Entries: 18
;   Compression: -1
;   Dump Version: 1.12-0
;   Format: CUSTOM
;   Integer: 4 bytes
;   Offset: 8 bytes
;   Dumped from database version: 12.0
;   Dumped by pg_dump version: 12.0
;
; Selected TOC Entries:
;
3103;  1262  16384  DATABASE - test  hs
3; 2615  2200  SCHEMA - public hs
3104;  0 0 COMMENT - SCHEMA public hs
1; 3079  13350  EXTENSION - plpgsql
3105;  0 0 COMMENT - EXTENSION plpgsql
187;  1259  16391  TABLE  public t_test hs
...

Note that pg_restore --list will return the table of contents of the backup.

Using a custom format is a good idea as the backup will shrink in size. However, there's more: the -Fd command will create a backup in the directory format. Instead of a single file, you will now get a directory containing a couple of files:

[hs@linuxpc ~]$ mkdir /tmp/backup
[hs@linuxpc ~]$ pg_dump -Fd test -f /tmp/backup/
[hs@linuxpc ~]$ cd /tmp/backup/
[hs@linuxpc backup]$ ls -lh total  86M
-rw-rw-r--. 1 hs hs   85M Jan   4 15:54  3095.dat.gz
-rw-rw-r--. 1 hs hs   107 Jan   4 15:54  3096.dat.gz
-rw-rw-r--. 1 hs hs 740K  Jan   4 15:54  3097.dat.gz
-rw-rw-r--. 1 hs hs   39 Jan   4 15:54  3098.dat.gz
-rw-rw-r--. 1 hs hs 4.3K  Jan   4 15:54  toc.dat

One advantage of the directory format is that we can use more than one core to perform the backup. In the case of a plain or custom format, only one process will be used by pg_dump. The directory format changes that rule. The following example shows how we can tell pg_dump to use four cores (jobs):

[hs@linuxpc backup]$ rm -rf *
[hs@linuxpc backup]$ pg_dump -Fd test -f /tmp/backup/ -j 4  
The more objects in our database, the more of a chance there is for a potential speedup.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset