Living in a world without TCP

In some cases, we might not want to use a network. It often happens that a database will only talk to a local application anyway. Maybe our PostgreSQL database has been shipped along with our application, or maybe we just don't want the risk of using a network; in this case, Unix sockets are what you need. Unix sockets are a network-free means of communication. Your application can connect through a Unix socket locally without exposing anything to the outside world.

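What we need, however, is a directory in which the socket file can be created. By default, PostgreSQL will use the /tmp directory. However, if more than one database server is running per machine, each one will need a separate data directory to live in, as well as its own port, because the name of the socket file is derived from the port number.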
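To make the role of the directory more tangible, here is a minimal sketch that builds the path of the socket file and checks whether it exists. The helper name pg_socket_path is made up for illustration, and /tmp and 5432 are simply the defaults assumed here; your installation might use a different directory or port.

```shell
# Hypothetical helper: build the path of the socket file that a server
# listening on a given port creates inside its socket directory.
# The file is named .s.PGSQL.<port>.
pg_socket_path() {
    dir=${1:-/tmp}
    port=${2:-5432}
    printf '%s/.s.PGSQL.%s\n' "$dir" "$port"
}

# Assumed defaults: socket directory /tmp, port 5432.
sock=$(pg_socket_path /tmp 5432)
if [ -S "$sock" ]; then
    echo "socket found: $sock"
else
    echo "no socket at $sock (server down or different directory)"
fi

# With psql, a host value that starts with a slash is treated as a
# socket directory rather than a hostname:
#   psql -h /tmp -p 5432 -U postgres postgres
```

If the check fails on a running server, the socket most likely lives in a different directory, which is common with packaged installations.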

Apart from security, there are various reasons why not using a network might be a good idea. One of these reasons is performance. Using Unix sockets is a lot faster than going through the loopback device (127.0.0.1). If that sounds surprising, don't worry; it is for many people. The overhead of a real network connection should not be underestimated, especially if you are only running very small queries.

To show what this really means, I have included a simple benchmark.

We will create a script.sql file. This is a simple script that just selects a constant number. It is the most simplistic statement possible. There is nothing simpler than fetching a number.

So, let's run this simple benchmark on a normal laptop. The script.sql file used by the following benchmark looks like this:

[hs@linuxpc ~]$ cat /tmp/script.sql
SELECT 1;

Then, we can simply run pgbench to execute the SQL over and over again. The -f option allows us to pass the name of the SQL script to pgbench. -c 10 means that we want 10 concurrent connections to be active, and -T 5 means that the test runs for 5 seconds. The benchmark is running as the postgres user and is supposed to use the postgres database, which should be there by default. Note that the following examples will work on Red Hat Enterprise Linux (RHEL) derivatives. Debian-based systems will use different paths:

[hs@linuxpc ~]$ pgbench -f /tmp/script.sql \
    -c 10 -T 5 \
    -U postgres postgres 2> /dev/null
transaction type: /tmp/script.sql
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
duration: 5 s
number of transactions actually processed: 871407
latency average = 0.057 ms
tps = 174278.158426 (including connections establishing)
tps = 174377.935625 (excluding connections establishing)

As we can see, no hostname is passed to pgbench, so the tool connects locally to the Unix socket and runs the script as fast as possible. On this four-core Intel box, the system was able to achieve around 174,000 transactions per second.
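If only the throughput figure is of interest, the pgbench output can be filtered. The following is a minimal sketch; extract_tps is a hypothetical helper name, and the sample input is simply the two tps lines from the run shown here. It assumes that the tps lines start with "tps = ", as they do in this output.

```shell
# Hypothetical helper: print the first tps figure found in pgbench
# output that is read from standard input.
extract_tps() {
    awk '/^tps = / { print $3; exit }'
}

# Sample input: the tps lines from the Unix socket run.
extract_tps <<'EOF'
tps = 174278.158426 (including connections establishing)
tps = 174377.935625 (excluding connections establishing)
EOF

# With a live server, pgbench could be piped into the helper directly:
#   pgbench -f /tmp/script.sql -c 10 -T 5 -U postgres postgres 2> /dev/null | extract_tps
```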

What happens if -h localhost is added, so that pgbench connects over TCP through the loopback device? The performance will change, as you can see in the next code snippet:

[hs@linuxpc ~]$ pgbench -f /tmp/script.sql \
    -h localhost -c 10 -T 5 \
    -U postgres postgres 2> /dev/null
transaction type: /tmp/script.sql
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
duration: 5 s
number of transactions actually processed: 535251
latency average = 0.093 ms
tps = 107000.872598 (including connections establishing)
tps = 107046.943632 (excluding connections establishing)

The throughput drops like a stone to around 107,000 transactions per second. The difference is clearly related to the networking overhead.
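To put the two results side by side, the relative difference can be worked out with a few lines of shell. This is only a sketch: tps_compare is a made-up helper name, and the two values are the "including connections establishing" figures copied from the runs above.

```shell
# tps values copied from the two pgbench runs above
socket_tps=174278.158426
tcp_tps=107000.872598

# Hypothetical helper: report how much faster the first value is and
# how large the drop to the second value is, in percent.
tps_compare() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        printf "speedup: %.2fx, drop: %.1f%%\n", a / b, (a - b) / a * 100
    }'
}

tps_compare "$socket_tps" "$tcp_tps"
```

On these numbers, the Unix socket run comes out roughly 1.6 times faster than the run over localhost.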

By using the -j option (the number of threads assigned to pgbench), we can squeeze some more transactions out of our systems. However, it does not change the overall picture of the benchmark in our situation. In other tests, it does, because pgbench can be a real bottleneck if you don't provide enough CPU power.
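As a sketch of how -j fits into the command line used in this section, the following snippet only prints the command instead of running it. bench_cmd is a hypothetical helper, and the thread count of 2 is an arbitrary choice for illustration, not a recommendation.

```shell
# Hypothetical helper: print the pgbench command line from this section
# with a given number of pgbench worker threads (-j).
bench_cmd() {
    threads=${1:-1}
    echo "pgbench -f /tmp/script.sql -c 10 -j $threads -T 5 -U postgres postgres"
}

cmd=$(bench_cmd 2)
echo "$cmd"

# Run it only where pgbench is installed and a server is reachable:
#   $cmd 2> /dev/null
```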

As we can see, networking can be not only a security issue but also a performance issue.
