Performance testing

An important question about any database system is: how fast is it? How many transactions can it handle per second, and how long does a particular query take to execute? The topic of database performance was covered in Chapter 13, Optimizing Database Performance. Here, we will only discuss the task of measuring it.

The psql meta-command \timing is used to measure the execution time of a particular SQL command. Once timing is enabled, psql shows the execution time for each command:

car_portal=> \timing
Timing is on.
car_portal=> SELECT count(*) FROM car_portal_app.car;
 count
-------
   229
(1 row)
Time: 0.643 ms
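
Note that \timing works as a toggle: issuing the meta-command again turns timing off. It also accepts an explicit on or off argument, which sets the state regardless of the current one:

car_portal=> \timing off
Timing is off.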

Usually, that is enough to understand which of two queries is faster and whether an optimization attempt has helped. However, one cannot rely on this timing to estimate the number of requests the server can handle per second. This is because the time for a single query depends on many random factors: the current load of the server, the state of the cache, and so on.

PostgreSQL provides a special utility that connects to the server and runs a test script many times. It is called pgbench. By default, pgbench creates its own small database and executes a sample SQL script on it, which is quite similar to what a typical OLTP (Online Transaction Processing) application usually does. This is already enough to understand how powerful the database server is, and how changes to the configuration parameters affect performance.
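
As a quick illustration, such a default test could be started as follows, once the sample tables have been initialized (which is shown later in this section). Here, the -c option sets the number of concurrent client connections and -T sets the test duration in seconds; the host and database names are simply the ones used throughout this chapter:

user@host:~$ pgbench -h localhost -c 10 -T 30 car_portal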

To get more specific results, one should prepare a test database of a size comparable to the production database. A test script that contains the same or similar queries to those the production system performs should also be prepared.

For example, assume that the car portal database is used by a web application. The typical usage scenario is to query the car table for the number of records, and then query it again to retrieve the first 20 records, which fit on one page of the screen. The following is the test script:

SELECT count(*) FROM car_portal_app.car;
SELECT * FROM car_portal_app.car INNER JOIN car_portal_app.car_model
USING (car_model_id) ORDER BY car_id LIMIT 20;

The script is saved in a file called test.sql (available in the attached media), which will be used by pgbench.
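
Before using the script for benchmarking, it can be sanity-checked by running it once through psql, for example:

user@host:~$ psql -h localhost -d car_portal -f test.sql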

It is necessary to initialize the sample data for pgbench (assuming that the database is running on the same machine and the current user can access the database). This is required even when a custom script is used, because pgbench vacuums its own tables before each run unless the -n option is given:

user@host:~$ pgbench -h localhost -i car_portal
NOTICE: table "pgbench_history" does not exist, skipping
NOTICE: table "pgbench_tellers" does not exist, skipping
NOTICE: table "pgbench_accounts" does not exist, skipping
NOTICE: table "pgbench_branches" does not exist, skipping
creating tables...
100000 of 100000 tuples (100%) done (elapsed 0.17 s, remaining 0.00 s).
vacuum...
set primary keys...
done.

The test can now be started, as follows:

user@host:~$ pgbench -h localhost -f test.sql -T 60 car_portal
starting vacuum...end.
transaction type: test.sql
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 98314
latency average = 0.610 ms
tps = 1638.564670 (including connections establishing)
tps = 1638.887817 (excluding connections establishing)

As you can see, the performance is about 1,600 transactions per second, which is more than enough for a small car portal application. However, pgbench was running on the same machine as the database server, so they competed for the same CPU; on the other hand, the network latency was minimal.

To get more realistic statistics, pgbench should be started on the machine where the application server is installed, and the number of connections in use should match the configuration of the application server. Usually, for such simple queries, the network connection and data transmission times play a bigger role than the database processing itself.
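
For example, assuming the application server keeps a pool of 20 database connections, a run from the application host could look as follows (app-host and db-host are placeholder names here; -c sets the number of client connections, and -j sets the number of pgbench worker threads):

user@app-host:~$ pgbench -h db-host -c 20 -j 4 -f test.sql -T 60 car_portal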

pgbench allows a user to specify the number of connections it establishes to the database server. Moreover, it provides functionality to randomize the queries it executes against the database. This makes it quite useful for assessing database performance and tuning the database server configuration.
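
As a sketch of such randomization, a pgbench script can set a variable to a random value and refer to it in a query, so that each transaction touches different rows. Assuming PostgreSQL 9.6 or later (older versions used the \setrandom meta-command instead), and taking the row count of 229 seen earlier as the upper bound, the test script could be parameterized like this:

\set id random(1, 229)
SELECT * FROM car_portal_app.car WHERE car_id = :id;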
