Defining what and how to log

After taking a look at some basic settings, it's time to decide what to log. By default, only errors will be logged. However, this may not be enough. In this section, you will learn about what can be logged and what a logline will look like.

By default, PostgreSQL doesn't log information about checkpoints. The following setting is here to change exactly that:

#log_checkpoints = off 

The same applies to connections; whenever a connection is established or properly destroyed, PostgreSQL can create log entries:

#log_connections = off 
#log_disconnections = off 

In most cases, it doesn't make sense to log connections since extensive logging significantly slows down the system. Analytical systems won't suffer much. However, OLTP may be seriously impacted.

If you want to see how long statements take, consider switching the following setting to on:

#log_duration = off 

Let's move on to one of the most important settings. We haven't defined the layout of the messages yet, and so far, the log files contain errors in the following form:

test=# SELECT 1/0;
psql: ERROR: division by zero

The log will state ERROR, along with the error message. Before PostgreSQL 10.0, there wasn't a timestamp, username, and so on. You had to change the value immediately to make any sense of the logs. In PostgreSQL 10.0, the default value has changed to something much more reasonable. To change that, take a look at log_line_prefix:

#log_line_prefix = '%m [%p] '  
# special values: # %a = application name # %u = user name # %d = database name # %r = remote host and port # %h = remote host # %p = process ID # %t = timestamp without milliseconds # %m = timestamp with milliseconds # %n = timestamp with milliseconds (as a Unix epoch) # %i = command tag # %e = SQL state # %c = session ID # %l = session line number # %s = session start timestamp # %v = virtual transaction ID # %x = transaction ID (0 if none) # %q = stop here in non-session processes # %% = '%'

Furthermore, log_line_prefix is pretty flexible and allows you to configure the logline to exactly match your needs. In general, it is a good idea to log a timestamp. Otherwise, it is close to impossible to see when something bad has happened. Personally, I also like to know the username, the transaction ID, and the database. However, it is up to you to decide what you really need.

Sometimes, slowness is caused by bad locking behavior. Users blocking each other can cause bad performance, and it is important to sort out those issues to ensure high throughput. In general, locking-related issues can be hard to track down.

Basically, log_lock_waits can help detect such issues. If a lock is held longer than deadlock_timeout, then a line will be sent to the log, provided that the following configuration variable is turned on:

#log_lock_waits = off

Finally, it's time to tell PostgreSQL what to actually log. So far, only errors, slow queries, and the like have been sent to the log. However, log_statement has four possible settings, as shown in the following block:

#log_statement = 'none' 
      # none, ddl, mod, all 

Note that none means that only errors will be logged. ddl means that errors, as well as DDLs (CREATE TABLE, ALTER TABLE, and so on), will be logged. mod will already include data changes, and all will send every statement to the log.

Be aware that all can lead to a lot of logging information, which can slow down your system. To show you how much of an impact there can be, I have compiled a blog post. It can be found at https://www.cybertec-postgresql.com/en/logging-the-hidden-speedbrakes/.

If you want to inspect replication in more detail, consider turning the following setting on:

#log_replication_commands = off 

This will send replication-related commands to the log.

For more information on replication, visit the following website: https://www.postgresql.org/docs/current/static/protocol-replication.html.

It can frequently be the case that performance problems are caused by temporary file I/O. To see which queries cause problems, the following setting can be used:

#log_temp_files = -1    
# log temporary files  equal  or larger 
# than  the specified size  in kilobytes; 
# -1 disables, 0 logs  all temp  files 

While pg_stat_statements contains aggregated information, log_temp_files will point to specific queries causing issues. It usually makes sense to set this one to a reasonably low value. The correct value depends on your workload, but maybe 4 MB is a good start.

By default, PostgreSQL will write log files in the time zone where the server is located. However, if you are running a system that is spread all over the world, it can make sense to adjust the time zone in a way so that you can go and compare log entries, as shown in the following code:

log_timezone = 'Europe/Vienna'

Keep in mind that, on the SQL side, you will still see the time in your local time zone. However, if this variable is set, log entries will be in a different time zone.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset