Back in the early days of PostgreSQL projects, people had to run VACUUM manually. Fortunately, those days are long gone. Nowadays, administrators can rely on a tool called autovacuum, which is part of the PostgreSQL server infrastructure. It automatically takes care of cleanup and works in the background. It wakes up once per minute (see autovacuum_naptime = 1 in postgresql.conf) and checks whether there is work to do. If there is work, autovacuum will fork up to three worker processes (see autovacuum_max_workers in postgresql.conf).
The main question is, when does autovacuum trigger the creation of a worker process?
The answer to this question can, again, be found in postgresql.conf, as shown in the following code:
autovacuum_vacuum_threshold = 50 autovacuum_analyze_threshold = 50 autovacuum_vacuum_scale_factor = 0.2 autovacuum_analyze_scale_factor = 0.1
The autovacuum_vacuum_scale_factor command tells PostgreSQL that a table is worth vacuuming if 20% of its data has been changed. The trouble is that if a table consists of one row, one change is already 100%. It makes absolutely no sense to fork a complete process to clean up just one row. Therefore, autovacuum_vacuuum_threshold says that we need 20%, and this 20% must be at least 50 rows. Otherwise, VACUUM won't kick in. The same mechanism is used when it comes to the creation of optimizer statistics. We need 10% and at least 50 rows to justify new optimizer statistics. Ideally, autovacuum creates new statistics during a normal VACUUM to avoid unnecessary trips to the table.