Calculating checksums is the best way to be sure that a server and a slave contain exactly the same data as the server. This check can be used in two situations:
The second case is more complicated, because during the normal execution slaves can lag behind their masters. However, a tool explained next is able to perform this check automatically, by waiting until slaves reach a certain binary log event.
There are at least three methods to do this:
CHECKSUM TABLE
statementpt-table-checksum
toolSometimes we only want to check a relatively small subset of data. If so, instead of checking the whole tables, we can write a query that returns that data and calculate the MD5 checksum of the result set.
This statement returns the checksums for one or more tables. It has the following syntax:
CHECKSUM TABLE <table_list> [ QUICK | EXTENDED ]
The QUICK
option only takes effect with MyISAM and Aria tables. These storage engines calculate a live checksum for the tables that have been created with the CHECKSUM
or TABLE_CHECKSUM
option set to 1
. With the QUICK
option, CHECKSUM TABLE
returns the live value.
The EXTENDED
option calculates a checksum of the table by reading each individual row. This can be very slow.
If no option is specified, QUICK
takes effect.
Some storage engines do not support this statement. In this case, NULL
is returned.
If the checksum is 0
, the table is empty. Consider the following example:
MariaDB [test]> CHECKSUM TABLE customers, orders, products; +----------------+------------+ | Table | Checksum | +----------------+------------+ | test.customers | 0 | | test.orders | 2720624778 | | test.products | 3036305396 | +----------------+------------+ 3 rows in set (0.00 sec)
Similar to other tools that we have already discussed, pt-table-checksum
is included in the Percona Toolkit suite. Its purpose is to report a reliable checksum and the number of rows in each table. It does this without slowing down the server too much, even with big databases.
The pt-table-checksum
tool calculates a checksum for each table. Tables are never read with a long-running locking query. The pt-table-checksum
tool uses relatively small queries to divide tables in smaller subsets. Based on the server's response times, pt-table-checksum
composes queries that are not too heavy for its current workload. Moreover, it sets @@innodb_lock_wait
to 1 second at session level, so that it disconnects when a table is locked by another session for a long time.
By default, pt-table-checksum
also detects running slaves and connects to them to execute the checksums. The tool periodically executes a SHOW PROCESSLIST
statement to monitor the connected slaves. If some of them lag behind the master or disconnect, pt-table-checksum
waits until they recover. For this reason, this tool is probably the best way to periodically check the data integrity of the running servers.
After calculating the checksum for one table, the tool connects to the slaves and calculates the same checksum to verify that the tables are identical. Then, it prints out the checksum and the number of rows. It does not begin another table's checksum until this work is finished.
While
pt-table-checksum
has good fault tolerance, it could sometimes stop because of an error it cannot handle. In this case, it is possible to restart it with the --resume
option specified, so that the work it already did is not lost.
Calculating file checksums is a good way to check that the copy of a physical backup worked properly, before starting a slave. Linux systems usually include the md5sum
command, which reports the checksum of one or more files. For example:
root@this:/usr/local/mysql/data/open_fatture# md5sum --binary fornitori.frm fornitori.ibd 4582a1f51dea7980cb739b1a055d3ba7 *fornitori.frm de8ffec76f0303d0c129a536e015e14d *fornitori.ibd
The --binary
option informs md5sum
that the specified files contain binary data by default, or if the --text
option is specified. It treats the file contents as texts. This is only useful with CONNECT
and CSV
tables, and logfiles.
In some cases, we can write a query that only returns the recently inserted rows, and perhaps the ones that are modified recently. We can use such queries to quickly check that no error occurred while replicating recent data. To do this, we can use again the md5sum
Linux program.
Here is an example:
root@this:/usr/local/mysql# bin/mysql -uroot -proot --execute="SELECT * FROM gest_pescara.orders WHERE o_time > NOW() - INTERVAL 2 DAY;" | md5sum a50786099fb580c0dcb564323103bee2 -