In this section we will see which configuration parameters are required on the master and slave servers.
All the following explanations also apply to the multisource replication environments. However, there are some differences that will be explained after the common tasks.
Setting up a replication environment requires at least the following steps:
These tasks, along with other useful topics, will be explained in this section.
When setting up a replication environment, the first thing to do is of course to set up at least one master. This is an easy task, as the master is just a normal MariaDB server that has a unique ID and maintains a binary log.
First, a server ID needs to be set on all masters and slaves. The server ID needs to be unique. It must be an integer value of 4 bytes with the minimum value 1
. If the server ID is not set, or it is set to 0
, the replication is disabled.
Also, as explained previously, the binary log must be enabled on the master server. This is needed to record the events that will be sent to the slaves.
A master's configuration file needs to contain lines similar to the following:
server-id=1 log_bin="binlog" binlog_format=STATEMENT
The @@server_id
and @@binlog_format
variables are dynamic; thus they can simply be changed at runtime without stopping the server:
MariaDB [(none)]> SET @@server_id = 1; Query OK, 0 rows affected (0.00 sec) MariaDB [(none)]> SET @@binlog_format = 'MIXED'; Query OK, 0 rows affected (0.03 sec)
The @@log_bin
variable is not dynamic, so enabling the binary log requires a restart.
Then, we need to create at least one account for replication slaves. Strictly speaking, a slave only needs the REPLICATION SLAVE
privilege to work properly. If some databases must not be replicated, this permission should not be granted on them. It is possible for all the slaves to use the same user; they could even use an account that is shared with other clients. However, since the password is stored in clear text in a file called master.info
, it is generally better for them to use different passwords. Also, to improve the security of data, each slave user should only be able to connect from a specific hostname. It could also be a good idea to force the slaves to connect using SSL.
Here is an example that shows how to create a secure replication account:
MariaDB [test]> CREATE USER 'mslave1'@'host10' IDENTIFIED BY 'somepwd'; Query OK, 0 rows affected (0.00 sec) MariaDB [test]> GRANT REPLICATION SLAVE ON *.* TO 'mslave1'@'host10' REQUIRE SSL; Query OK, 0 rows affected (0.00 sec)
Note that the REPLICATION SLAVE
permission must not be confused with REPLICATION CLIENT
. This permission should be granted to the user who performs replication configuration and diagnostics, because it allows you to execute the SHOW MASTER STATUS
and SHOW SLAVE STATUS
statements. On production servers, root
should not be used unless we really need all the permissions for the current session.
For details about security and user management, see Chapter 5, Users and Connections.
After setting up one or more master, we need to properly configure the slaves. This step is required when setting up a replication environment, and every time we want to add a new slave.
Similarly to the masters, each slave needs to have a unique server ID.
It is not necessary to enable the binary logging on the slaves, unless a slave should also act as a master of another slave. If this is not the case, the binary log can still be useful for taking backups, but it affects the performance.
The following example shows the most minimalist configuration settings required for a slave to work, in the configuration file:
server-id=2
The slave might also act as a master of one or more slaves. In this case, the slave needs to log into the binary log the events that it receives from its masters, so that its slaves will be able to retrieve them. However, the replicated events are not logged by default. To replicate them, we must enable the binary log and set the @@log_slave_updates
server variable to ON
. This variable is not dynamic, so changing its value requires a server restart.
The following example shows a minimalist configuration for a slave that also acts as a server:
server-id=2 log_bin="binlog" binlog_format=STATEMENT log_slave_updates=ON
If a master has no data yet, setting up a replication between the master and a slave is very straightforward. After starting the master and the slave as explained previously, we will follow this procedure:
CHANGE MASTER TO
statement.The following example shows how to obtain the master's binary log coordinates:
MariaDB [(none)]> FLUSH TABLES WITH READ LOCK; Query OK, 0 rows affected (0.04 sec) MariaDB [(none)]> SHOW MASTER STATUS; +---------------+----------+--------------+------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | +---------------+----------+--------------+------------------+ | binlog.000031 | 323 | | | +---------------+----------+--------------+------------------+ 1 row in set (0.02 sec) MariaDB [(none)]> UNLOCK TABLES; Query OK, 0 rows affected (0.00 sec)
In this case, the coordinates are the binlog.000031
logfile and the position 323
. The slave will start to replicate the data from these coordinates.
The CHANGE MASTER TO
statement can be used to tell the slave which master it has to replicate. The following basic example shows how to do this:
MariaDB [(none)]> CHANGE MASTER TO -> MASTER_HOST = '162.100.100.100', -> MASTER_USER = 'slave01', -> MASTER_PASSWORD='somepwd', -> MASTER_PORT = 5000, -> MASTER_LOG_FILE = 'binlog.000031', -> MASTER_LOG_POS = 3234; Query OK, 0 rows affected (0.23 sec)
At this point, the slave knows all it has to know about replication, the necessary permissions on the master are set, and both are running. To start replication, we will just use the START SLAVE
statement:
MariaDB [(none)]> START SLAVE; Query OK, 0 rows affected (0.02 sec)
The parameters of the connection between the slave and the server can be changed in the future. However, while this is done, the slave must be temporarily stopped. For example:
MariaDB [(none)]> STOP SLAVE; Query OK, 0 rows affected, 1 warning (0.00 sec) MariaDB [(none)]> CHANGE MASTER TO MASTER_PASSWORD = 'my_new_pwd'; Query OK, 0 rows affected (0.22 sec) MariaDB [(none)]> START SLAVE; Query OK, 0 rows affected (0.00 sec)
If the server is restarted, the slave thread will be restarted too.
We can check the running slave threads with a SHOW SLAVE STATUS
query. For example:
MariaDB [(none)]> SHOW SLAVE STATUS G *************************** 1. row *************************** Slave_IO_State: Connecting to master … Slave_IO_Running: Connecting Slave_SQL_Running: Yes … Last_Error: … 2 rows in set (0.02 sec)
The output of this statement has many columns, so we truncated it. We left the columns that are important to verify that a slave is running:
Column name |
Description |
---|---|
|
The current state of the I/O thread, as shown in |
|
Shows if the I/O thread is connected and working. |
|
The current state of the SQL thread, as shown in |
|
The last error encountered by the slave. |
The Slave_IO_Running
and Slave_SQL_Running
columns should be set to Yes
. If either of them is not set to Yes
, Slave_IO_State
and Last_Error
help us identify the problem. For example, if a connection error occurs, we will see an error similar to the following:
error connecting to master '[email protected]:3310' - retry-time: 10 retries: 86400 message: Can't connect to MySQL server on '127.0.0.1' (111 "Connection refused")
Sometimes we want to reconfigure an existing slave. Probably we restored an old database into the master, or the filtering rules were not set correctly. But, whatever the reason is, we want a slave to forget the data it replicated until now and start again.
After the restart, the slave will need to replicate the master's binary log from the start. So, it must forget the current coordinates. The RESET SLAVE
statement does the trick by deleting the current slave log files. This cannot be done while the slave is running, so it needs to be temporarily stopped.
Consider the following example:
MariaDB [(none)]> STOP SLAVE; Query OK, 0 rows affected (0.17 sec) MariaDB [(none)]> RESET SLAVE; Query OK, 0 rows affected (0.00 sec) MariaDB [(none)]> START SLAVE; Query OK, 0 rows affected (0.18 sec)
Sometimes, we want to replace an old master with a new one. Probably the older master was slow, or its hardware was damaged. But, whatever the reason is, we will need to load the old master's data into the new master's data.
We can choose to do this when the slaves are already running and connected to the master. In this case, we will use a dump of the data. In Chapter 8, Backup and Disaster Recovery, we discussed how to take this kind of backup using mysqldump
and how to load the dump into a server. The dump will be automatically replicated by the slaves.
An alternative to using a logical backup is copying the files. In this case, the backup will not be automatically replicated, if the slaves are already running. We will need to copy the physical backup into the slave's data
directory, too. In this way, they will be provided with the old master's data before starting to replicate the new master.
A replication environment is not meant to be static. It is possible to add slaves at any time, to obtain more redundancy or to balance the workload across a higher number of MariaDB servers. In this case, we will need to load the current master data into the new slaves. Then, the slave can start to replicate the master.
It is possible to create a physical backup and copy it into the slave's data
directories as explained in Chapter 8, Backup and Disaster Recovery. There is no difference between normal physical backups and backups used for replication.
It is also possible to use mysqldump
. This tool has already been discussed, but it has some parameters that make the dump easier to restore into a slave. The dump can be taken from the master or from a slave. The latter option is useful to avoid overloading the master.
When taking a backup from a master that must be restored into a slave, the --master-data
option is particularly useful. It adds a CHANGE MASTER TO
statement to the dump, so that slaves are automatically set up to replicate the master from the proper coordinates.
The --delete-master-logs
option executes the PURGE BINARY LOGS
statement on the master. This statement is used to delete the old binary log files and will be discussed later in this chapter.
If the master already has at least one slave, the dump can be taken from the slave to avoid the execution of the long locking queries on the master.
The main parameter that we will want to use is --dump-slave
. It is very similar to --master-data
, but they produce different CHANGE MASTER TO
statements. If we use --master-data
, mysqldump
assumes that the statement will be used to replicate data from the server it is connecting to. Instead, --dump-slave
assumes that this server is a slave, and the CHANGE MASTER TO
statement will be used to replicate its server. This difference is very important.
Note that, while --dump-slave
includes the replication coordinates in the dump, it does not include the master's hostname and port. Thus we may want to specify the --include-master-host-port
option, which ensures the MASTER_HOST
and MASTER_PORT
clauses are included in the CHANGE MASTER TO
statement. This option makes the configuration of a slave even easier, unless the dump is created on another slave, or the master's hostname or port is changed for some reason.
With --apply-slave-statements
the dump will contain STOP SLAVE
and START SLAVE
before and after CHANGE MASTER TO
. This is useful if the target slave is already running. If it is not running yet, the START SLAVE
statement will make the replication start immediately. We do not want it to start immediately if we want to test the consistency of data before the slave starts its work.
It is possible to avoid replicating some statements. This can be done on a master or on individual slaves.
The
SET SQL_LOG_BIN
statement can be used to enable or disable the logging of subsequent statements into the binary log. It only affects the current session. Statements that are not written in the master's binary log cannot be replicated by any slave.
Consider the following example:
MariaDB [test]> SET SQL_LOG_BIN = 0; Query OK, 0 rows affected (0.00 sec) MariaDB [test]> /* this statement will not be logger or replicated */ -> DROP TABLE orders; Query OK, 0 rows affected (0.38 sec) MariaDB [test]> SET SQL_LOG_BIN = 1; Query OK, 0 rows affected (0.00 sec)
The @@skip_replication
session variable is similar to the SET SQL_LOG_BIN
statements. However it does not inhibit the logging of statements; it just causes those statements to be flagged as skip_replication
in the binary log. The slaves will receive such events. Their behavior depends on the value of the @@replicate_events_marked_for_skip
variable:
REPLICATE
: This causes the events to be replicated: the flag is simply ignored. This is the default value.FILTER_ON_SLAVE
: This causes the slaves to ignore such events. They will still receive them and log them, if the binary log is enabled. The flag is preserved on the slave's binary log.FILTER_ON_MASTER
: With this the slave will not receive the events at all.The slaves have three dynamic variables that can be used to prevent some tables, or some complete databases, from being replicated. A comma-separated list of arguments can be provided to these variables. Before changing their values, it is necessary to stop the slave. Such variables are as follows:
@@replicate_skip_db
: This prevents the replication of the specified databases. This option does not affect multidatabase statements: to prevent them, @@replicate_skip_table
must be used instead.@@replicate_skip_table
: This prevents the replication of the specified tables. The names should be specified in the following form: db_name.table_name
.@@replicate_wild_skip_tables
: This is similar to @@replicate_skip_table
, but the use of the %
and _
wildcard characters is allowed. These characters have the same meaning they have for the LIKE
operator. For example, the following value prevents the replication of any table in any database whose name starts with "test": test%.%
.As an alternative, it is possible to disallow replication for all tables and databases, except for the specified subsets. This can be done using the following variables, which are the complement of the ones discussed earlier:
@@replicate_do_db
@@replicate_do_table
@@replicate_wild_do_tables
Since MariaDB 5.3, it has been possible to write each event's checksum into the binary log. This feature is not enabled by default, because it modifies the binary log format, adding an incompatibility. To write checksums, we can set the @@binlog_checksum
variable to 1
(not ON
). Enabling this option makes the replication more reliable, but we should avoid it when performance of the slaves is a problem.
It is possible to verify the checksums in several situations:
@@master_verify_checksum
is set to 1
. By default, it is 0
.@@slave_sql_verify_checksum
is set to 1
, which is the default value.mysqlbinlog
utility (discussed in Chapter 8, Backup and Disaster Recovery) verifies the checksums if it is invoked with the --verify-binlog-checksum
option.All the variables that affect the binary log events checksums are dynamic.
Usually, a database server processes requests from several clients at the same time. As far as these requests do not lock rows or tables to guarantee data integrity, they are processed simultaneously. In Chapter 5, Users and Connections, we discussed how simultaneous connections are handled. However, before MariaDB 10.0, slaves used only one thread to replicate all the events they received from the master. Because of this limitation, the write operations were much slower on the slaves, especially in environments where a master could execute many non-blocking writes. Parallel replication solves this issue by using multiple parallel threads to apply replication events.
As mentioned earlier, parallel replication is not used by default. To enable it, we must set the
@@slave_parallel_threads
server variable to a value higher than 0
, on the master. This value is the number of parallel worker threads that will be used by all the slaves.
The following variables are only meaningful when used with parallel replication.
The @@slave_parallel_max_queued
variable determines the amount of memory that the slaves must use to cache the next not-yet-executed relay log events. When at least the worker thread is free, the slaves examine this cache looking for events that can be executed in parallel.
The @@slave_domain_parallel_threads
variable is useful when using parallel replication on a slave that replicates multiple masters. Imagine that a slave replicates three masters. Imagine that one of them (call it master1) executes a statement that takes several hours. This can happen with big tables. The slave will need to replicate this statement. Meanwhile, initially all connections will be able to allocate worker threads. But the worker threads associated to master1 will need to wait until the very long-running statement has been executed. When no more worker threads are free, the other connections will not be able to benefit from parallel replication anymore.
The purpose of @@slave_domain_parallel_threads
is to prevent a single master connection from monopolizing the pool of threads. This value determines the maximum number of threads that can be allocated by the same master connection. If the value is not lower than @@slave_parallel_threads
, it has no effect. But if it is lower than @@slave_parallel_threads
divided by the number of master connections, some worker threads will never be used. The value should be left as high as possible to avoid preventing a master connection from allocating a thread when not necessary.
In real cases, finding the optimal value for @@slave_parallel_threads
can be complex. It is highly dependent on the characteristics of the master's workloads. As a general rule, we can start with a value that is slightly lower than @@slave_parallel_threads
, and then lower the value if a problem occurs because of long-running statements.
All these variables are dynamic, so changing the configuration of the parallel replication does not require the master to be restarted. However, the slaves should be stopped temporarily.
There are several reasons why we might want to have a delayed slave. For example, we may want to have a slave with a 30 minute delay to recover from errors. Alternatively, if we accidentally drop a table, we might have 30 minutes to recover it from the slave before the deletion is replicated. Or, we may have a slave with a delay of a day, week, or month. It would allow us to compare the most recent database with an older version, to find the recent structure or data modifications.
MariaDB does not natively support delayed replication. However, Percona Toolkit contains a tool that implements this feature on the client side: it is called pt-slave-delay
. It is necessary to invoke the tool separately for each slave that needs to be delayed. This tool works by connecting to the slave and periodically checking how much the slave is lagging behind the master. When the delay is too low, pt-slave-delay
stops the slave for a while. It is possible to only cause the delay for a given period of time, or permanently.
The following options are also very important:
delay
: This specifies the desired amount of delay. The --interval
option determines the time interval between the checks.--run-time
: This determines for how much time the tool will run; if not specified, it will never terminate spontaneously. By default, when the program terminates, the slave will be restarted, if it is not running. This applies even if the program is terminated by pressing Ctrl + C. To change this behavior, and stop the slave on exit if it is running, we can use the --continue
option.The time options can be specified with a number followed by a letter that represents the time unit. For example, 30m means 30 minutes.
An invocation example is as follows:
pt-slave-delay --delay 3h --interval 10m -uroot -proot