Chapter 11. Data Sharding

In this chapter, we will discuss three important forms of data sharding provided by MariaDB:

  • Balancing the I/O over multiple storage devices
  • Implementing a simple cluster using FEDERATEDX or CONNECT
  • The SPIDER storage engine

Distributing files between multiple disks

The bottleneck of a database server is usually the I/O. Reading or modifying data that is not stored in the memory implies accessing a storage device. Of course, buying fast disks will speed up the I/O operations, and using SSD devices could be further optimization. However, any existing storage device is just too slow to satisfy a high number of accesses per second. The main way to diminish this problem is to properly configure the caches, as described in Chapter 6, Caches. However, the set of data that is often accessed might be too large for the RAM to contain it entirely. Also, the logs probably need to be written frequently. A good configuration can mitigate this problem, but database reliability always requires data to be written to disks.

Also, the capacity of the storage devices is limited. A big database cannot be entirely contained in one device. This section describes how to distribute physical files over multiple storage devices.

Determining the path of table files

When a table is created, its files are placed in the data directory of the server by default. Its path is determined by the @@datadir server variable, which is not dynamic and thus, can only be set in the options files or via the --datadir startup option.

It is possible to specify different paths for a table's data and index files using the DATA DIRECTORY and INDEX DIRECTORY table options. Specifying only DATA DIRECTORY does not affect the path of the index file.

If a specified path does not exist, an error is produced and the table is not created.

For storage engines that do not use a separate file for indexes, such as InnoDB, specifying INDEX DIRECTORY generates a warning shown as follows:

Warning (Code 1030): Got error 140 "Wrong create options" from storage engine InnoDB

Consider the following example:

CREATE TABLE chars (
  ch CHAR(1)
)
  ENGINE = MyISAM,
  DATA DIRECTORY = '/tmp',
  INDEX DIRECTORY = '/tmp';

The DATA DIRECTORY and INDEX DIRECTORY table options can only be specified on table creation. If we try to change them with an ALTER TABLE statement, we obtain a warning:

MariaDB [test]> ALTER TABLE chars DATA DIRECTORY = '/any/other/path';
Query OK, 0 rows affected, 1 warning (0.13 sec)    
Records: 0  Duplicates: 0  Warnings: 1
Warning (Code 1618): <DATA DIRECTORY> option ignored

For partitioned tables, the DATA DIRECTORY and INDEX DIRECTORY options can be set for individual partitions. This technique has been discussed in Chapter 10, Table Partitioning.

InnoDB can have a separate data directory. It can be specified in the @@innodb_data_home_dir variable, in the configuration file. By default, the data directory of MariaDB is also used by InnoDB.

As explained in Chapter 7, InnoDB Compressed Tables, InnoDB has a file-per-table mode. It is enabled if the @@innodb_file_per_table variable is set to ON, which is the default value in MariaDB 10.0 but not in older versions. If the file-per-table mode is ON, any table can be created with the DATA DIRECTORY option. If it is OFF, all tables are created in the system tablespace, and thus this option does not make sense. The system tablespace is stored in the data directory of InnoDB. Storing the system tablespace on a rotating disk and the separate tablespaces on SSD devices is good practice.

By default, the data directory also contains the server logs. These logs have been discussed in Chapter 2, Debugging and Chapter 3, Optimizing Queries, while Chapter 8, Backup and Disaster Recovery summarizes the logs that exist and how to change their path.

InnoDB logfiles

InnoDB keeps two special logs to implement transactions:

  • The undo log
  • The redo log

The undo log is used to rollback transactions if they fail, or in case they remain incomplete. The changes to data are always made to the data itself (in the cache or on the disk), not on separate copies. Thus, in case a transaction rolls back, the data that it modified must return to its previous state. The undo log contains a copy of the original data and the information that is necessary to restore it.

As explained in Chapter 6, Caches, all modifications are usually written to the buffer pool. Pages that are modified in-memory are called dirty pages. At some point, they need to be flushed to the disk in order to make the changes persistent. However, before they are flushed, a disaster might occur, for example, a MariaDB crash, a system crash, or a power failure. InnoDB must protect the data from similar accidents. To do so, it temporarily writes the modifications into the redo log on the disk. If a disaster occurs, when MariaDB is restarted, InnoDB will replay all the transactions that are recorded in the redo log.

These logs must be written regularly on a storage device. Also, long-running transactions lead to large transaction logs. For these reasons, it is sometimes a good idea to store them on separate devices.

Both these logs cause many nonsequential writes, thus they perform better when stored on SSD devices.

Configuring the undo log

By default, the undo log is written into the system tablespace, in the InnoDB data directory. To store it in a different path, perhaps on a different storage device, the following variables must be set:

  • @@innodb_undo_directory: This variable determines where the undo log files are created.
  • @@innodb_undo_tablespaces: This variable represents the number of tablespaces (files) in which the undo log is written. If its value is 0, the default, the undo log is written in the system tablespace. In all other cases, the undo log files are written in the path specified with @@innodb_undo_directory. The maximum value is 126.

Both the preceding variables are not dynamic, and must be set in the configuration files or via the --innodb_undo_directory and --innodb_undo_tablespaces startup options.

For performance reasons, we may also want to set @@innodb_undo_logs, which is a dynamic variable. If the performance_schema table regularly shows a mutex on the undo log, increasing the number of segments per log file is probably a good idea. The maximum value is 128.

Tip

Be aware that the number of undo log files never decreases while the server is running. This value should not be increased, unless we are sure that it is necessary to reduce the mutex contention. Testing new values on the development server before using them in production is good practice.

Configuring the redo log

The storage of redo log files is affected by the following variables:

  • @@innodb_log_group_log_dir: This is the path where the files are stored
  • @@innodb_log_files_in_group: This is the number of redo log files
  • @@innodb_log_file_size: This is the size of each individual file

These variables are not dynamic.

The redo log files have names starting with ib_logfile followed by a progressive number, starting from 0. InnoDB starts populating the first file until it reaches its maximum size. When the last file reaches the size limit, InnoDB reuses the first file. The total maximum size of all files has a limit, which is close to 512 GB. Setting the limit to 500 GB is safe.

By default, two redo log files are stored in the MariaDB data directory. They are called ib_logfile0 and ib_logfile1. Each of them has a 48 MB size limit.

While moving the redo log files to a separate device can reduce the I/O on the main disk, it is not recommended to change the number of log files.

Note

Note that large values for @@innodb_log_file_size, in MariaDB versions older than 5.5, caused InnoDB recovery to take a long time. This is no longer an issue on the modern versions of MariaDB.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset