As explained in the previous section, storage engines implement data handling at the physical level. They handle the data files, the data, and the index caches if they exist, and whatever is necessary to efficiently manage and read the data.
The .frm
files are an exception. For each table, one .frm
file exists. These files contain the definition of the table, and are created and used by the server.
Using the SHOW ENGINES
statement or querying the information_schema.ENGINES
table, it is possible to see the available storage engines. The following output is obtained with a standard MariaDB 10.0.6 installation:
MariaDB [(none)]> SELECT ENGINE, SUPPORT FROM information_schema.ENGINES G *************************** 1. row *************************** ENGINE: FEDERATED SUPPORT: YES *************************** 2. row *************************** ENGINE: MRG_MyISAM SUPPORT: YES *************************** 3. row *************************** ENGINE: CSV SUPPORT: YES *************************** 4. row *************************** ENGINE: BLACKHOLE SUPPORT: YES *************************** 5. row *************************** ENGINE: MEMORY SUPPORT: YES *************************** 6. row *************************** ENGINE: MyISAM SUPPORT: YES *************************** 7. row *************************** ENGINE: ARCHIVE SUPPORT: YES *************************** 8. row *************************** ENGINE: InnoDB SUPPORT: DEFAULT *************************** 9. row *************************** ENGINE: PERFORMANCE_SCHEMA SUPPORT: YES *************************** 10. row *************************** ENGINE: Aria SUPPORT: YES
A list of available engines will be displayed along with a SUPPORT
column that indicates whether the engine is available.
When a table is created, a storage engine should be specified. If not, the default storage engine will be used. The default storage engine is specified in the storage_engine
system variable, as showed in the following example:
MariaDB [(none)]> SELECT @@global.storage_engine; +-------------------------+ | @@global.storage_engine | +-------------------------+ | InnoDB | +-------------------------+
The TABLES
table in the information_schema
database has a column called ENGINE
, which can be read to check which storage engine is used for a particular table, shown as follows:
MariaDB [(none)]> SELECT ENGINE FROM information_schema.TABLES WHERE TABLE_SCHEMA='test' AND TABLE_NAME='t1'; +--------+ | ENGINE | +--------+ | InnoDB | +--------+
A brief description of the available storage engine follows the preceding code.
InnoDB became the default engine with MariaDB 5.5 and MySQL 5.5. Percona maintains an InnoDB fork called XtraDB; it is InnoDB with bug fixes applied by Percona, and some unique features (mainly for performance and monitoring). By default, MariaDB uses XtraDB. For compatibility with InnoDB and MySQL, the commands still mention InnoDB but the XtraDB fork is used instead. However, this behavior can be changed by compiling the server with InnoDB, instead of XtraDB. This is not necessary since any new code can come with new bugs or unexpected performance problems, and XtraDB is not an exception.
InnoDB is a high-performance, general-purpose storage engine that supports transactions with savepoints, XA transactions, and foreign keys. Savepoints are intermediate states that can be saved in the middle of a transaction and can then be restored if necessary. XA is a special type of transaction designed for operations that involve multiple resources, not necessarily SQL databases. In most cases, InnoDB performance is better than other engines. For this reason, this book will focus on XtraDB, which will be used for examples where another engine is not explicitly specified. For simplicity, XtraDB will be generally called InnoDB, except when describing features that are not supported by InnoDB.
InnoDB transactions are implemented via a complex locking system and undo logs. Each lock involves a single row or a range of rows; rows are identified using index records. Undo logs are used to rollback transactions when necessary, and can be stored in the system tablespace or elsewhere.
This storage engine is developed by Tokutek and has been included in MariaDB since Version 5.5, though it must be installed and enabled separately. It supports transactions with savepoints, XA transactions, but not foreign keys and full-text indexes. It is very different from InnoDB. Its main peculiarity is the use of a new data structure for indexes: the fractal trees. They are very similar to the commonly used B-trees, but each node has a buffer. This buffer contains the changes that need to be applied to the nodes that are more in-depth. Only when the buffer is full are the changes applied altogether. If the changes need to be written to disk, this is an important optimization, because writing fewer and bigger blocks is usually much faster. Also, this is not a problem with fractal trees fragmentation.
Another important feature of TokuDB is data compression. Of course, its compression level depends on the dataset, but it is generally much higher than the one provided by other storage engines. This happens because the write operations are grouped together. Data compression is always used in TokuDB and cannot be disabled.
Fractal trees and compression make TokuDB suitable to work with datasets that are too big to be entirely stored in memory. For such workloads, TokuDB can be faster than InnoDB. For most purposes, TokuDB offers reduced performance and has fewer features.
MyISAM was historically the default storage engine for MySQL and MariaDB, before Version 5.5. It is a relatively simple engine, optimized for read-heavy workloads where there are just a few writes or no writes at all. In practice, MyISAM is good for data warehousing and more generally for data reporting where data can be appended to tables, but not modified or deleted.
MyISAM writes two files for each table: a data
file and an index
file. The index
file can always be rebuilt if it gets damaged for some reason. Copying data
files (and the .frm
files), even across different machines, is sufficient to back up and restore MyISAM tables.
Three data formats are available: FIXED
, DYNAMIC
, and COMPRESSED
. The FIXED
data format assigns a fixed length to columns, while DYNAMIC
saves space when possible. The FIXED
data format is faster, more reliable, and harder to fragment. The COMPRESSED
data format is used to create small read-only tables.
Aria is designed to be MyISAM's successor. It uses logs that allow data recovery after a crash. Data changes are atomic in Aria; they are applied entirely, or the table is damaged. Aria uses a different data format called PAGE
that is generally faster and never fragments too much, but it is possible to use the FIXED
or DYNAMIC
formats for compatibility with MyISAM (where the table will not be crash-safe).
Aria can be better than MyISAM in environments where there is concurrency, and the MariaDB Knowledge Base suggests using Aria for new applications. Yet, users should be aware that bulk writes are slower in Aria, particularly where duplicate indexed values exist.
Both MyISAM and Aria do not support transactions and foreign keys, but as explained previously, each statement on an Aria table can be considered a transaction. Even full-text indexes are supported by MyISAM and Aria.
The MRG_MyISAM
storage engine, also called MERGE
, can be used to build a table on multiple MyISAM identical tables, to work around the file size limit of the operating system.
The storage engines described up to this point are of general purpose, even if some of them are only suitable for some particular workloads. Other storage engines use non-standard input or output methods, or process queries in a non-standard way, and thus are used for very specific purposes described as follows:
gzip
format. A CONNECT
table can also be used to transform data contained in other tables, for example, to merge tables or reorganize data into a pivot table. This storage engine will probably obsolete some older storage engines: CSV, which accesses the CSV files; FEDERATED, an engine inherited from MySQL, which can access tables from other MariaDB or MySQL instances; and FEDERATEDX, added in MariaDB because FEDERATED was no longer maintained.seq_1_to_10
virtual table returns a result set with numbers from 1
to 10
. The seq_1_to_10_step_2
virtual table returns a similar series, but with an increment of 2
.performance_schema
databases. The only reason why a database administrator (DBA) should be aware of it is that a specific statement exists to check how much memory is consumed by the performance_schema
by using the SHOW ENGINE performance_schema STATUS
command.