A backup of a single table can be a delimited text file: a human-readable text file in which the column values are separated by a specific character. The most common example is the CSV format, where the values are separated by a comma. MariaDB supports the following ways to create delimited text files:

- The mysqldump command with the --tab option
- The SELECT … INTO OUTFILE statement
- The CSV storage engine
- The CONNECT storage engine

MariaDB also supports the following methods to restore a delimited text backup:

- The mysqlimport command
- The LOAD DATA INFILE statement
- The CSV storage engine
- The CONNECT storage engine

The mysqldump command, when invoked with the --tab
option, produces two files for each dumped table. The name of these files is the name of the original table, followed by an extension. One file has a .sql
extension, and it contains the CREATE TABLE
statement that is necessary to recreate the empty table. Of course, this file needs to be executed first to restore a dump unless the table already exists. The other file has a .txt
extension, and it contains a delimited text backup. By default, a tab character is used to separate values, and a newline character is used to separate lines; however, these characters, as well as many other file characteristics, can be configured using mysqldump options. These options will be discussed later in this section, because they are shared by multiple tools and statements.
The --tab
option specifies the path where the .sql
and .txt
files will be located. For example: --tab=/tmp/backup.
The mysqlimport
command is a tool that is complementary to mysqldump
and can be used to import delimited text backups. Like mysqldump
, it is included in all MariaDB distributions and is located in the bin
directory. Its syntax is as follows:
mysqlimport [options] db_name file [file …]
A database name must be specified so that mysqlimport
knows where the table is stored. Then, at least one file to import must be specified. The base name of the file must be the same as the table being referred to. The extension, if it exists, is not relevant (it can be .csv
, .txt
, or whatever we prefer). This has an interesting consequence: two files with the same base name and different extensions refer to the same table. For very big tables, it could be convenient to split the rows into multiple files. The mysqlimport
command also has several options that can be used to specify which characters are used to separate columns and rows, and other characteristics of the files to import. These options are described later in this section.
By default, the delimited text file is expected to be located on the server. If mysqlimport
is executed remotely, the file can also be located in the client; in this case, the --local
option must be specified.
It is possible to skip the first lines in the source file by specifying the --ignore-lines
option, for example, --ignore-lines=1
. This is useful when the first line is composed of the columns names, or the file begins with some informative lines (like the timestamp of its creation or the name of the software that produced it).
Another important option is --delete
, which empties the tables before importing the rows.
In case of duplicate values, the --replace
option causes the imported rows to replace the existing rows in the tables, while the --ignore
option leaves the existing rows untouched but avoids producing an error.
When importing a large amount of data, it may be useful to do so in parallel. The --use-threads
option specifies how many threads must be used to import data. For example, if we specify --use-threads=2
, mysqlimport
will use two threads.
The SELECT
statement has an INTO OUTFILE
clause, which causes the result set to be written into a file. By default, the file is saved in the directory of the default database, under the data directory. However, a path can be specified along with the filename. Remember that the MariaDB user needs to have the FILE
privilege to write or read files. Also, the system user used by MariaDB (which is usually mysql
) needs to have write access to the directory where the file is saved. On Linux systems, the /tmp
directory is usually a good candidate. Beware of the fact that, if the file already exists, an error is produced. Note, however, that the file needs to be located on the server. With SELECT … INTO OUTFILE
, we cannot create a file on the client or on any other host.
The result set is not sent to the client; the client only receives the number of rows written (or an error).
Here is a simple example:
SELECT *
    FROM information_schema.TABLES
    ORDER BY TABLE_SCHEMA, TABLE_NAME
    INTO OUTFILE '/tmp/tables.txt';
By default, a tab character is used to separate columns values, and a new line character is used to separate rows. Several clauses exist to use different separators or other file characteristics. They will be discussed later, in the Separator options and clauses section, together with the corresponding options of mysqldump
and mysqlimport
.
The SELECT … INTO OUTFILE
statement is mostly used to exchange data between servers, or between MariaDB and other software. Using it to create a backup is not very common. However, this method proves its flexibility when we want to back up only a subset of a table's data, for example, using JOIN
operations or WHERE
clauses.
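For example, a minimal sketch of such a partial backup (the date filter is illustrative, and the customer table is the one used later in this chapter):

```sql
-- Back up only the customers hired from 2011 onwards
SELECT id, hire_date, first_name, last_name
FROM customer
WHERE hire_date >= '2011-01-01'
INTO OUTFILE '/tmp/customer_recent.txt';
```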
Running a SELECT … INTO OUTFILE
statement with the default separators is the same as running a query directly from the command line using the following syntax:
mysql -e "SELECT …" > file_name
Consider the following example:
mysql -e "SELECT * FROM information_schema.TABLES ORDER BY TABLE_SCHEMA, TABLE_NAME" > /tmp/tables.txt
This syntax is very convenient when we want to save the result set of a query into a file from a shell script. Also, this allows creating a delimited text file on the client, instead of the server.
Sometimes, we want to obtain the SQL statement that allows us to recreate a table's structure, and not only its data. The statement that does this is very simple; here is an example:
MariaDB [test]> SHOW CREATE TABLE customer \G
*************************** 1. row ***************************
       Table: customer
Create Table: CREATE TABLE `customer` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `hire_date` date NOT NULL,
  `first_name` varchar(50) DEFAULT NULL,
  `last_name` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
If we need this dump, we probably want to obtain the statement that allows recreating the database, shown as follows:
MariaDB [test]> SHOW CREATE DATABASE test;
+----------+---------------------------------------------------------------+
| Database | Create Database                                               |
+----------+---------------------------------------------------------------+
| test     | CREATE DATABASE `test` /*!40100 DEFAULT CHARACTER SET utf8 */ |
+----------+---------------------------------------------------------------+
1 row in set (0.00 sec)
Using both these statements, we will be able to recover data from a SELECT … INTO OUTFILE
statement, if the database has been accidentally destroyed. More commonly, these statements allow the exchange of data between servers.
SHOW CREATE TABLE
and SHOW CREATE DATABASE
do not support a clause to save their results into a file, as SELECT does. However, we can write a simple script to do this.
The LOAD DATA INFILE
statement is complementary to SELECT … INTO OUTFILE
. It loads data from a delimited text file into an existing table.
The general syntax is:
LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file'
    [REPLACE | IGNORE]
    INTO TABLE tab_name
    [PARTITION (p_name, …)]
    [CHARACTER SET charset]
    [other_options]
    [IGNORE n {LINES | ROWS}]
    [(column, …)]
    [SET column = expr, …]
In this syntax description, the options that specify the separator characters are grouped in the other_options
placeholder. They are identical to the ones used for SELECT … INTO OUTFILE
and will be discussed later in this section.
The LOW_PRIORITY
and CONCURRENT
clauses are only useful with non-transactional tables. The LOW_PRIORITY
clause causes the statement to have a lower priority than the read operations. The
CONCURRENT
clause means that the MyISAM concurrent inserts should be used. Both these clauses may slow down the statement itself, but they will not block concurrent statements: queries from other sessions will have the priority if LOW_PRIORITY
is used, and concurrent inserts will be allowed if CONCURRENT
is used.
The LOCAL
clause means that the specified file must be sent by the client to the server. If this clause is not specified, the file is supposed to be on the server. In this case, the MariaDB user needs to have the FILE
privilege. Also, the system user used by MariaDB needs permissions to read the file. Another difference is that, with the LOCAL
keyword, duplicate key errors are turned into warnings and do not abort the whole operation.
The REPLACE
and IGNORE
clauses are used to handle duplicate values. With REPLACE
, the new rows replace the existing rows. With IGNORE
, the existing rows are left untouched, and no duplicate key error is produced.
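As a minimal sketch (the file path is illustrative), re-importing a dump over existing data might look like this:

```sql
-- Existing rows with the same primary key are replaced
-- instead of producing a duplicate key error
LOAD DATA INFILE '/tmp/customer.txt'
REPLACE INTO TABLE customer;
```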
The INTO TABLE
clause specifies the target table and, optionally, one or more target partitions.
It is possible to skip the first lines in the source file by specifying IGNORE n LINES
. This is mainly useful if the first line of the file contains the column headers.
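For example, a hedged sketch, assuming a file whose first line holds the column names:

```sql
-- Skip the header line before loading the rows
LOAD DATA INFILE '/tmp/customer.txt'
INTO TABLE customer
IGNORE 1 LINES;
```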
By default, the server assumes that the columns are ordered in the same way both in the file and in the table. The order of a table's columns is the order in which they appeared in the CREATE TABLE
statement unless ALTER TABLE
explicitly changed their order. The order can be seen with a simple DESC
statement (which shows a table's columns).
The order can (and usually should) be explicitly specified between the parentheses in the same way we specify them in the INSERT
statements.
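A minimal sketch, assuming the file stores the columns in a different order than the table definition:

```sql
-- The column list maps each field in the file
-- to the corresponding table column
LOAD DATA INFILE '/tmp/customer.txt'
INTO TABLE customer
(first_name, last_name, hire_date, id);
```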
It is also possible to populate one or more columns with a calculated value with the SET
clause. For example, if a product
table has a price
column; it could also have a sales_tax
column that is a fixed percentage of price
. Since MariaDB supports VIRTUAL
and PERSISTENT
calculated columns, there is usually no need to insert the calculated value with LOAD DATA INFILE
. But we may still want to insert those values for some reason: perhaps the database was designed years ago, when such features did not exist, and we do not want to modify it; perhaps we need MySQL compatibility; or perhaps the expression that calculates the values is not deterministic and therefore cannot be used for a VIRTUAL column. The LOAD DATA INFILE statement has a SET clause that can be used to insert calculated values:
SET sales_tax = price / 100 * 20
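Putting it together, a hedged sketch (the product table, its columns, and the file path are illustrative; the 20 percent rate matches the expression above):

```sql
-- Load the products and compute sales_tax on the fly
LOAD DATA INFILE '/tmp/product.txt'
INTO TABLE product
(id, name, price)
SET sales_tax = price / 100 * 20;
```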
The mysqldump
and mysqlimport
command-line tools, and the SELECT … INTO OUTFILE
and LOAD DATA INFILE SQL
statements, have a set of options that can be used to specify the characters used to: separate values, enclose strings, escape special characters in strings, and separate rows.
These options are the same for all these tools, except that the SQL syntax is slightly different, and it is slightly more flexible for line separators.
The following table shows the options' syntax and their meaning:

| The mysqldump and mysqlimport options | The SELECT … INTO OUTFILE and LOAD DATA INFILE clauses | Description |
|---|---|---|
| --fields-terminated-by | FIELDS TERMINATED BY | Values are separated by this sequence of characters. |
| --fields-enclosed-by, --fields-optionally-enclosed-by | FIELDS [OPTIONALLY] ENCLOSED BY | String values are quoted using the specified character. With the OPTIONALLY keyword, the quotes can be omitted when they are unnecessary. |
| --fields-escaped-by | FIELDS ESCAPED BY | The specified character is used to escape the special characters defined with the other options, as well as the escape character itself. |
| (no option) | LINES STARTING BY | Lines are expected to begin with this sequence of characters. This is only used by the SQL statements. |
| --lines-terminated-by | LINES TERMINATED BY | Lines are separated by this sequence of characters. |
In SQL statements, when using multiple field or line clauses, the FIELDS
and LINES
keywords must not be repeated. For example, a correct syntax is FIELDS TERMINATED BY ',' ESCAPED BY '|'
. All the SQL clauses are optional; if they are present, they must appear in the same order that is used in the preceding table. Here, COLUMNS
is a synonym for FIELDS
.
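As a hedged sketch, here are several of these clauses combined in a single statement (note that FIELDS and LINES each appear only once):

```sql
SELECT * FROM customer
INTO OUTFILE '/tmp/customer.csv'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\'
LINES TERMINATED BY '\n';
```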
We discussed how to create a dump file and how to restore it if necessary, using both SQL statements and command-line tools. Now, let's see a simple example. We will see how to create a logical backup of a table using SELECT INTO OUTFILE
, and then we will restore the data with LOAD DATA INFILE
.
First, let's create a small table with some example rows:
MariaDB [test]> CREATE TABLE customer (
    ->   id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    ->   hire_date DATE NOT NULL,
    ->   first_name VARCHAR(50),
    ->   last_name VARCHAR(50)
    -> )
    -> ENGINE = InnoDB,
    -> CHARACTER SET = 'utf8';
Query OK, 0 rows affected (0.41 sec)

MariaDB [test]> INSERT INTO customer (hire_date, first_name, last_name) VALUES
    -> ('2011-05-07', 'David', 'Coverdale'),
    -> ('2010-01-20', 'Ritchie', 'Blackmore'),
    -> ('2012-11-15', 'Ian', 'Paice'),
    -> ('2011-06-01', 'Jon', 'Lord'),
    -> ('2010-02-28', 'Roger', 'Glover');
Query OK, 5 rows affected (0.10 sec)
Records: 5  Duplicates: 0  Warnings: 0
Now, let's create a delimited text file using mysqldump
and check that the last rows are correct. The table is very small, so we could easily examine the whole file; in a more realistic case, however, checking only a few rows is more practical. We will also check that the table files exist, shown as follows:
root@this:/usr/local/mysql# mysqldump -uroot -proot test customer --tab=/tmp --fields-terminated-by=, --fields-enclosed-by="'" --fields-escaped-by=/
root@this:/usr/local/mysql# ls /tmp
customer.sql  customer.txt
root@this:/usr/local/mysql# tail --lines 3 /tmp/customer.txt
'3','2012-11-15','Ian','Paice'
'4','2011-06-01','Jon','Lord'
'5','2010-02-28','Roger','Glover'
Everything seems to be okay. Now, let's back up the same table with SELECT … INTO OUTFILE
. The statement we are going to issue is equivalent to the former example, except that it does not generate a table definition file. Consider the following code snippet:
MariaDB [test]> SELECT *
    -> FROM customer
    -> INTO OUTFILE '/tmp/customer.2.txt'
    -> FIELDS
    ->   TERMINATED BY ','
    ->   ENCLOSED BY '\''
    ->   ESCAPED BY '/';
Query OK, 5 rows affected (0.00 sec)
Now, we want to check that the files produced are identical:
root@this:/usr/local/mysql# md5sum /tmp/customer.txt
d6b2c04587f9dc56a82a8b9784abe5fe  /tmp/customer.txt
root@this:/usr/local/mysql# md5sum /tmp/customer.2.txt
d6b2c04587f9dc56a82a8b9784abe5fe  /tmp/customer.2.txt
Since the MD5 sums of the two files are identical, we can assume that the files are identical too.
Before trying to restore the file, we need to empty the table:
MariaDB [test]> TRUNCATE TABLE customer;
Query OK, 0 rows affected (0.25 sec)
Now, let's restore the table from a delimited text file:
root@this:/usr/local/mysql# bin/mysqlimport -uroot -proot --fields-terminated-by=, --fields-enclosed-by="'" --fields-escaped-by=/ test /tmp/customer.txt
test.customer: Records: 5  Deleted: 0  Skipped: 0  Warnings: 0
The equivalent LOAD DATA INFILE
statement is the following, but we will need to empty the table again, before issuing it:
MariaDB [test]> LOAD DATA INFILE '/tmp/customer.txt'
    -> INTO TABLE test.customer
    -> FIELDS
    ->   TERMINATED BY ','
    ->   ENCLOSED BY '\''
    ->   ESCAPED BY '/';
Query OK, 5 rows affected (0.08 sec)
Records: 5  Deleted: 0  Skipped: 0  Warnings: 0
The tables created with the
CSV
engine use normal comma-separated data files, which can be used for backups or data exchange. The
CONNECT
engine is more complex: it supports several table types, each corresponding to a different data format. Supported formats include CSV
, XML
, HTML
, and data files created by dBASE
. The CONNECT
engine can even read and write data from or to a remote database server, using the native protocol if it is a MariaDB or MySQL server, or the ODBC standard for other DBMS types.
Creating a backup from a table using CSV
or CONNECT
is very simple. The next example shows how to do this with CSV
:
MariaDB [test]> CREATE TABLE customer_bkp ENGINE = CSV SELECT * FROM customer;
ERROR 1178 (42000): The storage engine for the table doesn't support nullable columns
But this did not work! The example shows that CSV
cannot be used if a table contains NULL
values. This is a very important limitation, and it is not the only one. So, we should usually prefer CONNECT
, which is much more advanced and flexible. The only reason why we might use CSV
instead is probably that CONNECT
was introduced in MariaDB 10 and cannot be installed on older versions.
Since CONNECT
is not installed by default, we may need to install it as follows:
MariaDB [test]> INSTALL SONAME 'ha_connect';
Query OK, 0 rows affected (0.00 sec)
Then, we can use it to perform the backup. We will use the CONNECT
storage engine's CSV
table type, because it is a simple and efficient way to store data. We could use more structured or exotic formats, but there is no reason to do so in this case. Pay attention to the table options in the following example:
MariaDB [test]> CREATE TABLE customer_bkp
    -> ENGINE = CONNECT
    -> TABLE_TYPE = CSV
    -> FILE_NAME = '/tmp/customer.csv'
    -> HUGE = 0
    -> COMPRESS = 1
    -> READONLY = 1
    -> DATA_CHARSET = 'utf8'
    -> SEP_CHAR = ','
    -> ENDING = 1
    -> QUOTED = 1
    -> QCHAR = '"'
    -> HEADER = 1
    -> SELECT * FROM customer;
Query OK, 0 rows affected (0.10 sec)
In this example, we used all the options that are relevant for the CSV
format. They are:
- TABLE_TYPE: As explained earlier, this indicates the data source type for the table (in this case, a CSV file).
- FILE_NAME: This indicates the name and, optionally, the path of the data file.
- HUGE: The default value is 0. If the table is bigger than 2 GB, it makes sense to inform CONNECT by setting it to 1.
- COMPRESS: Since this is a backup, we want the table to be compressed. As with InnoDB, CONNECT uses the zlib library and the LZ77 algorithm.
- READONLY: Since this is a backup, making the table read-only is much safer.
- DATA_CHARSET: This indicates the character set to be used.
- SEP_CHAR: This indicates the column separator.
- ENDING: This indicates the length of the line terminator in characters. It is 1 on Unix systems (lines end with \n) and 2 on Windows (lines end with \r\n).
- QUOTED: Strings are quoted. This could be omitted, since QCHAR is specified.
- QCHAR: This indicates the quoting character.
- HEADER: This indicates that the first row contains the column names.

Restoring the backup is really simple: we just need to delete the data file in /tmp
and replace it with the backup. No further actions are needed to let CONNECT
use the backup. Then, we can copy the backup contents into the original table using a normal INSERT … SELECT
or CREATE TABLE … SELECT
statement.
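For example, a minimal sketch of that final copy step (assuming the original customer table has been emptied first):

```sql
-- Copy the backed-up rows back into the original table
INSERT INTO customer SELECT * FROM customer_bkp;
```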