The FEDERATEDX
and
CONNECT
storage engines allow us to use a remote table as if it were located in the local server. Local FEDERATEDX
and CONNECT
tables act as a proxy between the client and the remote server. When the client sends a SQL statement to the tables, the tables send it to the remote server; when the remote server returns some results, the tables send the result to the client.
This is not the most advanced solution to share data between multiple servers. The SPIDER
storage engine has more features and optimizations, as explained in the next section. However, FEDERATEDX
and CONNECT
may have some advantages.
MariaDB Knowledge Base explains that the initial version of FEDERATEDX
has been developed for Cisco. Its devices did not have much storage space, so they needed a MySQL storage engine to access remote data. This engine at the time was called FEDERATED
, and was included in MySQL 5.0. More features were added in MySQL 5.1 to make it more usable for the majority of users. Since MariaDB developers thought that Oracle did not invest enough resources to maintain FEDERATED
, they created a fork called FEDERATEDX
, which is included in MariaDB. The author of this fork is the original developer of FEDERATED
. The fork is fully compatible, but it contains more bug fixes and features. The most notable features are transactions (when the underlying table supports them) and the support for the ODBC protocol, to connect to database systems other than MariaDB or MySQL.
The CONNECT
storage engine has already been mentioned in several chapters of this book. It basically allows the user to access a wide variety of external data sources as if they were MariaDB tables. This includes non-relational data sources, such as the text files written in several formats; CSV, XML, HTML and INI are just some examples, and the files can also be compressed with gzip. Even more exotic data sources are also supported; for example, on Windows, even directories and MAC network addresses can be read as tables. One of the supported data sources is a connection to a remote database server. Specific MariaDB/MySQL protocols are supported, as well as the generic ODBC standard. The CONNECT
storage engine was then included in MariaDB 10.0.
While both FEDERATEDX
and CONNECT
are able to transparently use a remote MariaDB or MySQL table or view, their different histories determine different features and optimizations.
Some common features of FEDERATEDX
and CONNECT
are as follows:
CONNECT
supports virtual columns.For backward compatibility, the FEDERATEDX
storage engine must be referred to as FEDERATED
in MariaDB. It is a built-in plugin, so it need not be installed and cannot be uninstalled.
Let's see how to create the FEDERATEDX
table with some examples. First, we need to create the underlying table on a server, which we will call remote. We can use any storage engine. Consider the following example:
remote> CREATE TABLE db1.user ( -> id INTEGER UNSIGNED NOT NULL AUTO_INCREMENT, -> username VARCHAR(50), -> password VARCHAR(50), -> PRIMARY KEY (id) -> ) -> ENGINE = InnoDB; Query OK, 0 rows affected (0.47 sec)
Then, we will create FEDERATEDX
on another server. We will do this on a server called local
, shown as follows:
local> CREATE TABLE test.user_fed -> ENGINE = FEDERATED -> CONNECTION = 'mysql://user1:pwd@remote_server/db1/user'; Query OK, 0 rows affected (0.20 sec)
We did not provide the structure of the table, so it will be identical to the structure of the remote table. The CONNECTION
table option specifies the information necessary to connect to the remote table. The string passed to this option is called a connection string and it contains the following information:
mysql://
user1
pwd
remote_server
db1
user
If not specified, the database name and table name are the same as the ones used for the FEDERATEDX
table.
With very old versions of the FEDERATED
storage engine, the COMMENT
table option was used instead of CONNECTION
to store the connection data. It was sort of a trick implemented to work around issues with the storage engine's architecture. As a side effect, the user could not associate a comment to the FEDERATED
tables. In modern versions of both FEDERATEDX
and FEDERATED
, the CONNECTION
option must be used.
The syntax shown previously to create a FEDERATEDX
table is convenient when we want to access only one table on a remote server. However, if a remote server contains several tables that we want to access, we do not want to repeat the same long connection string for each table.
A good solution is to define a link to a remote server or, more specifically, a remote database. This can be done using the CREATE SERVER
statement. A link created in this way can be used with all the storage engines that allow access to tables from remote servers:
FEDERATED
and FEDERATEDX
storage enginesCONNECT
storage engineSPIDER
storage engineHere's an example of CREATE SERVER
usage:
CREATE SERVER srv1 FOREIGN DATA WRAPPER `mysql` OPTIONS ( USER 'root', PASSWORD 'root', HOST '127.0.0.1', DATABASE 'remote_server' );
Note that there is no way to specify a table name.
We can now create a new table using this link:
CREATE TABLE test.t ENGINE = FEDERATED CONNECTION 'srv1/user';
The connection string, in this case, is the link name followed by the table name. If the table name is omitted, the remote table name is assumed to be equal to the new table name.
Links are stored in the tables present on the servers, in the mysql
system database. Like all tables in the mysql
database, servers should not be directly modified, but queried to examine the existing links, shown as follows:
MariaDB [test]> SELECT * FROM mysql.servers WHERE Server_name = 'srv1' G *************************** 1. row *************************** Server_name: srv1 Host: remote_server Db: db1 Username: root Password: root Port: 0 Socket: Wrapper: mysql Owner: 1 row in set (0.00 sec)
Note that the only supported Wrapper
is mysql
, and the Owner
property is currently not supported.
The following statement can be used to drop a link:
DROP SERVER IF EXISTS srv1;
The existing links cannot be edited, but they can be dropped and recreated. Note that dropping or recreating them has no effect on the existing tables. To update a table definition, it is necessary to drop and recreate it.
The
CONNECT
storage engine supports several table types. Each of them allows us to use a different type of data source. The data sources that allow us to communicate the remote DBMSs are MYSQL
and ODBC
. The MYSQL
type works for both MariaDB and MySQL
servers, using the native protocol. The ODBC
type works for any DBMS that supports the ODBC
standard. In this chapter, we will only discuss the MYSQL
type.
The syntax used to create a FEDERATEDX
table can also be used to create a MYSQL CONNECT
table, except that no indexes can be defined:
CREATE TABLE test.user ( id INTEGER UNSIGNED NOT NULL, username VARCHAR(50), password VARCHAR(50) ) ENGINE = CONNECT TABLE_TYPE = MYSQL CONNECTION = 'mysql://user1:pwd@remote_server/db1/user';
Similar to FEDERATEDX
, CONNECT
requires the connection string to be specified in the CONNECTION
option, not in COMMENT
.
A server name can be specified instead of a complete connection string:
CREATE TABLE test.user ( id INTEGER UNSIGNED NOT NULL, username VARCHAR(50), password VARCHAR(50) ) ENGINE = CONNECT TABLE_TYPE = MYSQL CONNECTION = 'srv1';
The CONNECT
storage engine also supports the DBNAME
and TABNAME
table options, which specify a remote database name and a remote table name. They can be combined with a connection string, shown as follows:
CREATE TABLE test.user ( id INTEGER UNSIGNED NOT NULL, username VARCHAR(50), password VARCHAR(50) ) ENGINE = CONNECT TABLE_TYPE = MYSQL DBNAME = 'test' TABNAME = 'user' CONNECTION = 'mysql://user1:pwd@remote_server/';
If a database name and table name are provided in the connection string and via the DBNAME
and TABNAME
options, the values specified in the connection string will be used.
A view name can be specified instead of a table name, using any of the preceding syntaxes. It is also possible to specify a query, which will be used as an unnamed view:
CREATE TABLE test.user ( id INTEGER UNSIGNED NOT NULL, username VARCHAR(50), password VARCHAR(50) ) ENGINE = CONNECT TABLE_TYPE = MYSQL SRCDEF = 'SELECT * FROM db1.user WHERE id > 1000000' CONNECTION = 'mysql://user1:pwd@remote_server/';
The CONNECT
storage engine allows us to send arbitrary SQL statements to a remote server. This is mainly useful for sending administrative commands or creating new tables remotely without directly accessing a remote MariaDB server.
To be able to directly send statements to a remote server, a special CONNECT
table must be created. Here is an example:
CREATE TABLE srv1_sql ( statement VARCHAR(128) NOT NULL FLAG = 0, number INTEGER NOT NULL FLAG = 1, message VARCHAR(255) FLAG = 2, warnings INTEGER NOT NULL FLAG = 3 ) ENGINE = CONNECT, TABLE_TYPE = MYSQL, CONNECTION 'srv1', OPTION_LIST = 'Execsrc=1,Maxerr=10';
The table and column names are not relevant and can be defined arbitrarily. We used the name of a defined server followed by the _sql
suffix. This seems quite logical, if we decide to define special CONNECT
tables to send statements to all the remote servers. This can be useful even if we decide to use FEDERATEDX
or SPIDER
to access the remote tables. However, we must be cautious while setting the permissions for these special tables. Usually, they should only be accessible by root, or other users who have the SUPER
privilege.
The OPTION_LIST
table option is what makes CONNECT
aware of the purpose of this special table. The key option is Execsrs
. The Maxerr
option can be used to set the maximum number of errors and warnings that can be received from the remote server, in response to each statement we send.
Each column in the table has a special meaning. While their meanings are made clearer for us if we use descriptive names, the purpose of the columns is defined using FLAG
, a column option
specific to CONNECT
. For example, renaming the message
column will not alter its purpose, as long as the FLAG
option remains unchanged. The following table shows the meanings of the various FLAG
values:
FLAG value |
Column meaning |
---|---|
|
Statement to be executed |
|
Number of rows affected by a write statement, or returned by a query |
|
An informative message |
|
Number of errors and warnings |
The default value for the FLAG
option is 0
, so FLAG=0
can always be omitted in the CREATE TABLE
statement.
The following example shows how to execute a statement remotely using the previously created table:
MariaDB [test]> SELECT * FROM srv1_sql WHERE statement = ' '> CREATE OR REPLACE TABLE db1.new_table (id INT) ENGINE = InnoDB '> '; +------------------------------------------------------------------+--------+---------------+----------+ | statement | number | message | warnings | +------------------------------------------------------------------+--------+---------------+----------+ | CREATE OR REPLACE TABLE db1.new_table (id INT) ENGINE = InnoDB | 0 | Affected rows | 0 | +------------------------------------------------------------------+--------+---------------+----------+ 1 row in set (0.85 sec)
The executed statement has been sent via the WHERE
condition and then returned in the statement column. We know that the statement succeeded because the warnings
value is 0
. The message
and number
columns tell us that the number of rows affected by CREATE TABLE
is 0
; this is normal, because CREATE TABLE
always returns this value.
Just to verify that the CREATE TABLE
statement succeeded on the remote server, we will now create a CONNECT
table that is linked to it:
MariaDB [test]> CREATE TABLE new_table ( -> id INT -> ) -> ENGINE = CONNECT, -> TABLE_TYPE = MYSQL, -> CONNECTION 'srv1/new_table'; Query OK, 0 rows affected (0.13 sec) MariaDB [test]> SELECT COUNT(*) FROM new_table; +----------+ | COUNT(*) | +----------+ | 0 | +----------+ 1 row in set (0.00 sec)
It is also possible to send more than one statement to the remote server using a single SELECT
statement against the CONNECT
table. This can be done using the IN
operator. In this case, a row will be returned for each executed statement:
MariaDB [test]> SELECT * FROM srv1_sql WHERE statement IN ( -> 'SET @@global.innodb_file_per_table = 1', -> 'SET @@global.innodb_strict_mode = 1' -> ); +----------------------------------------+--------+---------------+----------+ | statement | number | message | warnings | +----------------------------------------+--------+---------------+----------+ | SET @@global.innodb_file_per_table = 1 | 0 | Affected rows | 0 | | SET @@global.innodb_strict_mode = 1 | 0 | Affected rows | 0 | +----------------------------------------+--------+---------------+----------+ 2 rows in set (0.00 sec)
By default, only fatal errors are returned, and they are contained in the message (FLAG=2
) column:
MariaDB [test]> SELECT * FROM srv1_sql WHERE statement = 'SET @@global.performance_schema = 1' G *************************** 1. row *************************** statement: SET @@global.performance_schema = 1 number: 1238 message: Remote: Variable 'performance_schema' is a read only variable warnings: 0 1 row in set (0.23 sec)
In most situations, this is enough, but there are cases in which we prefer to assess each warning in order to debug messages. To do this, there are three special commands that we can send to the special table as if they were normal SQL statements. The CONNECT
storage engine will parse those commands and it will not send them to the remote server. Examples of such commands are:
Note
Warning
Error
They tell CONNECT
to show notes, warnings, and errors in the result set of the current SELECT
statement. For example:
MariaDB [test]> SELECT * FROM srv1_sql WHERE statement IN ( -> 'Note','Warning', 'Error', -> 'DROP TABLE IF EXISTS t1', -> 'Note','Warning', 'Error', -> 'CREATE TABLE t1 (c INT) ENGINE = MySAM' -> ); +----------------------------------------+--------+--------------------------------------------+----------+ | statement | number | message | warnings | +----------------------------------------+--------+--------------------------------------------+----------+ | DROP TABLE IF EXISTS t1 | 0 | Affected rows | 1 | | Note | 1051 | Unknown table 'db1.t1' | 0 | | CREATE TABLE t1 (c INT) ENGINE = MySAM | 0 | Affected rows | 2 | | Warning | 1286 | Unknown storage engine 'MySAM' | 0 | | Warning | 1266 | Using storage engine InnoDB for table 't1' | 0 | +----------------------------------------+--------+--------------------------------------------+----------+ 5 rows in set (0.42 sec)
In the preceding example, the first Note
informs us that the table we tried to DROP does not exist. More importantly, the last two warnings tell us that we mistyped MyISAM, and thus, the InnoDB storage engine has been used instead.
This technique does not provide a direct way to retrieve a result set from a remote server. However, this can be done with an easy procedure:
CREATE TABLE … SELECT
statement to the remote serverCONNECT
table that points to the new remote tableNote that the local table must be recreated every time the remote table changes, unless its columns remain the same.
The following example shows how to retrieve the remote server version:
MariaDB [test]> SELECT * FROM srv1_sql WHERE statement = ' '> CREATE OR REPLACE TABLE a.output AS SELECT version(), @@global.version_comment '> '; +----------------------------------------------------------------------------------+--------+---------------+----------+ | statement | number | message | warnings | +----------------------------------------------------------------------------------+--------+---------------+----------+ | CREATE OR REPLACE TABLE a.output AS SELECT version(), @@global.version_comment | 1 | Affected rows | 0 | +----------------------------------------------------------------------------------+--------+---------------+----------+ 1 row in set (0.75 sec) MariaDB [test]> CREATE OR REPLACE TABLE output -> ENGINE = CONNECT, -> TABLE_TYPE = MYSQL, -> CONNECTION 'srv1/output'; Query OK, 0 rows affected (0.09 sec) MariaDB [test]> SELECT * FROM output; +---------------------+--------------------------+ | version() | @@global.version_comment | +---------------------+--------------------------+ | 10.0.11-MariaDB-log | MariaDB Server | +---------------------+--------------------------+ 1 row in set (0.25 sec)
The TBL
table is another useful table type. It represents a collection of identical (or very similar) CONNECT
tables. Such a collection can be queried as if it were a single table. Used with the MYSQL
table type, TBL
allows us to implement a simple form of data sharding over several servers. It is possible to define a local set of CONNECT MYSQL
tables, which are linked to several remote tables spread on multiple servers. A table of type TBL
can then be built on such MYSQL
tables. Queries against the TBL
table will be sent to the MYSQL
tables, and data will be retrieved from all the relevant remote servers.
Note that the TBL
table type has an important limitation: it is read-only. Data in the underlying tables cannot be modified through TBL
, which makes it unusable in many situations. However, it can still be an option when we want to shard read-only data through several MariaDB servers.
The syntax to create a TBL
table is as follows:
CREATE TABLE <table_name> ENGINE = CONNECT, TABLE_TYPE = TBL, TABLE_LIST = '<table_list>';
The <table_list>
variable is a comma-separated list of CONNECT
table names. Each name can optionally be specified as database_name.table_name
. If the database name is omitted, the table is assumed to be located in the same database as the TBL
table.
While a TBL
table, technically, can only be built on CONNECT
tables, it can still be indirectly linked to tables that use different storage engines, such as InnoDB or MyISAM. This is done by creating MYSQL
tables that point to such tables and build a TBL
table on them. The result will be slower than a direct access because a connection to the local server will be used. However, it should still be faster than connecting a remote server. This technique is useful if we want to distribute data over multiple servers, including the local server.
The CONNECT
engine can solve an important performance issue that cannot be solved easily using other storage engines, such as FEDERATEDX
or SPIDER
. Imagine that we work for a company that has several physical stores. Each store has its own database, which contains, among other data, the stocked products, product categories, and recent transactions. Assume that we are asked to write a query, which returns the average quantity of phones sold last week in all the stores. To do this, we need to perform JOIN
between three tables: product_category
, product
, and transaction
.
However, this JOIN
must be executed separately for each store; comparing the transactions of one store with the products of another store may or may not lead to wrong results, depending on how the system is designed, but it would require a huge amount of network traffic and would be way too slow. To execute the JOIN
query on a store basis, we can create a MYSQL CONNECT
table for each store; this table will be based on that query, using the SRCDEF
option as shown in the following code. Such tables, when queried, will return the associations between categories, products, and transactions for each store. Then, we can build a TBL
table on top of these MYSQL
tables, and perform a query that returns the desired average. Only the relevant rows will be sent from the databases in the stores to the local MariaDB server.
A special syntax can be used to achieve the same result without defining a new CONNECT
table for each remote server. The SRCDEF
option can be specified directly in the TBL
table definition, and a list of remote servers can be provided in the TABLE_LIST
option:
CREATE TABLE <table_name> ENGINE = CONNECT, TABLE_TYPE = TBL, SRCDEF = '<query>', TABLE_LIST = '<server_list>';