The gateway and recovery modules

Apart from the indices and the data indexed inside them, Elasticsearch needs to hold metadata, such as the type mappings, the index-level settings, and so on. This information needs to be persisted somewhere so it can be read during cluster recovery. Of course, it could be stored only in memory, but then a full cluster restart or a fatal failure would result in this information being lost, which is not something that we want. This is why Elasticsearch introduced the gateway module. You can think of it as a safe haven for your cluster data and metadata. Each time you start your cluster, all the needed data is read from the gateway, and when you make a change to your cluster, it is persisted using the gateway module.

The gateway

The type of gateway is configured by adding the gateway.type property to the elasticsearch.yml configuration file. Currently, Elasticsearch recommends the local gateway type (gateway.type set to local), which is the default one and the only one available without additional plugins.
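Because local is the default gateway type, no configuration is strictly required, but if you want to set it explicitly, the relevant elasticsearch.yml entry looks as follows:

gateway.type: local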

The default local gateway type stores the indices and their metadata in the local file system. Unlike the other gateways, writes to this gateway are performed synchronously, so whenever a write succeeds, you can be sure that the data was written into the gateway (that is, indexed or stored in the transaction log).

Recovery control

In addition to choosing the gateway type, Elasticsearch allows us to configure when to start the initial recovery process. Recovery is the process of initializing all the shards and replicas, reading all the data from the transaction log, and applying it to the shards. Basically, it is the process needed to start Elasticsearch.

For example, let's imagine that we have a cluster that consists of 10 Elasticsearch nodes. We should inform Elasticsearch about this by setting the gateway.expected_nodes property to that value, so 10 in our case. This property tells Elasticsearch how many nodes that are eligible to hold data or to be elected as master we expect in the cluster. Elasticsearch will start the recovery process immediately when the number of nodes in the cluster reaches that value.

We would also like the recovery to start once at least six nodes have formed the cluster. To do this, we should set the gateway.recover_after_nodes property to 6. This property should be set to a value that ensures that the newest version of the cluster state snapshot will be available, which usually means that you should start recovery only when most of your nodes are available.

There is one more thing. We would like the gateway recovery process to start 5 minutes after the gateway.recover_after_nodes condition is met. To do this, we set the gateway.recover_after_time property to 5m. This property tells the gateway module how long to wait with the recovery process after the number of nodes reaches the minimum specified by the gateway.recover_after_nodes property. We may want to do this because we know that our network is quite slow and we want the node communication to be stable. Note that Elasticsearch won't delay the recovery if the number of master-eligible and data-eligible nodes that formed the cluster is equal to the value of the gateway.expected_nodes property.

The preceding property values should be set in the elasticsearch.yml configuration file. For example, if we would like to use the previously discussed values, we would end up with the following section in the file:

gateway.recover_after_nodes: 6
gateway.recover_after_time: 5m
gateway.expected_nodes: 10

Additional gateway recovery options

In addition to the mentioned options, Elasticsearch gives us an additional degree of control. These additional options are:

  • gateway.recover_after_master_nodes: This is similar to the gateway.recover_after_nodes property, but instead of taking into consideration all the nodes, it allows us to specify how many master-eligible nodes should be present in the cluster before recovery starts
  • gateway.recover_after_data_nodes: This is also similar to the gateway.recover_after_nodes property, but it allows specifying how many data nodes should be present in the cluster before recovery starts
  • gateway.expected_master_nodes: This is similar to the gateway.expected_nodes property, but instead of specifying the number of all the nodes that we expect in the cluster, it allows specifying how many master-eligible nodes we expect to be present
  • gateway.expected_data_nodes: This is also similar to the gateway.expected_nodes property, but allows specifying how many data nodes we expect to be present
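For example, a hypothetical cluster with three dedicated master nodes and seven data nodes could combine these options in the elasticsearch.yml file as follows (the node counts here are illustrative values, not recommendations):

gateway.recover_after_master_nodes: 2
gateway.recover_after_data_nodes: 5
gateway.expected_master_nodes: 3
gateway.expected_data_nodes: 7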

Indices recovery API

There is one other thing when it comes to the recovery process – the indices recovery API. It allows us to see the progress of index or indices recovery. To use it, we just need to specify the indices and use the _recovery endpoint. For example, to check the recovery process of the library index, we will run the following command:

curl -XGET 'localhost:9200/library/_recovery?pretty'

The response for the preceding command can be large and depends on the number of shards in the index and, of course, the number of indices we want to get information for. In our case, the response looks as follows (we kept the information about a single shard only to make it less extensive):

{
  "library" : {
    "shards" : [ {
      "id" : 0,
      "type" : "STORE",
      "stage" : "DONE",
      "primary" : true,
      "start_time_in_millis" : 1444030695956,
      "stop_time_in_millis" : 1444030695962,
      "total_time_in_millis" : 5,
      "source" : {
        "id" : "Brt5ejEVSVCkIfvY9iDMRQ",
        "host" : "127.0.0.1",
        "transport_address" : "127.0.0.1:9300",
        "ip" : "127.0.0.1",
        "name" : "Puff Adder"
      },
      "target" : {
        "id" : "Brt5ejEVSVCkIfvY9iDMRQ",
        "host" : "127.0.0.1",
        "transport_address" : "127.0.0.1:9300",
        "ip" : "127.0.0.1",
        "name" : "Puff Adder"
      },
      "index" : {
        "size" : {
          "total_in_bytes" : 157,
          "reused_in_bytes" : 157,
          "recovered_in_bytes" : 0,
          "percent" : "100.0%"
        },
        "files" : {
          "total" : 1,
          "reused" : 1,
          "recovered" : 0,
          "percent" : "100.0%"
        },
        "total_time_in_millis" : 1,
        "source_throttle_time_in_millis" : 0,
        "target_throttle_time_in_millis" : 0
      },
      "translog" : {
        "recovered" : 0,
        "total" : -1,
        "percent" : "-1.0%",
        "total_on_start" : -1,
        "total_time_in_millis" : 4
      },
      "verify_index" : {
        "check_index_time_in_millis" : 0,
        "total_time_in_millis" : 0
      }
    },
    ...
    ]
  }
}

In the response, we see information about each shard. For each shard, we see the type of the operation (the type property), the stage (the stage property) describing which part of the recovery process is in progress, and whether it is a primary shard (the primary property). In addition to this, we see sections about the source shard, the target shard, the index the shard is part of, information about the transaction log, and finally information about the index verification. All of this allows us to see the status of the recovery of our indices.
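If we are only interested in recoveries that are still running, the _recovery endpoint also accepts the active_only parameter, which filters out the shards whose recovery has already finished (assuming your Elasticsearch version supports it):

curl -XGET 'localhost:9200/library/_recovery?active_only=true&pretty'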

Delayed allocation

We already discussed that, by default, Elasticsearch tries to balance the shards in the cluster according to the number of nodes in that cluster. Because of that, when a node (or multiple nodes) drops off the cluster, or when nodes join the cluster, Elasticsearch starts rebalancing the cluster, moving the shards and the replicas around. This is usually very expensive – new primary shards may be promoted out of the available replicas, large amounts of data may be copied between the new primaries and their replicas, and so on. And all this may happen because a single node was restarted for just 30 seconds of maintenance.

To avoid such situations, Elasticsearch gives us the possibility to control how long to wait before beginning the allocation of shards that are in the unassigned state. We can control the delay by using the index.unassigned.node_left.delayed_timeout property and setting it on a per-index basis. For example, to configure the allocation timeout for the library index to 10 minutes, we run the following command:

curl -XPUT 'localhost:9200/library/_settings' -d '{
 "settings": {
  "index.unassigned.node_left.delayed_timeout": "10m"
 }
}'

We can also configure the allocation timeout for all the indices by running the following command:

curl -XPUT 'localhost:9200/_all/_settings' -d '{
 "settings": {
  "index.unassigned.node_left.delayed_timeout": "10m"
 }
}'
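To verify that the property was applied, we can read the index settings back using the standard settings retrieval endpoint:

curl -XGET 'localhost:9200/library/_settings?pretty'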

Index recovery prioritization

Elasticsearch 2.2 exposes one more feature related to the indices recovery process: it allows us to define which indices should be prioritized during recovery. By specifying the index.priority property in the index settings and assigning it a positive integer value, we define the order in which Elasticsearch should recover the indices; the ones with the higher index.priority value will be recovered first.

For example, let's assume that we have two indices, library and map, and we want the library index to be recovered before the map index. To do this, we will run the following commands:

curl -XPUT 'localhost:9200/library/_settings' -d '{
 "settings": {
  "index.priority": 10
 }
}'
curl -XPUT 'localhost:9200/map/_settings' -d '{
 "settings": {
  "index.priority": 1
 }
}'

We assigned a higher priority to the library index and, because of that, it will be recovered first.
