Controlling cluster rebalancing

By default, ElasticSearch tries to keep the shards and their replicas evenly balanced across the cluster. This behavior is good in most cases, but there are times when we want to control it. In Chapter 7, Administrating Your Cluster, we discussed how to take total control over how shards and replicas are distributed. In this section, we will look at how to avoid cluster rebalancing and how to control the behavior of this process in depth.

Imagine a situation where you know that your network can handle a very high volume of traffic or, on the contrary, your network is already used extensively and you want to avoid putting too much stress on it. Another example is that you may want to decrease the pressure put on your I/O subsystem after a full-cluster restart by having fewer shards and replicas initialized at the same time. These are only two examples where rebalance control may be handy.

What is rebalancing?

Rebalancing is the process of moving shards between different nodes in our cluster. As we have already mentioned, it's fine in most situations, but sometimes you may want to avoid it completely. For example, if we define how our shards are placed and we want to keep it that way, we want to avoid rebalancing. However, by default, ElasticSearch will try to rebalance the cluster whenever the cluster state changes and ElasticSearch thinks rebalancing is needed. This may happen, for example, after a full-cluster restart (during an upgrade or plugin installation) or after a node failure.

When is the cluster ready?

We already know that our indices are built of shards and replicas. Primary shards, or just shards, are the ones that are used when documents are indexed, updated, or deleted, that is, on any index change. We also have replicas, which get their data from the primary shards.

You can think of the cluster as being ready to use when all primary shards are assigned to nodes in your cluster, that is, as soon as the yellow health state is achieved. ElasticSearch may still be initializing other shards, the replicas, but you can already use your cluster: you can search your whole data set and send index change commands, and they will be processed properly.
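As a minimal sketch of checking for that state, the cluster health API can be asked to wait until the yellow status is reached. The localhost:9200 address, the 30s timeout, and the helper function names are assumptions made for this illustration only.

```shell
ES_HOST="localhost:9200"  # assumption: the address used by the other examples in this section

# Build a health-check URL that waits for the given status ($1)
# for at most the given time ($2).
health_url() {
  printf '%s/_cluster/health?wait_for_status=%s&timeout=%s\n' "$ES_HOST" "$1" "$2"
}

# Returns once the cluster is at least yellow, or when the timeout passes.
wait_for_yellow() {
  curl -XGET "$(health_url yellow 30s)"
}
```

Calling wait_for_yellow after a restart returns the health document; its timed_out field tells you whether the status was reached or the wait simply expired.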

The cluster rebalancing settings

ElasticSearch lets us control the rebalancing process with the use of a few properties that can be set in the elasticsearch.yml file or by using the ElasticSearch REST API. Since the first method is very simple, let's skip discussing it and look at the second method, the one using the REST API.

In order to set one of the properties described later, we need to use the HTTP PUT method and send a proper request to the _cluster/settings URI. However, we have two options: transient and persistent property settings.

The first one, the transient setting, will set the property only until the next restart. In order to do that, we send the following command:

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient" : {
    "PROPERTY_NAME" : "PROPERTY_VALUE"
  }
}'

As you can see, in the preceding command, we used the object named transient and we added our property definition there. This means that the property will be valid only until the next restart.

If we want our property setting to persist between restarts, instead of using the object named transient, we will use one named persistent. So, the sample command will be as follows:

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "persistent" : {
    "PROPERTY_NAME" : "PROPERTY_VALUE"
  }
}'
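Because the two commands differ only in the name of the enclosing object, they can be wrapped in a small helper. The following is a rough sketch, not part of ElasticSearch itself: the function names are hypothetical, and the localhost:9200 address is assumed from the preceding commands. After sending the update, it reads the cluster-level settings back with a GET request to the same URI so that you can confirm what was stored.

```shell
ES_HOST="localhost:9200"  # assumption: same address as in the preceding commands

# Build the request body for a single property.
# $1 is transient or persistent, $2 the property name, $3 its value.
settings_body() {
  printf '{ "%s" : { "%s" : "%s" } }\n' "$1" "$2" "$3"
}

# Send the update and then read the cluster-level settings back.
update_setting() {
  curl -XPUT "$ES_HOST/_cluster/settings" -d "$(settings_body "$1" "$2" "$3")"
  curl -XGET "$ES_HOST/_cluster/settings"
}
```

For example, update_setting transient PROPERTY_NAME PROPERTY_VALUE sends the same body as the first command shown earlier.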

And now let's look at what ElasticSearch allows us to control.

Controlling when rebalancing will start

cluster.routing.allocation.allow_rebalance allows us to specify when rebalancing will start. This property can take the following values:

  • always: Rebalancing will be started as soon as it's needed
  • indices_primaries_active: Rebalancing will be started when all primary shards are initialized
  • indices_all_active: The default value; rebalancing will be started when all shards and replicas are initialized

Please note that the described property can't be changed during runtime, so you need to set it in the configuration file.

Controlling the number of shards being moved between nodes concurrently

The cluster.routing.allocation.cluster_concurrent_rebalance property allows us to specify how many shards can be moved between nodes at once in the whole cluster. If you have a cluster that is built of many nodes, you can increase this value. This value defaults to 2.
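For instance, a transient update of this property could look roughly like the following sketch. The value 4 is purely illustrative and not a recommendation, and the variable and function names are hypothetical.

```shell
# Body of a transient update; the value 4 is only an illustration.
REBALANCE_BODY='{
  "transient" : {
    "cluster.routing.allocation.cluster_concurrent_rebalance" : 4
  }
}'

# Assumption: ElasticSearch listens on localhost:9200.
set_concurrent_rebalance() {
  curl -XPUT 'localhost:9200/_cluster/settings' -d "$REBALANCE_BODY"
}
```

Because the setting is transient, the default of 2 comes back after the next restart.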

Controlling the number of shards initialized concurrently on a single node

The cluster.routing.allocation.node_concurrent_recoveries property lets us set how many shards ElasticSearch may initialize on a single node at once. Please note that the shard recovery process is very I/O intensive, so you'll probably want to avoid too many shards being recovered concurrently. This value defaults to the same value as the previous one, 2.

Controlling the number of primary shards initialized concurrently on a single node

The cluster.routing.allocation.node_initial_primaries_recoveries property lets us control how many primary shards are allowed to be concurrently initialized on a node.
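The two per-node limits can be sent in a single request. The sketch below assumes you want to keep them across restarts, hence the persistent object. Both values are made up for illustration, and the variable and function names are hypothetical.

```shell
# Body of a persistent update; both values are illustrative only.
RECOVERY_BODY='{
  "persistent" : {
    "cluster.routing.allocation.node_concurrent_recoveries" : 1,
    "cluster.routing.allocation.node_initial_primaries_recoveries" : 2
  }
}'

# Assumption: ElasticSearch listens on localhost:9200.
limit_recoveries() {
  curl -XPUT 'localhost:9200/_cluster/settings' -d "$RECOVERY_BODY"
}
```

Lower values reduce the I/O pressure on each node at the price of a longer recovery.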

Disabling the allocation of shards and replicas

When the cluster.routing.allocation.disable_allocation property is set to true, it totally disables the allocation of primary shards and replicas. This setting only makes sense when using the REST API, because you probably don't want to start ElasticSearch without having any shards assigned to nodes.
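One way this property might be used is to switch allocation off before planned maintenance and back on afterwards. The sketch below assumes the localhost:9200 address, and the helper function names are hypothetical.

```shell
# Build a transient body that switches the allocation of shards and replicas
# off (pass true) or back on (pass false).
disable_allocation_body() {
  printf '{ "transient" : { "cluster.routing.allocation.disable_allocation" : %s } }\n' "$1"
}

# Assumption: ElasticSearch listens on localhost:9200.
set_disable_allocation() {
  curl -XPUT 'localhost:9200/_cluster/settings' -d "$(disable_allocation_body "$1")"
}
```

You would call set_disable_allocation true before the maintenance work and set_disable_allocation false once the nodes are back.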

Disabling the allocation of replicas

When the cluster.routing.allocation.disable_replica_allocation property is set to true, it totally disables the allocation of replicas, while primary shards are still allocated. This can come in handy in situations where you want to operate only on primary shards and don't want ElasticSearch to allocate replicas to nodes.

Note

If you seek more information about shard allocation and the initial shard allocation process, please refer to the Your ElasticSearch time machine and Controlling shard and replica allocation sections in Chapter 7, Administrating Your Cluster.
