Horizontal expansion

Elasticsearch is a highly scalable search and analytics platform. We can scale it both horizontally and vertically. We discussed how to tune a single node in the Preparing a single Elasticsearch node section earlier in this chapter, and now we would like to focus on horizontal scaling: how to handle multiple nodes in the same cluster, what roles they should have, and how to tune the configuration to get a highly reliable, available, and fault tolerant cluster.

You can think of vertical scaling as building a skyscraper: we have limited space available and we need to go as high as we can. Of course, that is expensive and requires a lot of engineering done right. On the other hand, we have horizontal scaling, which is like having many houses in a residential area. Instead of investing in powerful machines, we choose to have multiple machines and split our data between them. Horizontal scaling gives us virtually unlimited scaling possibilities. Even with the most powerful hardware, the time comes when a single machine is not enough to handle the data, the queries, or both. In such cases, spreading the data among multiple servers is what saves us and allows us to have terabytes of data in multiple indices spread across the whole cluster, just like the one in the following image:

Horizontal expansion

We have a four-node cluster with the library index created and built of four shards.

If we want to increase the querying capabilities of our cluster, we can just add additional nodes, for example, four of them. After adding new nodes to the cluster, we can either create new indices that are built of more shards to spread the load more evenly, or add replicas to the already existing shards. Both options are viable; we need one of them because it is not possible to split shards or add more primary shards to an existing index. We should go for more primary shards when our hardware is not able to handle the amount of data it holds; in such cases, we usually run into out of memory situations, long shard query execution times, swapping, or high I/O waits. The second option, that is, having replicas, is the way to go when our hardware handles the data comfortably, but the traffic is so high that the nodes just can't keep up.
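
For example, if we decided on the first option, a new index built of more primary shards could be created like this (a minimal sketch; the library_v2 index name and the shard count are only illustrative and would have to match your own data layout):

curl -XPUT 'localhost:9200/library_v2/' -d '{
 "settings" : {
  "index" : {
   "number_of_shards" : 8,
   "number_of_replicas" : 0
  }
 }
}'

The data would then have to be reindexed into the new index because, as we said, the number of primary shards of an existing index can't be changed.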

The first option is simple, but let's look at the second case: having more replicas. With four additional nodes, our cluster would look as follows:

Horizontal expansion

Now, let's run the following command to add a single replica:

curl -XPUT 'localhost:9200/library/_settings' -d '{
 "index" : {
  "number_of_replicas" : 1
 }
}'

Our cluster view would look more or less as follows:

Horizontal expansion

As you can see, each of the initial shards building the library index has a single replica stored on another node. The nice thing about shards and their replicas is that Elasticsearch is smart enough to balance the shards of a single index and put them on separate nodes; for example, you won't ever end up in a situation where a shard and its replica live on the same node. Also, Elasticsearch is able to round-robin the queries between the shards and their replicas, which means that all the nodes will be hit by queries and we don't have to take care of that ourselves. Because of that, we are able to handle almost double the query load compared to our initial deployment.
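
If you are curious where the shards and their replicas actually ended up, you can check the allocation with the cat API. A quick look, assuming the library index from our example:

curl -XGET 'localhost:9200/_cat/shards/library?v'

The output lists each shard, whether it is a primary or a replica, and the node it is allocated to.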

Automatically creating the replicas

Let's stay with replicas a bit longer. Elasticsearch allows us to automatically expand replicas when the cluster is big enough. This means that replicas can be created automatically as new nodes are added to the cluster. You may wonder where such functionality could be useful. Imagine a situation where you have a small index that you would like to be present on every node, so that your plugins don't have to run distributed queries just to get data from it. In addition to that, your cluster is dynamically changing; that is, you add and remove nodes from it. The simplest way to achieve such functionality is to allow Elasticsearch to automatically expand the replicas. To do that, we need to set index.auto_expand_replicas to 0-all, which means that the index can have zero replicas or be present on all the nodes. So, if our small index is called shops and we would like Elasticsearch to automatically expand its replicas to all the nodes in the cluster, we would use the following command to create the index:

curl -XPOST 'localhost:9200/shops/' -d '{
 "settings" : {
  "index" : {
   "auto_expand_replicas" : "0-all"
  }
 }
}'

We can also update the settings of that index, if it already exists, by running the following command:

curl -XPUT 'localhost:9200/shops/_settings' -d '{
 "index" : {
  "auto_expand_replicas" : "0-all"
 }
}'
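
To confirm that the setting is in place, we can always read the index settings back:

curl -XGET 'localhost:9200/shops/_settings?pretty'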

Redundancy and high availability

The Elasticsearch replication mechanism not only gives us the ability to handle higher query throughput, but also gives us redundancy and high availability. Imagine an Elasticsearch cluster hosting a single index called library that is built of two shards and no replicas. Such a cluster would look as follows:

Redundancy and high availability

Now, what happens when one of the nodes fails? The simplest answer is that we lose about 50 percent of the data and, if the failure is fatal, we lose that data forever. Even if we have backups, we would need to spin up another node and restore the backup, and that takes time. During that time, your application, or the parts of it that are based on Elasticsearch, can't work correctly. If your business relies on Elasticsearch, downtime means losing money. Of course, we can use replicas to create more reliable clusters that can handle hardware and software failures. One thing to remember is that everything fails eventually; if the software doesn't, the hardware will. For example, some time ago Google said that in each of its clusters at least 1,000 machines will fail during the first year (you can read more on that topic at http://www.cnet.com/news/google-spotlights-data-center-inner-workings/). Because of that, we need to be ready to handle such cases.

Let's look at the same cluster but with one replica:

Redundancy and high availability

Now, losing a single Elasticsearch node means that we still have all the data available and we can work on restoring the full cluster structure without downtime. Of course, this is only a very small cluster built of two Elasticsearch nodes. The larger the cluster and the more replicas you have, the more failures you will be able to handle without worrying about data loss. Of course, you will see lower performance, depending on the percentage of nodes that fail, but the data will still be there and the cluster will remain operational.
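
The easiest way to see how the cluster copes with such a failure is the cluster health API, for example:

curl -XGET 'localhost:9200/_cluster/health?pretty'

After losing a node in the one-replica scenario, the status reported there will typically drop from green to yellow while Elasticsearch recreates the missing replicas, instead of going red, which would mean that some primary shards are not available at all.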

That's why, when designing your architecture and deciding on the number of nodes, the indices, and their layout, you should take into consideration how many node failures you want to be able to live with. Of course, you can't forget about the performance part of the equation, but redundancy and high availability should be among the factors of the scaling equation.

Cost and performance flexibility

The distributed-by-default nature of Elasticsearch and its ability to scale horizontally allow us to be flexible when it comes to the performance and cost of running our environment. First of all, high-end servers with high-performance disks, numerous CPU cores, and a lot of RAM are still expensive. In addition to that, cloud computing is getting more and more popular, and if you need a lot of flexibility and don't want to own your hardware, you can choose solutions such as Amazon (http://aws.amazon.com/), Rackspace (http://www.rackspace.com/), DigitalOcean (https://www.digitalocean.com/), and so on. They not only allow us to run our software on rented machines, but also allow us to scale on demand. We just need to add more machines, which is a few clicks away or can even be automated with some degree of work.

Using a hosted solution with one-click machine renting gives us a truly horizontally scalable setup. Of course, that's not cheap; you pay for the flexibility. But we can easily sacrifice some performance if cost is the most crucial factor in our business plan. Of course, we can also go the other way. If we can afford large bare metal machines, Elasticsearch clusters can be pushed to hundreds of terabytes of data stored in the indices and still deliver decent performance (of course, with proper hardware and properly distributed data).

Continuous upgrades

High availability, cost and performance flexibility, and virtually endless growth are not the only things worth talking about when discussing the scalability side of Elasticsearch. At some point in time, you will want to have your Elasticsearch cluster upgraded to a new version. It may be because of bug fixes, performance improvements, new features, or anything else you can think of. The thing is that when you have a single instance of each shard, without replicas, an upgrade means unavailability of Elasticsearch (or at least parts of it), and that may mean downtime of the applications that use Elasticsearch. This is another reason why horizontal scaling is so important; you can perform rolling upgrades, at least between versions where Elasticsearch supports them. For example, you can take Elasticsearch 2.0 and upgrade to Elasticsearch 2.1 with only rolling restarts (taking one node out of the cluster, upgrading it, bringing it back, and continuing with the next node until all the nodes are done), thus keeping all the data available for searching and indexing at the same time.
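
The exact rolling restart procedure depends on the Elasticsearch version but, as a sketch, it usually involves temporarily disabling shard allocation before stopping each node, so that the cluster doesn't start rebalancing shards while the node is away:

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
 "transient" : {
  "cluster.routing.allocation.enable" : "none"
 }
}'

After the upgraded node is back in the cluster, the same command with the value set to "all" turns allocation back on, and we can move to the next node.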

Multiple Elasticsearch instances on a single physical machine

Having a large physical machine with a lot of memory and CPU cores has advantages and some challenges. First of all, if you decide to run a single Elasticsearch node on that machine, you will sooner or later run into garbage collection issues, and you will have lots of shards on a single node, which will require a high number of I/O operations for internal Elasticsearch communication (such as retrieving cluster statistics), and so on. What's more, you usually shouldn't go above 31GB of heap memory for a single JVM process, because beyond that you can't use compressed ordinary object pointers (https://docs.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html).

In such cases, you can either run multiple Elasticsearch instances on the same bare metal machine, run multiple virtual machines with a single Elasticsearch instance inside each one, or run Elasticsearch in containers, such as Docker (http://www.docker.com/). This is out of the scope of this book but, because we are talking about scaling, we thought it might be good to mention what can be done in such cases.
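
For example, a single Elasticsearch 2.x node can be started in a container with a command similar to the following one (a sketch, assuming the official elasticsearch image from Docker Hub; the container name, ports, and version tag are illustrative):

docker run -d --name es_node_1 -p 9200:9200 -p 9300:9300 elasticsearch:2.1

Each additional instance would get its own container, its own published ports, and, ideally, its own data volume.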

Note

There is also the possibility of running multiple Elasticsearch servers on a single physical machine without running multiple virtual machines. Which road to take, virtual machines or multiple instances, is really your choice. However, we like to keep things separate, and because of that we usually go for dividing any large server into multiple virtual machines. When dividing one large server into multiple smaller virtual machines, remember that the I/O subsystem will be shared across those virtual machines. Because of that, it may be wise to carefully divide the disks between the virtual machines.

Preventing a shard and its replicas from being on the same node

There is one additional thing worth mentioning. When you have multiple physical servers divided into virtual machines, it is crucial to ensure that a shard and its replica don't end up on the same physical machine. By default, Elasticsearch is smart enough not to put a shard and its replica on the same Elasticsearch instance, but it doesn't know anything about the underlying bare metal machines, so we need to tell it about them. We can tell Elasticsearch to separate shards and replicas by using cluster allocation awareness. In our previous case, we had three physical servers. Let's call them server1, server2, and server3.

Now, for each Elasticsearch node on a physical server, we define the node.server_name property and set it to the identifier of that server (the name of the property can be anything we want). So, for example, for all Elasticsearch nodes on the first physical server, we would set the following property in the elasticsearch.yml configuration file:

node.server_name: server1

In addition to that, each Elasticsearch node (no matter on which physical server) needs to have the following property added to the elasticsearch.yml configuration file:

cluster.routing.allocation.awareness.attributes: server_name

This tells Elasticsearch not to put a primary shard and its replicas on nodes with the same value in the node.server_name property. This is enough for us; Elasticsearch will take care of the rest.
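
Putting it together, a node running on, for example, the second physical server would have both properties present in its elasticsearch.yml file:

node.server_name: server2
cluster.routing.allocation.awareness.attributes: server_name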

Designated node roles for larger clusters

There is one more thing that we want to discuss and emphasise. When it comes to large clusters, it is important to assign roles to all the nodes in the cluster. This allows for a truly fault tolerant and highly available Elasticsearch cluster. The roles we can assign to each Elasticsearch node are as follows:

  • Master eligible node
  • Data node
  • Query aggregator node

By default, each Elasticsearch node is master eligible (it can serve as the master node), can hold data, and can work as a query aggregator node. You may wonder why separating these roles is needed. Let us give you a simple example: if the master node is under a lot of stress, it may not be able to handle cluster state related commands fast enough, and the cluster could become unstable. This is only a single, simple example, and you can think of numerous others.

Because of that, most Elasticsearch clusters that are larger than a few nodes usually look like the one presented in the following picture:

Designated node roles for larger clusters

As you can see, our hypothetical cluster contains three client nodes (because we know that there will be a lot of queries), a large number of data nodes (because the amount of data will be large), and at least three master eligible nodes that shouldn't be doing anything else. Why three master eligible nodes when Elasticsearch will only use a single one at any given time? Again, because of redundancy and the ability to prevent split-brain situations by setting discovery.zen.minimum_master_nodes to 2, which allows us to handle the failure of a single master eligible node in the cluster.
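
For the cluster described above, with three master eligible nodes, that translates to the following line in the elasticsearch.yml file of the master eligible nodes:

discovery.zen.minimum_master_nodes: 2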

Let us now give you snippets of the configuration for each type of node in our cluster. We already talked about that in the Understanding node discovery section in Chapter 9, Elasticsearch Cluster in Detail, but we would like to mention that once again.

Query aggregator nodes

The query aggregator nodes configuration is quite simple. To configure those, we just need to tell Elasticsearch that we don't want those nodes to be master eligible or to hold data. This corresponds to the following configuration snippets in the elasticsearch.yml file:

node.master: false
node.data: false

Data nodes

Data nodes are also very simple to configure. We just need to tell Elasticsearch that they should not be master eligible. However, we are not big fans of relying on default configurations (because defaults tend to change), and thus our Elasticsearch data node configuration looks as follows:

node.master: false
node.data: true

Master eligible nodes

We've left the master eligible nodes to the end of the general scaling section. Of course, such Elasticsearch nodes shouldn't be allowed to hold data but, in addition to that, it is a good practice to disable the HTTP protocol on them. This is done to avoid accidentally querying those nodes. Master eligible nodes can use fewer resources than data and query aggregator nodes, and because of that we should ensure that they are used for master-related purposes only. So, our configuration for master eligible nodes looks more or less as follows:

node.master: true
node.data: false
http.enabled: false