When we talk about vertical scaling, we usually mean adding more resources to the server Elasticsearch is running on: more memory, a better CPU, or faster disk storage. With better machines we can expect an increase in performance; depending on our deployment and its bottlenecks, it can be a small or a large improvement. However, vertical scaling has its limitations, such as the maximum amount of physical memory your servers can take or the total memory the JVM can operate with. With large data volumes and complicated queries, you can quickly run into memory issues, and adding more memory may not help at all. In this section, we will give you general advice on where to look and what to tune when it comes to a single Elasticsearch node.
The key thing to remember when tuning your system is repeatable performance tests: ones that can be run again under the same circumstances. Once you make a change, you need to be able to see how it affects overall performance. In addition, Elasticsearch scales very well. Using that knowledge, we can run performance tests on a single machine (or a few of them) and extrapolate the results. Such observations make a good starting point for further tuning.
Also keep in mind that this section is not a deep dive into all performance-related topics; it focuses on the most common things.
Apart from all the things we will discuss in this section, there are three major, operating system related things you need to remember: the number of allowed file descriptors, the virtual memory, and avoiding swapping.
Note that the following section contains information for Linux operating systems, but similar settings can be configured on Microsoft Windows.
Let's start with the third one. Elasticsearch, and Java Virtual Machine based applications in general, don't like to be swapped; these applications work best if the operating system doesn't put the memory they use in the swap space. The reason is simple: to access swapped memory, the operating system has to read it from the disk, which is slow and hurts performance badly.
If we have enough memory, and we should have if we want our Elasticsearch instance to perform well, we can configure Elasticsearch to avoid swapping. To do that, we just need to modify the elasticsearch.yml file and include the following property:
bootstrap.mlockall: true
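After restarting the node, you can verify that memory locking actually took effect by looking at the process section of the nodes info API; this is a sketch assuming a node listening on localhost:9200:

```shell
# Query the nodes info API; the process section of the response
# contains the mlockall flag (requires a running Elasticsearch node)
curl -s 'localhost:9200/_nodes/process?pretty'
```

If mlockall is reported as false, check the memlock limits configured for the user running Elasticsearch.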
This is one of the options. The second one is to set the vm.swappiness property in the /etc/sysctl.conf file to 0 (to disable swapping completely) or to 1 (to swap only in emergencies; for kernel versions 3.5 and above).
The third option is to disable swapping by editing /etc/fstab and removing the lines that contain the word swap. The following is an example of /etc/fstab content:
LABEL=cloudimg-rootfs / ext4 defaults,discard 0 0
/dev/xvdb swap swap defaults 0 0
To disable swapping we would just remove the second line from the above contents. We could also run the following command to disable swapping:
sudo swapoff -a
However, remember that this effect won't persist across a system restart, so it is only a temporary solution.
Also, remember that if you don't have enough memory to run Elasticsearch, the operating system will just kill the process when swapping is disabled.
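As a sketch of the fstab edit described above, the swap entry can be filtered out of the file contents. The example below works on a local sample copy of the contents shown earlier; on a real system, back up /etc/fstab before modifying it:

```shell
# Build a sample file matching the /etc/fstab contents shown above
cat > fstab.sample <<'EOF'
LABEL=cloudimg-rootfs / ext4 defaults,discard 0 0
/dev/xvdb swap swap defaults 0 0
EOF

# Drop every line that mentions a swap filesystem and keep the rest
grep -v ' swap ' fstab.sample > fstab.noswap
cat fstab.noswap
```

After the filter, only the root filesystem line remains, which is exactly the state we want /etc/fstab to be in when swapping is disabled.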
Make sure the file descriptor limits are high enough for the user running Elasticsearch (when installing from the official packages, that user will be called elasticsearch). If they aren't, you may end up with problems when Elasticsearch tries to flush data and create new segments or merge segments together, which can result in index corruption.
To adjust the number of allowed file descriptors, you will need to edit the /etc/security/limits.conf file (at least on the most common Linux systems) and adjust or add an entry related to the given user (for both soft and hard limits). For example:
elasticsearch soft nofile 65536
elasticsearch hard nofile 65536
It is advised to set the number of allowed file descriptors to at least 65536, but even more may be needed, depending on your index size.
On some Linux systems, you may also need to load an appropriate limits module for the preceding setting to take effect. To load that module, you need to edit the /etc/pam.d/login file and add or uncomment the following line:
session required pam_limits.so
You can also display the number of file descriptors available to Elasticsearch by adding the -Des.max-open-files=true parameter to the Elasticsearch startup parameters. For example:
bin/elasticsearch -Des.max-open-files=true
When doing that, Elasticsearch will include information about the file descriptors in the logs:
[2015-12-20 00:22:19,869][INFO ][bootstrap ] max_open_files [10240]
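A quick way to check the limits that a shell session (and thus any process started from it) will see is the ulimit built-in. This is a sketch; the values printed depend entirely on your system configuration:

```shell
# Print the soft and hard limits on open file descriptors for this session;
# Elasticsearch should see a soft limit of at least 65536
soft_limit=$(ulimit -Sn)
hard_limit=$(ulimit -Hn)
echo "soft nofile limit: $soft_limit"
echo "hard nofile limit: $hard_limit"
```

Remember to run the check as the user that actually starts Elasticsearch, because limits in /etc/security/limits.conf are applied per user.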
Elasticsearch 2.2 uses a hybrid directory implementation, a combination of the mmapfs and niofs directories. Because of that, especially when your indices are large, you may need a lot of virtual memory on your system. By default, the operating system limits the number of memory-mapped areas a process may have, and that can cause errors when running Elasticsearch. Because of that, we recommend increasing the default value. To do so, you just need to edit the /etc/sysctl.conf file and set the vm.max_map_count property, for example, to a value of 262144.
You can also change the value temporarily by running the following command:
sysctl -w vm.max_map_count=262144
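To make the change permanent, the same property goes into /etc/sysctl.conf. As a sketch, the example below writes the line to a local sample file instead of the real configuration file:

```shell
# Write the property line to a sample file; on a real system append it to
# /etc/sysctl.conf (as root) and reload the settings with: sysctl -p
printf 'vm.max_map_count = 262144\n' > sysctl-vm.sample
cat sysctl-vm.sample
```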
Before thinking about Elasticsearch configuration, we should remember to give Elasticsearch enough memory. In general, we shouldn't give more than 50-60 percent of the total available memory to the JVM process running Elasticsearch, because we want to leave memory for the operating system and its I/O cache. However, the 50-60 percent figure is not always right. Imagine nodes with 256GB of RAM holding indices of 30GB in total; in such circumstances, even assigning more than 60 percent of the physical RAM to Elasticsearch would leave plenty for the operating system. It is also a good idea to set the Xmx and Xms properties to the same value to avoid JVM heap resizing.
Another thing to remember is the so-called compressed oops (http://docs.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html#compressedOop), the compressed ordinary object pointers. The Java virtual machine can be told to use them by adding the -XX:+UseCompressedOops switch, which allows it to use less memory to address the objects on the heap. However, this only works for heap sizes less than or equal to 31GB; going for a larger heap means no compressed oops and higher memory usage for addressing the objects on the heap.
As we know, by default the field data cache in Elasticsearch is unbounded. This can be very dangerous, especially when you are aggregating and sorting on many analyzed fields, because analyzed fields don't use doc values by default. If those fields also have high cardinality, you can run into even more trouble; by trouble we mean running out of memory.
We have two different factors we can tune to be sure that we don't run into out of memory errors. First of all, we can limit the size of the field data cache and we should do that. The second thing is the circuit breaker, which we can easily configure to just throw exceptions instead of loading too much data. Combining these two things together will ensure that we don't run into memory issues.
However, we should also remember that Elasticsearch will evict data from the field data cache when its size is not sufficient to handle the aggregation or sorting requests. This will affect query performance, because loading field data is not very efficient and is resource intensive. In our opinion, though, it is better to have slower queries than a cluster that blows up with out-of-memory errors.
The field data cache and caches in general were discussed in the Elasticsearch caches section of Chapter 9, Elasticsearch Cluster in Detail.
Whenever you plan to use sorting, aggregations, or scripting heavily, you should use doc values whenever you can. This will not only save the memory needed for the field data cache; because fewer objects are produced, it will also let the Java virtual machine work better, with lower garbage collection times. Doc values were discussed in the Mappings Configuration section of Chapter 2, Indexing Your Data.
In the Elasticsearch caches section of Chapter 9, Elasticsearch Cluster in Detail, we also discussed the indexing buffer. There are a few things we would like to mention here. First of all, the more RAM the indexing buffer gets, the more documents Elasticsearch will be able to hold in memory, so flushes to disk will happen less often and fewer segments will be created; because of that, your indexing will be faster. Of course, we don't want Elasticsearch to occupy 100 percent of the available memory. Keep in mind that the RAM buffers are set per shard, so the amount of memory used depends on the number of shards and replicas assigned to the given node and on the number of documents you index. You should set the upper limits so that your node doesn't blow up when it has multiple shards assigned.
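The indexing buffer size can be capped in elasticsearch.yml. As a sketch, the following fragment limits it to 10 percent of the heap (which happens to be the Elasticsearch default); the exact percentage you choose depends on how indexing-heavy your workload is:

```yaml
indices.memory.index_buffer_size: 10%
```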
Elasticsearch uses Lucene, and we know that by now. The thing with Lucene is that its view of the index is not refreshed when new data is indexed or segments are created. To see the newly indexed data, we need to refresh the index. By default, Elasticsearch does that once every second, and the refresh period is controlled by the index.refresh_interval property, specified per index. The lower the refresh interval, the sooner documents become visible to search operations; however, that also means Elasticsearch needs to put more resources into refreshing the index view, so indexing and searching will be slower. The higher the refresh interval, the longer you wait before data shows up in the search results, but your indexing and querying will be faster.
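For example, a write-heavy index that can tolerate some delay before documents become searchable could relax the interval. This is a sketch of an index settings fragment; the 30s value is just an assumption for illustration, not a recommendation:

```yaml
index.refresh_interval: 30s
```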
We haven't talked about thread pools until now, but we would like to mention them here. Each Elasticsearch node holds several thread pools that control the execution queues for operations such as indexing or querying. Elasticsearch uses several pools to allow control over how threads are handled and how much memory consumption is allowed for user requests.
The Java virtual machine allows applications to use multiple threads, running multiple application tasks concurrently. For more information about Java threads, refer to http://docs.oracle.com/javase/7/docs/api/java/lang/Thread.html.
There are many thread pools (we can specify the type we are configuring by specifying the type property). However, for performance, the most important are:

generic: This is the thread pool for generic operations, such as node discovery. By default, the generic thread pool is of type cached.

index: This is the thread pool used for indexing and delete operations. Its type defaults to fixed, its size to the number of available processors, and the size of its queue to 200.

search: This is the thread pool used for search and count requests. Its type defaults to fixed and its size to the number of available processors multiplied by 3 and divided by 2, with the size of its queue defaulting to 1000.

suggest: This is the thread pool used for suggest requests. Its type defaults to fixed, its size to the number of available processors, and the size of its queue to 1000.

get: This is the thread pool used for real-time get requests. Its type defaults to fixed, its size to the number of available processors, and the size of its queue to 1000.

bulk: As you can guess, this is the thread pool used for bulk operations. Its type defaults to fixed, its size to the number of available processors, and the size of its queue to 50.

percolate: This is the thread pool for percolation requests. Its type defaults to fixed, its size to the number of available processors, and the size of its queue to 1000.

Before Elasticsearch 2.1, we could control the type of each thread pool. Starting with Elasticsearch 2.1 we can no longer do that. For more information, please refer to the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/2.1/breaking_21_removed_features.html.
For example, if we want to configure the thread pool for indexing operations to have a size of 100 and a queue of 500, we would set the following in the elasticsearch.yml configuration file:
threadpool.index.size: 100
threadpool.index.queue_size: 500
Also remember that the thread pool configuration can be updated using the cluster update API. For example, like this:
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient" : {
    "threadpool.index.size" : 100,
    "threadpool.index.queue_size" : 500
  }
}'
In general, you don't need to work with the thread pools and their configuration. However, when configuring your cluster, you may want to put more emphasis on indexing or querying and, in such cases, giving more threads or larger queues to the prioritized operation may result in more resources being used for such operations.