Elasticsearch caches

Until now we haven't said much about Elasticsearch caches in this book. However, like most complex systems, Elasticsearch uses a variety of caches to handle more complicated operations and to speed up heavy data retrieval from the disk-based Lucene indices. In this section, we will look at the most common Elasticsearch caches, what they are used for, what the performance implications of using them are, and how to configure them.

Fielddata cache

In the beginning of the book, we discussed that Elasticsearch uses the so-called inverted index data structure to search through documents quickly and efficiently. This works very well when searching and filtering the data, but for features such as aggregations, sorting, or script usage, Elasticsearch needs an un-inverted data structure, because these features rely on per-document data.

Because of this need for un-inverted data, Elasticsearch has contained, since its first release, an in-memory data structure called fielddata. Fielddata stores all the values of a given field in memory to provide very fast document-based lookups. However, the cost of using fielddata is memory usage and increased garbage collection. Because of this memory and performance cost, starting from Elasticsearch 2.0, every indexed, not analyzed field uses doc values by default. Other fields, such as analyzed text fields, still use fielddata, so it is good to know how to handle it.

Fielddata size

Elasticsearch allows us to control how much memory the fielddata cache uses. By default, the cache is unbounded, which is very dangerous. If you have large indices, you may run into memory issues, where the fielddata cache eats most of the memory given to Elasticsearch and causes node failures. We can configure the size of the fielddata cache by using the static indices.fielddata.cache.size property, set either to an explicit value (like 10gb) or to a percentage of the memory given to Elasticsearch (like 20%).
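
For example, a minimal sketch of limiting the fielddata cache could look as follows; because the setting is static, it goes into the elasticsearch.yml file of each node (the 20% value is only an illustration):

indices.fielddata.cache.size: 20%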

Remember that the fielddata cache is very expensive to build, as it needs to load all the values of a given field into memory. This can take a lot of time and degrade query performance. Because of this, it is advised to have enough memory to keep the needed cache permanently in Elasticsearch memory. However, we understand that this is not always possible because of hardware costs.

Circuit breakers

The nice thing about Elasticsearch is that it often allows us to achieve a similar goal in multiple ways, and that is also the case with fielddata and limiting memory usage. Elasticsearch provides a functionality called circuit breakers, which estimate how much memory a request or a query will use; if the estimate is above a defined threshold, the request won't be executed at all, resulting in no memory usage and an exception being thrown. This is very useful when we don't want to limit the size of the fielddata cache, but we also don't want a single query to cause memory issues and make the cluster unstable. There are two main circuit breakers: the fielddata circuit breaker and the request circuit breaker.

The first circuit breaker, the fielddata one, estimates the amount of memory that will be needed to load the data for a given query into the fielddata cache. We can configure the limit by using the indices.breaker.fielddata.limit property, which is set to 60% by default, meaning that loading the fielddata for a single query can't use more than 60 percent of the memory given to Elasticsearch.

The second circuit breaker, the request one, estimates the memory used by per-request data structures and prevents them from using more than the amount specified by the indices.breaker.request.limit property. By default, this property is set to 40%, which means that single request data structures, such as the ones used for aggregation calculation, can't use more than 40% of the memory given to Elasticsearch.

Finally, there is one more circuit breaker, defined by the indices.breaker.total.limit property (set to 70% by default). This circuit breaker defines the total amount of memory that can be used by both the per-request data structures and fielddata combined.

Remember that the circuit breaker settings are dynamic and can be updated using the cluster update settings API.
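
For example, a sketch of lowering the fielddata circuit breaker limit on a running cluster using the cluster update settings API could look as follows (the 50% value is only an illustration):

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
 "transient": {
  "indices.breaker.fielddata.limit": "50%"
 }
}'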

Fielddata and doc values

As we already discussed, doc values can be used instead of the fielddata cache. Of course, this is only true for not analyzed fields and fields using numeric data types, and not for multivalued ones. Doc values save memory and should be faster than the fielddata cache at query time, at the cost of slightly slower indexing (a very small difference) and a slightly larger index. If you can use doc values, do so, as it will help your Elasticsearch cluster maintain stability and respond to queries quickly.
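
For illustration, a mapping with doc values explicitly enabled for a not analyzed string field could look like the following sketch; the library index, book type, and tags field are just examples, and starting from Elasticsearch 2.0 doc values are enabled by default for such fields anyway:

curl -XPUT 'localhost:9200/library' -d '{
 "mappings": {
  "book": {
   "properties": {
    "tags": {
     "type": "string",
     "index": "not_analyzed",
     "doc_values": true
    }
   }
  }
 }
}'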

Shard request cache

The shard request cache is the first of the caches that operate on queries. It caches the aggregations and suggestions returned by a query but, at the time of writing, it was not caching query hits. When Elasticsearch executes a query, this cache can store the resource-consuming aggregation results and speed up subsequent identical queries by retrieving the aggregations or suggestions from memory.

Note

At the time of writing this book, the shard request cache was only used when the size=0 parameter was set for the query. This means that only the total number of hits, aggregation results, and suggestions will be cached. Also remember that when running queries that use dates with the now constant, the shard request cache won't be used either.

The shard request cache, as its name suggests, caches the results of queries on each shard, before they are returned to the node that aggregates the results. This can be very beneficial when your aggregations are heavy, like the ones that do a lot of computation on the data returned by the query. If you run a lot of aggregations with your queries and the queries are repeated, think about using the shard request cache, as it should help you with query latency.
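
For example, a request that is eligible for the shard request cache could look like the following sketch; the new_library index and the tags_count aggregation are only illustrative (note the size=0 part, required for the cache to be used):

curl 'localhost:9200/new_library/_search?pretty' -d '{
 "size": 0,
 "query": {
  "match_all": {}
 },
 "aggs": {
  "tags_count": {
   "terms": {
    "field": "tags"
   }
  }
 }
}'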

Enabling and configuring the shard request cache

The shard request cache is disabled by default, but can be easily enabled. To enable it, we should set the index.requests.cache.enable property to true when creating the index. For example, to enable the shard request cache for an index called new_library, we use the following command:

curl -XPUT 'localhost:9200/new_library' -d '{
 "settings": {
  "index.requests.cache.enable": true
 }
}'

One thing to remember is that the mentioned setting is not dynamically updatable on an open index. We need to include it in the index creation command, or we can update it while the index is closed.
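
For example, a sketch of enabling the cache on an already existing index by closing it first, updating the setting, and reopening it could look as follows:

curl -XPOST 'localhost:9200/new_library/_close'
curl -XPUT 'localhost:9200/new_library/_settings' -d '{
 "index.requests.cache.enable": true
}'
curl -XPOST 'localhost:9200/new_library/_open'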

The maximum size of the cache is specified using the indices.requests.cache.size property, which is set to 1% by default (meaning 1% of the total memory given to Elasticsearch). We can also specify how long each entry should be kept by using the indices.requests.cache.expire property, but it is not set by default. Also, the cache is invalidated whenever the index is refreshed (during index searcher reopening), which makes the expiration setting useless most of the time.
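
Both of these are node-level, static properties, so a hypothetical elasticsearch.yml fragment changing them could look like the following sketch (the values are only an illustration):

indices.requests.cache.size: 2%
indices.requests.cache.expire: 10m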

Note

Note that in the earlier versions of Elasticsearch, for example in the 1.x branch, to enable or disable this cache, the index.cache.query.enable property was used. This may be important when migrating from older Elasticsearch versions.

Per request shard request cache disabling

Elasticsearch allows us to control the shard request cache usage on a per-request basis. If we have the mentioned cache enabled, we can still force the search engine to omit caching for chosen requests. This is done by using the request_cache parameter: if set to true, the request will be cached and, if set to false, it won't be. This is especially useful when we want to cache our requests in general, but omit caching for some queries that are rarely repeated. It is also wise not to cache requests that use non-deterministic scripts and relative time ranges.
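
For example, a sketch of explicitly disabling caching for a single request against our illustrative new_library index could look as follows:

curl 'localhost:9200/new_library/_search?request_cache=false' -d '{
 "size": 0,
 "aggs": {
  "tags_count": {
   "terms": {
    "field": "tags"
   }
  }
 }
}'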

Shard request cache usage monitoring

If we don't use any monitoring software that tracks cache usage, we can use the Elasticsearch API to check the metrics around the shard request cache. This can be done both at the indices level and at the nodes level.

To check the metrics for the shard request cache for all the indices, we should use the indices stats API and run the following command:

curl 'localhost:9200/_stats/request_cache?pretty'

To check the request cache metrics, but in per node view, we run the following command:

curl 'localhost:9200/_nodes/stats/indices/request_cache?pretty'

Node query cache

The node query cache is responsible for holding the results of queries for the whole node. Its size is defined using the indices.queries.cache.size property, which defaults to 10%, and the cache is shared among all the shards present on the node. We can set it either to a percentage of the heap memory given to Elasticsearch, like the default, or to an explicit value, like 1024mb. One thing to remember about this cache is that its configuration is static: it can't be updated dynamically and should be set in the elasticsearch.yml file. The node query cache uses a least recently used eviction policy, which means that, when full, it removes the entries that were used least recently.
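
For example, a sketch of changing the node query cache size in elasticsearch.yml could look as follows; the 5% value is only an illustration and the node needs to be restarted for the change to take effect:

indices.queries.cache.size: 5%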

This cache is very useful when you run queries that are repetitive and heavy, such as the ones used to generate category pages or the main page in an e-commerce application.

Indexing buffers

The last cache we want to discuss is the indexing buffer, which allows us to improve indexing throughput. The indexing buffer is divided between all the shards on the node and is used to store newly indexed documents. Once the buffer fills up, Elasticsearch flushes its data to disk, creating a new Lucene segment in the index.

There are four static properties that allow us to configure the indexing buffer size. They need to be set in the elasticsearch.yml file and can't be changed dynamically using the Settings API. These properties, illustrated with a sample configuration after the list, are:

  • indices.memory.index_buffer_size: This property defines the amount of memory used by a node for the indexing buffer. It accepts both a percentage value as well as an explicit value in bytes. It defaults to 10%, which means that 10% of the heap memory given to a node will be used as the indexing buffer.
  • indices.memory.min_index_buffer_size: This property defaults to 48mb and specifies the minimum memory that will be used by the indexing buffer. It is useful when indices.memory.index_buffer_size is defined as a percentage value, so that the indexing buffer is never smaller than the value defined by this property.
  • indices.memory.max_index_buffer_size: This property specifies the maximum memory that will be used by the indexing buffer. It is useful when indices.memory.index_buffer_size is defined as a percentage value, so that the indexing buffer never crosses a certain amount of memory usage.
  • indices.memory.min_shard_index_buffer_size: This property defaults to 4mb and sets the hard minimum limit of the indexing buffer that is given to each shard on a node. The indexing buffer for each shard will not be lower than the value set by this property.
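
A hypothetical elasticsearch.yml fragment tuning the indexing buffer could look like the following sketch; the values are only illustrative and should be adjusted to the heap size of your nodes:

indices.memory.index_buffer_size: 20%
indices.memory.min_index_buffer_size: 96mb
indices.memory.max_index_buffer_size: 4gb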

When it comes to indexing performance, if you need higher indexing throughput, consider setting the indexing buffer size to a value higher than the default. It will allow Elasticsearch to flush data to disk less often and create fewer segments. This will result in fewer merges and thus fewer I/O and CPU intensive operations. Because of that, Elasticsearch will be able to use more resources for indexing purposes.

When caches should be avoided

The usual question asked by users is whether they should really cache all their requests. The answer is obvious: caches are not the right tool for every use case. Using caching is not free; it requires memory and additional operations to put data into the cache or to get it out.

What's more, you should remember that Elasticsearch round-robins queries between primary shards and replicas, so, if you have replicas, not every request after the first one will use the cache. Imagine that you have an index with a single primary shard and two replicas. When the first request comes in, it will hit a random shard copy, but the next request, even with the same query, will likely hit another copy, not the same one (unless routing is used). You should take this into consideration when using caches, because if your queries are not repeated, they may actually run longer because of the overhead of caching.

So, to answer the question of whether you should use caching or not, we advise taking your data and your queries and running performance tests using tools such as JMeter (http://jmeter.apache.org). This will let you see how your cluster behaves with real data under a test load and whether the queries are actually faster with or without the caches.
