Warming up

Sometimes you need to prepare Elasticsearch before it handles your queries. Maybe you rely heavily on the field data cache and want it loaded before your production queries arrive, or maybe you want to warm up your operating system's I/O cache so that the index files are read from the cache rather than from disk. Whatever the reason, Elasticsearch allows us to use so-called warming queries for our types and indices.

Defining a new warming query

A warming query is nothing more than a usual query registered through a special endpoint called _warmer in Elasticsearch. Let's assume that we have the following query that we want to use for warming up:

curl -XGET 'localhost:9200/library/_search?pretty' -d '{
  "query" : {
    "match_all" : {}
  },
  "aggs" : {
    "warming_aggs" : {
      "terms" : {
        "field" : "tags"
      }
    }
  }
}'

To store the preceding query as a warming query for our library index, we will run the following command:

curl -XPUT 'localhost:9200/library/_warmer/tags_warming_query' -d '{
  "query" : {
    "match_all" : {}
  },
  "aggs" : {
    "warming_aggs" : {
      "terms" : {
        "field" : "tags"
      }
    }
  }
}'

The preceding command will register our query as a warming query with the tags_warming_query name. You can have multiple warming queries for your index, but each of these queries needs to have a unique name.

We can define warming queries not only for the entire index, but also for a specific type in it. For example, to store our previously shown query as a warming query only for the book type in the library index, send the preceding command to the /library/book/_warmer URI instead of /library/_warmer. The entire command will be as follows:

curl -XPUT 'localhost:9200/library/book/_warmer/tags_warming_query' -d '{
  "query" : {
    "match_all" : {}
  },
  "aggs" : {
    "warming_aggs" : {
      "terms" : {
        "field" : "tags"
      }
    }
  }
}'
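The index-level and type-level URIs shown above differ only in the optional type segment. As a minimal sketch, the following shell function builds either form. Note that warmer_url is a hypothetical helper name and not part of Elasticsearch, warming_query.json is an assumed file holding the query body, and the localhost:9200 address is simply the one used in the preceding examples:

```shell
# Hypothetical helper (not an Elasticsearch tool): print the warmer URI
# for an index, a warmer name, and an optional type.
# Usage: warmer_url <index> <warmer_name> [type]
warmer_url() {
  warmer_index=$1
  warmer_name=$2
  warmer_type=${3:-}
  if [ -n "$warmer_type" ]; then
    # Type-level warmer: /<index>/<type>/_warmer/<name>
    printf '%s\n' "localhost:9200/$warmer_index/$warmer_type/_warmer/$warmer_name"
  else
    # Index-level warmer: /<index>/_warmer/<name>
    printf '%s\n' "localhost:9200/$warmer_index/_warmer/$warmer_name"
  fi
}

# Example call against a running cluster (warming_query.json is an assumed file):
#   curl -XPUT "$(warmer_url library tags_warming_query book)" -d @warming_query.json
```

Keeping the URI construction in one place makes it harder to register a warmer for the whole index when only a single type was intended.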

Once a warming query is registered, Elasticsearch runs the defined warming queries against every new segment before that segment is made available for searching. This allows Elasticsearch and the operating system to cache data and, thus, speed up searching.

As we read in the Full text searching section of Chapter 1, Getting Started with Elasticsearch Cluster, Lucene divides the index into parts called segments, which can't be changed once written. Every new commit operation creates a new segment (segments are eventually merged when there are too many of them), which Lucene uses for searching.

Note

Please note that the Warmer API will be removed in future versions of Elasticsearch.

Retrieving the defined warming queries

To get a specific warming query for our index, we just need to know its name. For example, if we want to get the warming query named tags_warming_query for our library index, we will run the following command:

curl -XGET 'localhost:9200/library/_warmer/tags_warming_query?pretty'

The result returned by Elasticsearch will be as follows:

{
  "library" : {
    "warmers" : {
      "tags_warming_query" : {
        "types" : [ "book" ],
        "source" : {
          "query" : {
            "match_all" : { }
          },
          "aggs" : {
            "warming_aggs" : {
              "terms" : {
                "field" : "tags"
              }
            }
          }
        }
      }
    }
  }
}

We can also get all the warming queries for the index using the following command:

curl -XGET 'localhost:9200/library/_warmer?pretty'

And finally, we can also get all the warming queries that start with a given prefix. For example, if we want to get all the warming queries for the library index that start with the tags prefix, we will run the following command:

curl -XGET 'localhost:9200/library/_warmer/tags*?pretty'

Deleting a warming query

Deleting a warming query is very similar to getting one; we just need to use the DELETE HTTP method. To delete a specific warming query from our index, we just need to know its name. For example, if we want to delete the warming query named tags_warming_query for our library index, we will run the following command:

curl -XDELETE 'localhost:9200/library/_warmer/tags_warming_query'

We can also delete all the warming queries for the index using the following command:

curl -XDELETE 'localhost:9200/library/_warmer/_all'

And finally, we can also remove all the warming queries that start with a given prefix. For example, if we want to remove all the warming queries for the library index that start with the tags prefix, we will run the following command:

curl -XDELETE 'localhost:9200/library/_warmer/tags*'

Disabling the warming up functionality

To disable the warming queries entirely while keeping their definitions, you should set the index.warmer.enabled configuration property to false (setting this property back to true will re-enable the warming up functionality). This setting can either be put in the elasticsearch.yml file or set using the REST API on a live cluster.

For example, if we want to disable the warming up functionality for the library index, we will run the following command:

curl -XPUT 'localhost:9200/library/_settings' -d '{
  "index.warmer.enabled" : false
}'
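Re-enabling the functionality uses the same request with the value set to true. A small sketch, assuming a hypothetical helper named warmer_settings_body (our own name, not an Elasticsearch tool) that only prints the request body and rejects anything other than true or false:

```shell
# Hypothetical helper: print the settings body that enables or disables warmers.
# Usage: warmer_settings_body true|false
warmer_settings_body() {
  case $1 in
    true|false)
      printf '{ "index.warmer.enabled" : %s }\n' "$1"
      ;;
    *)
      echo "warmer_settings_body: expected true or false" >&2
      return 1
      ;;
  esac
}

# Example call against a running cluster, re-enabling warmers for the library index:
#   curl -XPUT 'localhost:9200/library/_settings' -d "$(warmer_settings_body true)"
```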

Choosing queries for warming

Finally, we should ask ourselves one question: which queries should be considered as candidates for warming? Typically, you'll want to choose ones that are expensive to execute and ones that require caches to be populated. So you'll probably want to choose queries that include aggregations and sorting based on the fields in your index. This will force the operating system to load the parts of the indices that hold the data related to such queries and improve the performance of the subsequent queries that are run. In addition to this, parent-child queries and nested queries are also potential candidates for warming. You may also choose other queries by looking at the logs and finding where your performance is not as good as you want it to be. Such queries may also be perfect candidates for warming up.

For example, let's say that we have the following logging configuration set in the elasticsearch.yml file:

index.search.slowlog.threshold.query.warn: 10s
index.search.slowlog.threshold.query.info: 5s
index.search.slowlog.threshold.query.debug: 2s
index.search.slowlog.threshold.query.trace: 1s

And we have the following logging level set in the logging.yml configuration file:

logger:
  index.search.slowlog: TRACE, index_search_slow_log_file

Notice that the index.search.slowlog.threshold.query.trace property is set to 1s and the index.search.slowlog logging level is set to TRACE. This means that whenever a query takes longer than one second to execute (on a shard, not in total), it will be logged into the slow log file (the name of which is specified by the index_search_slow_log_file configuration section of the logging.yml configuration file). For example, the following can be found in a slow log file:

[2015-11-25 19:53:00,248][TRACE][index.search.slowlog.query] took[3.4s], took_millis[3400], types[], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"query":{"match_all":{}},"aggs":{"warming_aggs":{"terms":{"field":"tags"}}}}], extra_source[],

As you can see, the preceding log line contains the query time, the search type, and the query source, which shows us the executed query.
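When the slow log grows, it helps to pull out only the entries above a chosen threshold together with their query source. The following is a rough sketch, assuming log lines in the format shown above (a took_millis[...] field and a source[...] field followed by extra_source[...]); slow_queries is a hypothetical helper name, and the sed expressions are our own and may need adjusting if your log format differs:

```shell
# Hypothetical helper: read slow log lines on standard input and print
# "<took_millis><TAB><query source>" for every entry that took at least
# the given number of milliseconds.
# Usage: slow_queries <min_millis> < slowlog_file
slow_queries() {
  min_millis=$1
  while IFS= read -r line; do
    # Pull the number out of took_millis[...]
    millis=$(printf '%s\n' "$line" | sed -n 's/.*took_millis\[\([0-9][0-9]*\)].*/\1/p')
    # Skip lines that are not slow log entries
    [ -n "$millis" ] || continue
    if [ "$millis" -ge "$min_millis" ]; then
      # Pull the query out of source[...], stopping before extra_source[...]
      src=$(printf '%s\n' "$line" | sed -n 's/.*, source\[\(.*\)], extra_source\[.*/\1/p')
      printf '%s\t%s\n' "$millis" "$src"
    fi
  done
}
```

The printed source can then be reviewed and, if it turns out to be a recurring expensive query, registered as a warming query.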

Of course, the values can be different in your configuration, but the slow log can be a valuable source of queries that have been running too long and may need some warm up defined. Maybe these are parent-child queries that need some identifiers to be fetched to perform better, or maybe you are using a filter that is expensive the first time you execute it.

There is one thing you should remember: don't overload your Elasticsearch cluster with too many warming queries, because you may end up spending too much time on warming up instead of processing your production queries.
