Warming up

Sometimes you need to prepare ElasticSearch before it handles your queries. Perhaps you rely heavily on the field data cache and want it loaded before your production queries arrive, or perhaps you want to warm up your operating system's I/O cache. Whatever the reason, ElasticSearch allows us to define warming queries for our indices and types.

Defining a new warming query

A warming query is nothing more than a usual query registered with an index through a special ElasticSearch endpoint called _warmer. Let's assume we have the following query that we want to use for warming up:

{
  "query" : {
    "match_all" : {}
  },
  "facets" : {
    "warming_facet" : {
      "terms" : {
        "field" : "tags"
      }
    }
  }
}

In order to store the preceding query as a warming query for our library index, we will run the following command:

curl -XPUT 'localhost:9200/library/_warmer/tags_warming_query' -d '{
  "query" : {
    "match_all" : {}
  },
  "facets" : {
    "warming_facet" : {
      "terms" : {
        "field" : "tags"
      }
    }
  }
}'

The preceding command will register our query as a warming query with the name tags_warming_query. You can have multiple warming queries for your index, but each of those queries needs to have a unique name.

We can define warming queries not only for the whole index, but also for specific types in it. For example, to store the previously shown query as a warming query only for the book type in the library index, we send the preceding command to the /library/book/_warmer URI instead of the /library/_warmer one, so the whole command will be as follows:

curl -XPUT 'localhost:9200/library/book/_warmer/tags_warming_query' -d '{
  "query" : {
    "match_all" : {}
  },
  "facets" : {
    "warming_facet" : {
      "terms" : {
        "field" : "tags"
      }
    }
  }
}'
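The two registration commands differ only in the URI path, which can be sketched in a few lines of Python. This is a minimal illustration rather than an official client: the build_warmer_request helper and its parameters are hypothetical names introduced here, and the host is assumed to be the same localhost:9200 used in the curl examples. The request is only built, not sent.

```python
import json
import urllib.request


def build_warmer_request(index, name, body, doc_type=None,
                         host="http://localhost:9200"):
    """Build (but do not send) the PUT request that registers a warming query.

    Hypothetical helper: the path is /{index}/_warmer/{name} for the whole
    index and /{index}/{type}/_warmer/{name} for a single type.
    """
    parts = [index]
    if doc_type:
        parts.append(doc_type)
    parts += ["_warmer", name]
    url = host + "/" + "/".join(parts)
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, method="PUT")


# The same warming query as in the curl examples.
warming_query = {
    "query": {"match_all": {}},
    "facets": {"warming_facet": {"terms": {"field": "tags"}}},
}

index_request = build_warmer_request(
    "library", "tags_warming_query", warming_query)
type_request = build_warmer_request(
    "library", "tags_warming_query", warming_query, doc_type="book")

print(index_request.get_method(), index_request.full_url)
print(type_request.get_method(), type_request.full_url)
```

Passing either request object to urllib.request.urlopen would send it to a running node; that step is left out so the sketch can be run without a cluster.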

Once a warming query has been added, ElasticSearch runs the defined warming queries on every new segment before it allows that segment to be searched. This allows ElasticSearch and the operating system to cache data and thus speed up searching.

Note

If you are not familiar with the Apache Lucene library, you may not know what a segment is. Lucene divides the index into parts called segments, which once written can't be changed. Every new commit operation creates a new segment (which is eventually merged if the number of segments is too high), which Lucene uses for search.

Retrieving defined warming queries

In order to get a specific warming query for our index, we just need to know its name. For example, if we want to get the warming query named tags_warming_query for our library index, we will run the following command:

curl -XGET 'localhost:9200/library/_warmer/tags_warming_query?pretty=true'

And the result returned by ElasticSearch will be as follows (note that we've used the pretty=true parameter to make the response easier to read):

{
  "library" : {
    "warmers" : {
      "tags_warming_query" : {
        "types" : [ ],
        "source" : {
          "query" : {
            "match_all" : { }
          },
          "facets" : {
            "warming_facet" : {
              "terms" : {
                "field" : "tags"
              }
            }
          }
        }
      }
    }
  }
}
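If you want to process such a response in a script, its nesting (index name, then warmers, then warmer name) can be walked directly. The following Python sketch assumes the response has exactly the shape shown above; the list_warmers function is a hypothetical helper, not part of any ElasticSearch client.

```python
import json

# The response shown above, as returned with pretty=true.
response_text = """
{
  "library" : {
    "warmers" : {
      "tags_warming_query" : {
        "types" : [ ],
        "source" : {
          "query" : { "match_all" : { } },
          "facets" : {
            "warming_facet" : { "terms" : { "field" : "tags" } }
          }
        }
      }
    }
  }
}
"""


def list_warmers(response):
    """Map each index name to {warmer name: list of types it applies to}."""
    result = {}
    for index_name, index_data in response.items():
        warmers = index_data.get("warmers", {})
        result[index_name] = {
            name: warmer.get("types", []) for name, warmer in warmers.items()
        }
    return result


print(list_warmers(json.loads(response_text)))
```

An empty types list, as in this response, corresponds to a warming query registered for the whole index rather than for a single type.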

We can also get all the warming queries for the index by using the following command:

curl -XGET 'localhost:9200/library/_warmer'

We can also get all the warming queries for a specific type. For example, if we want to get all the warming queries for the library index and the book type, we will run the following command:

curl -XGET 'localhost:9200/library/book/_warmer'

And finally, we can also get all the warming queries that start with a given prefix. For example, if we want to get all the warming queries for the library index that start with the tags prefix, we will run the following command:

curl -XGET 'localhost:9200/library/_warmer/tags*'

Deleting a warming query

Deleting a warming query is very similar to getting one; we just need to use the DELETE HTTP method instead of GET.

In order to delete a specific warming query from our index, we just need to know its name. For example, if we want to delete the warming query named tags_warming_query for our library index, we will run the following command:

curl -XDELETE 'localhost:9200/library/_warmer/tags_warming_query'

We can also delete all the warming queries for the index by using the following command:

curl -XDELETE 'localhost:9200/library/_warmer'

And finally, we can also remove all the warming queries that start with a given prefix. For example, if we want to remove all the warming queries for the library index that start with the tags prefix, we will run the following command:

curl -XDELETE 'localhost:9200/library/_warmer/tags*'

Disabling the warming up functionality

To disable the warming queries completely while keeping them registered, set the index.warmer.enabled configuration property to false (setting this property back to true re-enables the warming up functionality). This setting can either be put into the elasticsearch.yml file or set using the REST API on a live cluster.

For example, if we want to disable the warming up functionality for the library index, we will run the following command:

curl -XPUT 'http://localhost:9200/library/_settings' -d '{
  "index.warmer.enabled" : false
}'

Which queries to choose

You may ask which queries should be used as the warming queries. Typically, you'll want to choose the ones that are expensive to execute and the ones that require caches to be populated, so queries that include faceting and sorting on the fields in your index are the usual candidates. However, you may also choose other queries by looking at the logs and finding where your performance is not as good as you want it to be. Such queries may also be perfect candidates for warming up.

For example, let's say that we have the following logging configuration set in the elasticsearch.yml file:

index.search.slowlog.threshold.query.warn: 10s
index.search.slowlog.threshold.query.info: 5s
index.search.slowlog.threshold.query.debug: 2s
index.search.slowlog.threshold.query.trace: 1s

And we have the following logging level set in the logging.yml configuration file:

logger:
 index.search.slowlog: TRACE, index_search_slow_log_file

Notice that the index.search.slowlog.threshold.query.trace property is set to 1s and the index.search.slowlog logging level is set to TRACE. That means that whenever a query takes longer than one second to execute (on a shard, not in total), it will be logged into the slow log file (the name of which is specified in the index_search_slow_log_file section of the logging.yml configuration file). For example, the following can be found in a slow log file:

[2013-01-24 13:33:05,518][TRACE][index.search.slowlog.query] [Local test] [library][1] took[1400.7ms], took_millis[1400], search_type[QUERY_THEN_FETCH], total_shards[32], source[{"query":{"match_all":{}}}], extra_source[]

As you can see, in the preceding log line, we have the query time, search type, and the query source itself, which shows us the executed query.
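To collect warming candidates from a slow log automatically, the fields of such a line can be extracted with a regular expression. The following Python sketch assumes every entry follows the layout of the sample line above; parse_slowlog_line and warming_candidates are hypothetical helper names, and the pattern would need adjusting if your log format differs.

```python
import json
import re

# Assumed layout, taken from the sample slow log line above.
SLOWLOG_PATTERN = re.compile(
    r"\[(?P<level>[A-Z]+)\s*\]\[index\.search\.slowlog\.query\]"
    r".*?\[(?P<index>[^\]]+)\]\[(?P<shard>\d+)\] "
    r"took\[[^\]]*\], took_millis\[(?P<took_millis>\d+)\]"
    r".*?source\[(?P<source>.*)\], extra_source\["
)


def parse_slowlog_line(line):
    """Return the level, index, shard, time, and query of one entry, or None."""
    match = SLOWLOG_PATTERN.search(line)
    if match is None:
        return None
    return {
        "level": match.group("level"),
        "index": match.group("index"),
        "shard": int(match.group("shard")),
        "took_millis": int(match.group("took_millis")),
        "source": json.loads(match.group("source")),
    }


def warming_candidates(lines, min_millis):
    """Collect the queries of entries that took at least min_millis."""
    entries = (parse_slowlog_line(line) for line in lines)
    return [e["source"] for e in entries
            if e is not None and e["took_millis"] >= min_millis]


sample_line = (
    '[2013-01-24 13:33:05,518][TRACE][index.search.slowlog.query] '
    '[Local test] [library][1] took[1400.7ms], took_millis[1400], '
    'search_type[QUERY_THEN_FETCH], total_shards[32], '
    'source[{"query":{"match_all":{}}}], extra_source[]'
)

print(parse_slowlog_line(sample_line))
print(warming_candidates([sample_line], min_millis=1000))
```

Remember that the logged time is measured per shard, so the same query may appear several times with different timings.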

Of course, the values can be different in your configuration, but the slow log can be a valuable source of queries that run too long and may need some warming up defined. Maybe those are parent-child queries that need some identifiers fetched to perform better, or maybe you are using a filter that is expensive when executed for the first time.

Note

There is one thing you should remember: don't overload your ElasticSearch cluster with too many warming queries, because you may end up spending too much time warming up instead of processing your production queries.
