Memory pressure and implications

Aggregations are awesome! However, they put a lot of memory pressure on Elasticsearch. They work on an in-memory data structure called fielddata, which is the biggest consumer of heap memory in an Elasticsearch cluster. Fielddata is used not only for aggregations, but also for sorting and scripts. Fielddata is slow to load, because Elasticsearch has to read the whole inverted index and un-invert it. If the fielddata cache fills up, old data is evicted, causing heap churn and poor performance as fielddata is repeatedly reloaded and evicted.
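The churn described above can be illustrated with a toy model. The following is a minimal sketch, assuming a simple least-recently-used cache that holds whole fields; it is not Elasticsearch's actual implementation, and the field names and the simulate_lru helper are hypothetical. It counts how often a field has to be loaded again when the set of fields in use does not fit into the cache.

```python
from collections import OrderedDict


def simulate_lru(capacity, accesses):
    """Count loads (cache misses) for a sequence of field accesses.

    Toy model only: every entry is assumed to have the same size and
    eviction is strictly least-recently-used.
    """
    cache = OrderedDict()
    loads = 0
    for field in accesses:
        if field in cache:
            # Cache hit: mark the field as most recently used.
            cache.move_to_end(field)
        else:
            # Cache miss: the field has to be loaded (un-inverted) again.
            loads += 1
            cache[field] = True
            if len(cache) > capacity:
                # Evict the least recently used field.
                cache.popitem(last=False)
    return loads


# Three hypothetical fields queried in rotation, four rounds.
accesses = ["country", "status", "customer"] * 4

loads_small = simulate_lru(2, accesses)  # cache too small for the working set
loads_large = simulate_lru(3, accesses)  # cache fits the working set
print(loads_small, loads_large)
```

With room for only two of the three fields, every access in this model is a miss, so the cache does nothing but load and evict. With room for all three, each field is loaded once.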

The more unique terms exist in the index, the more terms are loaded into memory and the greater the memory pressure. If you are using an Elasticsearch version below 2.0.0 and above 1.0.0, you can set the doc_values parameter in the mapping while creating the index. The field values are then stored on disk at index time, which avoids the use of fielddata. The syntax is as follows:

PUT /index_name/_mapping/index_type
{
  "properties": {
    "field_name": {
      "type": "string",
      "index": "not_analyzed",
      "doc_values": true
    }
  }
}

Note

doc_values are enabled by default from Elasticsearch version 2.0.0 onwards for all fields except analyzed string fields.

The advantages of using doc_values are as follows:

  • Less heap usage and faster garbage collections
  • No longer limited by the amount of fielddata that can fit into a given amount of heap—instead the file system caches can make use of all the available RAM
  • Fewer latency spikes caused by reloading a large segment into memory

The other important consideration is to avoid a huge number of buckets in a nested aggregation. For example, finding the total order value per country during a year with an interval of one week generates roughly 100*52 buckets (assuming about 100 countries and 52 weekly intervals), each with its own sum value. This is a big overhead: the buckets are computed not only on the data nodes, but also on the coordinating node that merges them. A big JSON response also causes problems when it is parsed and loaded on the frontend. Wide aggregations can easily kill a server.
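To make the bucket arithmetic concrete, here is a minimal sketch that estimates the number of innermost buckets and builds the request body for such a nested aggregation. The field names (country, order_date, order_value), the aggregation names, and the nested_bucket_count helper are hypothetical, and the figures of 100 countries and about 52 weekly intervals per year are assumptions for illustration. The "interval" key is the date_histogram syntax of the Elasticsearch 1.x and 2.x versions discussed here.

```python
import json


def nested_bucket_count(*cardinalities):
    """Upper bound on innermost buckets: the product of the bucket
    counts at each nesting level."""
    total = 1
    for count in cardinalities:
        total *= count
    return total


countries = 100  # assumed number of distinct countries
weeks = 52       # assumed number of weekly intervals in one year

estimated_buckets = nested_bucket_count(countries, weeks)

# Hypothetical request body: terms on country, then a weekly date
# histogram, then a sum metric inside every bucket.
request_body = {
    "size": 0,  # return only aggregation results, no hits
    "aggs": {
        "by_country": {
            "terms": {"field": "country", "size": countries},
            "aggs": {
                "by_week": {
                    "date_histogram": {
                        "field": "order_date",
                        "interval": "week",
                    },
                    "aggs": {
                        "total_value": {"sum": {"field": "order_value"}}
                    },
                }
            },
        }
    },
}

print(estimated_buckets)
print(json.dumps(request_body, indent=2))
```

Adding one more nesting level multiplies the estimate again, so it is worth running this kind of calculation before sending a wide aggregation to the cluster.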
