In ELK, although Logstash and Kibana act as interfaces to Elasticsearch indices, it is still necessary to understand how Logstash and Kibana make use of the Elasticsearch RESTful APIs to perform various operations, such as creating and managing indices, storing and retrieving documents, and forming various types of search queries around the indices. It is also often useful to know how to delete indices.
As we already know, Elasticsearch provides an extensive API to perform various operations. The generic syntax of querying the cluster from the command line is as follows:
curl -X<VERB> '<PROTOCOL>://<HOST>:<PORT>/<PATH>/<OPERATION_NAME>?<QUERY_STRING>' -d '<BODY>'
Let's understand various parts of this command:
VERB: This can take values for the request method type: GET, POST, PUT, DELETE, or HEAD.
PROTOCOL: This is either http or https.
HOST: This is the hostname of the node in the cluster. For local installations, this can be localhost or 127.0.0.1.
PORT: This is the port on which the Elasticsearch instance is currently running. The default is 9200.
PATH: This corresponds to the name of the index, type, and ID to be queried, for example, /index/type/id.
OPERATION_NAME: This corresponds to the name of the operation to be performed, for example, _search, _count, and so on.
QUERY_STRING: This is an optional string of query parameters, for example, ?pretty for pretty-printing of JSON documents.
BODY: This is the request body text.
Let's take the following command as an example:
curl -XGET 'http://localhost:9200/logstash-2014.08.04/_search?pretty'
This URL will search in the index named logstash-2014.08.04.
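The _count operation listed above works the same way as _search. As a minimal sketch (assuming the same index exists and a local Elasticsearch instance is running), it returns only the number of matching documents, without the documents themselves:

```shell
# Count the documents in the index instead of returning them.
curl -XGET 'localhost:9200/logstash-2014.08.04/_count?pretty'
```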
For the upcoming sections, it is assumed that you have already installed Elasticsearch as explained in Chapter 1, Introduction to ELK Stack, and it is running.
In this section, we will make use of the indices created in our example in Chapter 2, Building Your First Data Pipeline with ELK, and will try to perform some operations on them.
Let's first try to see all available indices in our cluster by executing the following command:
curl -XGET 'localhost:9200/_cat/indices?v'
Upon executing this, we will get the following response:
health status index               pri rep docs.count docs.deleted store.size pri.store.size
green  open   logstash-2014.12.19   5   1          1            0      6.1kb          6.1kb
green  open   logstash-2014.12.08   5   1          1            0      6.1kb          6.1kb
green  open   logstash-2014.07.17   5   1          1            0        6kb            6kb
green  open   logstash-2014.08.04   5   1          1            0      6.1kb          6.1kb
green  open   logstash-2014.11.05   5   1          1            0      6.1kb          6.1kb
green  open   logstash-2014.07.27   5   1          1            0      6.1kb          6.1kb
green  open   logstash-2014.09.16   5   1          1            0      6.1kb          6.1kb
green  open   logstash-2014.12.15   5   1          1            0      6.1kb          6.1kb
green  open   logstash-2014.12.10   5   1          1            0      6.1kb          6.1kb
green  open   logstash-2014.09.18   5   1          1            0        6kb            6kb
green  open   logstash-2014.12.18   5   1          1            0      6.1kb          6.1kb
green  open   logstash-2014.07.08   5   1          1            0      6.1kb          6.1kb
This will show all the indices that are stored among all nodes in the cluster, and some information about them such as health, index name, size, count of documents, number of primary shards, and so on.
For example, the first row in the preceding output shows that the index named logstash-2014.12.19 has 5 primary shards and 1 replica, and contains 1 document with 0 deleted documents.
We can also see all nodes in a cluster by invoking the following command:
curl -XGET 'http://localhost:9200/_cat/nodes?v'
The response is as follows:
host     ip        heap.percent ram.percent load node.role master name
packtpub 127.0.1.1           18          35 0.27 d         *      Animus
Since ours is a single node cluster on localhost, it shows one node and the memory-related characteristics of this node.
We can check the health of a cluster by invoking the following command:
curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
{
  "cluster_name" : "elasticsearch",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 11,
  "active_shards" : 11,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 11
}
Health can be checked at cluster level, shard level, or indices level, using URLs that are similar to the following ones:
curl -XGET 'http://localhost:9200/_cluster/health?level=cluster&pretty=true'
curl -XGET 'http://localhost:9200/_cluster/health?level=shards&pretty=true'
curl -XGET 'http://localhost:9200/_cluster/health?level=indices&pretty=true'
Elasticsearch cluster health is indicated by three status values: green means that all primary and replica shards are allocated; yellow means that all primary shards are allocated but one or more replica shards are not (as in our single node cluster, where replicas have no second node to live on); red means that one or more primary shards are unassigned.
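In scripts, it is often handy to pull out just the status value from the health response. A minimal sketch, assuming python3 is available and the cluster is running locally:

```shell
# Print only the cluster status (green, yellow, or red).
curl -s 'http://localhost:9200/_cluster/health' \
  | python3 -c 'import json, sys; print(json.load(sys.stdin)["status"])'
```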
In ELK, index creation is automatically handled by providing the index name in the Logstash elasticsearch output plugin. Still, let's take a look at how we can create an index:
curl -XPUT 'localhost:9200/<index_name>?pretty'
For example, to create an index named packtpub, we can issue the following command:
curl -XPUT 'localhost:9200/packtpub/?pretty'
We can also directly create an index while putting the document inside the index as follows:
curl -XPUT 'localhost:9200/packtpub/elk/1?pretty' -d '
{
  "book_name" : "learning elk"
}'
The response of the preceding command is:
{
  "_index" : "packtpub",
  "_type" : "elk",
  "_id" : "1",
  "_version" : 1,
  "created" : true
}
With the preceding command, a new index named packtpub was created along with type elk, and a document with ID 1 was stored in it.
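If we index a document with the same index, type, and ID again, Elasticsearch replaces the existing document and increments its version. A sketch, reusing the packtpub example above with an updated (illustrative) value:

```shell
# Overwrite the document with ID 1; the response reports an
# incremented _version and "created" : false.
curl -XPUT 'localhost:9200/packtpub/elk/1?pretty' -d '
{
  "book_name" : "learning elk stack"
}'
```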
We will now retrieve the document that we just indexed:
curl -XGET 'localhost:9200/packtpub/elk/1?pretty'
The response of the preceding query will be:
{
  "_index" : "packtpub",
  "_type" : "elk",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "book_name" : "learning elk"
  }
}
The _source field contains the full document that was indexed with ID 1.
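An index can also be deleted when it is no longer needed. As a minimal sketch (deleting the packtpub index created above, assuming a running local Elasticsearch instance):

```shell
# Delete the entire packtpub index, including all documents in it.
curl -XDELETE 'localhost:9200/packtpub/?pretty'
```

Elasticsearch acknowledges the deletion in the response. Note that deletion is immediate and irreversible, so this is best kept out of routine scripts.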
From our GOOG price indices example from Chapter 2, Building Your First Data Pipeline with ELK, let's try to query for a document:
curl -XGET 'localhost:9200/logstash-2014.08.04/logs/_search?pretty'
This will give us the following response:
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "logstash-2014.08.04",
      "_type" : "logs",
      "_id" : "AU2qgZixPoayDyQnreXd",
      "_score" : 1.0,
      "_source":{"message":["2014-08-05,570.05255,571.9826,562.61255,565.07257,1551200,565.07257"],"@version":"1","@timestamp":"2014-08-04T23:00:00.000Z","host":"packtpub","path":"/opt/logstash/input/GOOG.csv","date_of_record":"2014-08-05","open":570.05255,"high":571.9826,"low":562.61255,"close":565.07257,"volume":1551200,"adj_close":"565.07257"}
    } ]
  }
}
We got the complete message stored in the _source field, which contains the JSON emitted by Logstash.
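Beyond retrieving everything, searches can also be shaped by sending a query body with the request. As a sketch (the index name and the host field value are assumed from the example above, and a running local Elasticsearch instance is required):

```shell
# Search with an explicit query body: match documents whose host
# field contains "packtpub" in the given index.
curl -XGET 'localhost:9200/logstash-2014.08.04/_search?pretty' -d '
{
  "query" : {
    "match" : { "host" : "packtpub" }
  }
}'
```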