In this section, we'll configure Logstash to read data from Tomcat access logs and index it in Elasticsearch, filtering and tokenizing the terms in the log entries according to a grok pattern.
As we already saw, some of the commonly used grok patterns are already included with the Logstash installation. Check out the list of Logstash grok patterns on GitHub at https://github.com/logstash-plugins/logstash-patterns-core/tree/master/patterns.
There is already a grok pattern for the Common Apache log format in the Logstash installation as follows:
COMMONAPACHELOG %{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-)
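To get a feel for what this pattern extracts, here is a rough Python approximation of it applied to a sample line in the Common Log Format. The regex is a simplified stand-in for grok's pattern engine, and the log line itself is invented for illustration:

```python
import re

# Simplified approximation of the COMMONAPACHELOG grok pattern
# (not the real grok engine; the rawrequest alternative is omitted)
COMMONAPACHELOG = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+)(?: HTTP/(?P<httpversion>[\d.]+))?" '
    r'(?P<response>\d+) (?P<bytes>\d+|-)'
)

# An invented access log line in Common Log Format
line = ('127.0.0.1 - frank [10/Oct/2015:13:55:36 -0700] '
        '"GET /index.html HTTP/1.1" 200 2326')

fields = COMMONAPACHELOG.match(line).groupdict()
print(fields["clientip"], fields["verb"], fields["response"])
```

Each named group corresponds to a field name in the grok pattern, which is how grok turns an unstructured line into structured, queryable fields.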
We can use COMMONAPACHELOG directly as the matching pattern for our incoming messages. First, we configure Logstash's file input plugin to read the Tomcat access log:
input {
  file {
    path => "/var/lib/tomcat7/logs/localhost_access_log.txt"
    start_position => "beginning"
  }
}
Next, we need to match our grok pattern against the incoming message, set the event timestamp from the timestamp field in the message, and convert the data types of some fields to suit our needs:
filter {
  grok {
    match => { "message" => "%{COMMONAPACHELOG}" }
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
  mutate {
    convert => [ "response", "integer" ]
    convert => [ "bytes", "integer" ]
  }
}
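The effect of the date and mutate filters can be mimicked in a few lines of Python: the timestamp string is parsed using the same dd/MMM/yyyy:HH:mm:ss Z layout, and response and bytes are converted from strings to integers. The event dictionary here is an invented example of what grok leaves behind:

```python
from datetime import datetime

# An invented event as the grok filter would leave it:
# every extracted field is still a string
event = {
    "timestamp": "10/Oct/2015:13:55:36 -0700",
    "response": "200",
    "bytes": "2326",
}

# date filter: parse "dd/MMM/yyyy:HH:mm:ss Z" into a real timestamp
# (Python's equivalent format is "%d/%b/%Y:%H:%M:%S %z")
event["@timestamp"] = datetime.strptime(
    event["timestamp"], "%d/%b/%Y:%H:%M:%S %z"
)

# mutate/convert: turn numeric strings into integers so Elasticsearch
# can aggregate on them numerically
event["response"] = int(event["response"])
event["bytes"] = int(event["bytes"])
```

Without the mutate conversion, response codes and byte counts would be indexed as strings, which prevents numeric range queries and aggregations on those fields.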
Finally, we configure the output plugin to send the filtered messages to Elasticsearch. We don't specify a port here because we are using Elasticsearch's default port, 9200:
output {
  elasticsearch {
    host => "localhost"
  }
}
Now that we have understood the individual configuration, let's see what the overall configuration for Tomcat looks like:
input {
  file {
    path => "/var/lib/tomcat7/logs/localhost_access_log.txt"
    start_position => "beginning"
  }
}
filter {
  grok {
    match => { "message" => "%{COMMONAPACHELOG}" }
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
  mutate {
    convert => [ "response", "integer" ]
    convert => [ "bytes", "integer" ]
  }
}
output {
  elasticsearch {
    host => "localhost"
  }
}
Now, let's start Logstash with this configuration:
$ bin/logstash -f logstash.conf
Logstash will start to run with the defined configuration and keep on indexing all incoming events to the Elasticsearch indexes. You may see an output that is similar to this one on the console:
May 31, 2015 4:04:54 PM org.elasticsearch.node.internal.InternalNode start
INFO: [logstash-4004-9716] started
Logstash startup completed
Now, you will see your Apache access log data in Elasticsearch. Logstash was able to parse each input line and break it into separate pieces of information based on the grok pattern for Apache access logs. With the data structured this way, we can easily set up analytics on HTTP response codes, request methods, and different URLs.
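As a sketch of the kind of analysis this enables (computed here in plain Python on invented data, rather than via an Elasticsearch aggregation), counting response codes becomes trivial once the fields are parsed:

```python
from collections import Counter

# Invented parsed events; in practice these fields would come from
# the grok and mutate filters above
events = [
    {"verb": "GET",  "request": "/index.html", "response": 200},
    {"verb": "GET",  "request": "/missing",    "response": 404},
    {"verb": "POST", "request": "/login",      "response": 200},
]

# Tally how many requests ended in each HTTP response code
by_status = Counter(e["response"] for e in events)
print(by_status)
```

In Elasticsearch, the same question would be answered with a terms aggregation on the response field, which is why converting it to an integer in the mutate filter matters.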
At this point, we can open the Elasticsearch Kopf plugin console that we installed in Chapter 1, Introduction to ELK Stack, to verify whether we have some documents indexed already, and we can also query these documents.
If we can see some indexes for Logstash already in Elasticsearch, we have verified that our Logstash configuration worked well.