As we've explored in the previous chapters, Elasticsearch is a powerful and versatile tool for storing, querying, and aggregating data. The only way to interact with Elasticsearch is through its feature-rich set of REST APIs, which cover everything from creating and managing indices and ingesting documents to running queries and aggregating large datasets. We've also seen how tools such as Beats and Logstash excel at collecting data from various sources and loading it into Elasticsearch clusters for end user consumption. This is where Kibana plays a vital role in the Elastic Stack.
This chapter explores the role that Kibana plays in the Elastic Stack in allowing users to visualize, interact with, and build use cases on top of data in Elasticsearch. Kibana is also the primary way in which users can consume out-of-the-box solutions, such as Enterprise Search, Security, and Observability, as well as manage and configure the backing Elasticsearch cluster.
In this chapter, we will specifically focus on the following:
- Understanding the role Kibana plays in the Elastic Stack
- Visualizing and exploring data using dashboards
- Creating data-driven presentations using Canvas
- Working with geospatial datasets using Kibana Maps
- Responding to changes in data using Kibana alerting
This chapter walks you through the various features of Kibana when it comes to building and consuming use cases from your data. You will need access to an instance of Kibana connected to an Elasticsearch deployment to follow along. If you don't already have a deployment configured, follow the instructions provided in Chapter 2, Installing and Running the Elastic Stack.
The code for this chapter can be found in the GitHub repository for the book:
https://github.com/PacktPublishing/Getting-Started-with-Elastic-Stack-8.0/tree/main/Chapter8
Navigate to Chapter8/trips-dataset in the code repository for this book and follow the instructions to load a dataset containing flight travel logs for a single passenger over a period of time:
./load.sh
logstash-8.0.0/bin/logstash -f logstash-trips.conf < trips.csv
GET trips/_search
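As a quick sanity check, you can also confirm the number of documents that were loaded (the exact count depends on the contents of trips.csv in the repository):

```
GET trips/_count
```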
Move on to the next section once you've successfully loaded the dataset.
Collecting and ingesting data into your Elasticsearch cluster is only half the challenge when it comes to extracting insights and building useful outcomes from your datasets. Having access to fully featured and well-documented REST APIs at the Elasticsearch level is extremely useful, especially when your applications and systems programmatically consume responses from queries and aggregations. However, end users would much rather use an intuitive visual interface to build visualizations, understand trends in business data, diagnose bugs in their applications, and hunt for threats in their environment.
Kibana is the primary user interface when it comes to interacting with Elasticsearch clusters and, to some extent, components such as Logstash and Beats.
Given Kibana is primarily used to interact with data on Elasticsearch, an Elasticsearch cluster must be available for Kibana to run. The backing Elasticsearch cluster is used to achieve persistence of state, settings, and other data; the Kibana instance in itself is stateless. Kibana instances are also not clustered components; they do not interact with other Kibana instances in order to share tasks and workloads.
Multiple Kibana instances can be configured to work with the same Elasticsearch cluster. This is especially useful to achieve high availability as well as scalability at the Kibana level.
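Because Kibana instances are stateless, scaling out is largely a matter of pointing each instance at the same backing cluster in its kibana.yml file. The following is a minimal sketch; the hostnames es-node1 and es-node2 are placeholders for your own Elasticsearch nodes:

```yaml
# kibana.yml (sketch) - each Kibana instance can share these settings.
# elasticsearch.hosts lists the nodes of the shared backing cluster;
# Kibana persists all of its state there, not locally.
elasticsearch.hosts: ["https://es-node1:9200", "https://es-node2:9200"]

# server.name helps distinguish instances in logs and monitoring (optional)
server.name: "kibana-instance-1"
```

A load balancer in front of the Kibana instances can then provide a single endpoint for users.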
Next, we will look at some of the solutions offered by Kibana.
Kibana is the primary way in which users of the Elastic Stack can build and consume solutions with their data. There are three main focus areas for out-of-the-box solutions on the stack. Users can also leverage the generic data analysis, visualization, modeling, and graphing capabilities of Kibana along with the general-purpose Extract, Transform, and Load (ETL), search, and aggregation capabilities from the rest of the stack to build solutions in any other area or domain as required.
The Observability solution in Kibana allows developers and Site Reliability Engineers (SREs) to centralize logs, metrics, and application performance data in one place from across their environment. The solution is broken down into the following apps on Kibana, which can be accessed from the navigation menu:
- Logs, for viewing and searching centralized log data
- Metrics, for monitoring infrastructure and system metrics
- APM, for application performance monitoring and distributed tracing
- Uptime, for monitoring the availability of hosts and services
The Security solution on Kibana allows security analysts and threat hunters to understand, contextualize, and respond to security threats in an environment. The solution provides both Security Information and Event Management (SIEM) and Endpoint Detection and Response (EDR) capabilities to users. The Security app consists of the following capabilities:
- Detection rules and alerts, for automatically identifying threats in event data
- Hosts and Network views, for exploring security events across the environment
- Timelines, for interactively investigating events of interest
- Cases, for tracking and collaborating on investigations
- Endpoint security, providing EDR capabilities on end hosts
The Enterprise Search solution allows developers and content managers to create seamless search experiences for apps, websites, or the workplace using the Elastic Stack. The solution consists of the following apps:
- App Search, for building search experiences for applications and websites
- Workplace Search, for unified search across workplace content sources such as document stores and collaboration tools
All out-of-the-box and bespoke/user-created solutions on Kibana can leverage the following analytics capabilities in their use cases:
- Discover, for exploring and querying raw documents
- Dashboard, for visualizing and monitoring datasets
- Canvas, for pixel-perfect, data-driven presentations
- Maps, for analyzing geospatial data
- Alerting, for detecting and responding to changes in data
A fundamental aspect of starting to work with a dataset on Kibana is configuring the data view for the data. A Kibana data view determines what underlying Elasticsearch indices will be addressed in a given query, dashboard, alert, or machine learning job configuration. Data views also cache some metadata for underlying Elasticsearch indices, including the field names and data types (the schema) in a given group of indices. This cached data is used in the Kibana interface when creating and working with visualizations.
In the case of time series data, a data view can specify which field in an index contains the timestamp. This allows Kibana to narrow down your queries, dashboards, and so on to the appropriate time range on the underlying indices, allowing for fast and efficient results. The universal date and time picker at the top right of the screen allows granular control of time ranges. The time picker will not be available if a time field is not configured for a data view.
Data views can also specify how fields should be formatted and rendered on visualizations. For example, a source.bytes integer field can be formatted as bytes to automatically render values in human-readable units such as MB or GB.
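Under the hood, the time picker simply adds a range filter on the configured time field to every query Kibana runs. A sketch of the equivalent Elasticsearch query for the trips dataset, assuming StartDate is mapped as a date and configured as the time field:

```
GET trips/_search
{
  "query": {
    "range": {
      "StartDate": {
        "gte": "now-5y/d",
        "lte": "now"
      }
    }
  }
}
```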
To get started with our use cases, follow these steps to create a data view for the trips dataset:
Your data view should look as follows. All available fields and the corresponding data types should be displayed:
Note
Data views were referred to as "index patterns" on older versions of Kibana. Data views may be referred to as index patterns in some parts of this book as well as online references or documentation.
You should see the data appear as follows in the Discover app. Remember to increase the time range you're searching for using the time range filter in the top right to see all data:
You can also map runtime fields as part of your data view in Kibana. Unlike a regular field in an index, a runtime field is computed by Elasticsearch at search time. This eliminates the time-consuming process of changing log formats on source systems or making changes to ETL configurations to build use cases.
The trips dataset contains StartAirport and EndAirport fields for each trip. It may, however, be useful to have a field called Route that represents the start and end airports in a single value. Given this field doesn't exist in our original dataset, follow these instructions to create a runtime field to make it available:
emit(doc['StartAirport'].value + ">" + doc['EndAirport'].value);
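Runtime fields can also be defined directly on the Elasticsearch index rather than in the Kibana data view, making the field available to any consumer of the index. The following is a sketch using the runtime section of the mapping API, assuming StartAirport and EndAirport are mapped as keyword fields:

```
PUT trips/_mapping
{
  "runtime": {
    "Route": {
      "type": "keyword",
      "script": {
        "source": "emit(doc['StartAirport'].value + \">\" + doc['EndAirport'].value)"
      }
    }
  }
}
```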
Now that you've successfully configured the trips data view, let's put together some visualizations.
Dashboards in Kibana are the primary tool to visualize datasets in order to understand what the data means. Users generally spend a significant chunk of their time on Kibana working with dashboards; well-designed dashboards can efficiently communicate important metrics, trends in activity, and any potential issues to look out for.
The Nginx dashboard shown in the following screenshot (available out of the box) visualizes source geo-locations, web server response codes over time, common error types, and top resources accessed on the web server. An engineer eyeballing this data can spot something out of the ordinary. If, for example, HTTP 5xx response codes suddenly start increasing for a given resource on the server, the engineer can quickly narrow down potential issues and proceed to fix them before end users are impacted:
Dashboards are designed to work interactively. Most visualizations are clickable and can be used to select and filter on values during analysis, with all components on the dashboard updating in real time.
The universal search bar and time-range filters on the top of the screen can be used to further filter data as required. Filters applied on the top can be pinned across applications in Kibana. A user, for example, may pin a given hostname in the Nginx logs dashboard and pivot to the System overview dashboard for a host-specific view; the pinned filters travel across dashboards to automatically present relevant information.
The following instructions will help you create a new dashboard for the trips dataset:
The following is an example of a more complex dashboard looking at various aspects of the routes in the trips dataset, including the proportion of routes traveled, total distance traveled per route, average costs, and aircraft types servicing the routes. The types of dashboard elements used include the following:
Next, we will look at using Canvas to create presentations powered by data from Elasticsearch.
Dashboards are a great way to visualize and consume data from Elasticsearch. Given their interactive form factor, dashboards can easily support analyst workflows when interrogating and pivoting through data.
Dashboards, however, are not ideal when you need more granular control over how information is presented to a user. Canvas gives users much finer-grained control over the visual appearance of their data, making it ideal for presenting key insights derived from data. Unlike normal presentations, though, Canvas workpads can be powered by live datasets on Elasticsearch in real time.
The following Canvas presentation displays some key insights from the trips dataset. Key stats, such as the total number of trips, the number of countries and airlines, and the total distance traveled, are rendered on the right side. The pie chart displays the proportion of business and economy class trips, while the bubble chart shows the five cheapest trip routes in the dataset.
You can add images and align visual elements as needed to create aesthetically appealing presentations.
Canvas supports multiple slides in a single Canvas workpad. Follow these instructions to create your first Canvas presentation:
SELECT count(*) as count FROM "trips"
You've successfully created your first element in Canvas. Iterate to add all elements in the Travel stats slide, as shown in Figure 8.10.
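Other elements on the slide can be backed by similar Elasticsearch SQL queries. As an illustrative example (assuming AirlineCode is aggregatable, that is, mapped as a keyword), the following query would return the five most frequently used airlines for a table or chart element:

```sql
SELECT AirlineCode, COUNT(*) AS trips
FROM "trips"
GROUP BY AirlineCode
ORDER BY trips DESC
LIMIT 5
```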
The following is the second page in the same canvas, visualizing the total distance traveled in the trips in proportion to the distance between the Earth and its moon. As shown, Canvas supports graphical backgrounds and images to emphasize the message in the data. A bar chart and a progress wheel are also used in this slide:
Next, we will look at using Kibana Maps for geospatial data.
Elasticsearch comes with great support for geospatial data out of the box. Geo-point fields can hold a single geographic location (latitude/longitude pair) while Geo-shape fields support the encoding of arbitrary geoshapes (such as lines, squares, polygons, and so on). When searching for data on Elasticsearch, users can also leverage a range of geo queries, such as geo_distance (which finds docs containing a geo-point within a given distance from a specified geo_point) and geo_bounding_box (which finds docs with geo-points falling inside a specified geographical boundary). Kibana Maps is the visual interface for the geospatial capabilities on Elasticsearch.
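To make this concrete, the following is a sketch of a geo_distance query, assuming an index with a geo_point field named CurrentGeoLocation (such as the custom-trips index used later in this chapter). It finds documents located within 100 km of Sydney Airport:

```
GET custom-trips/_search
{
  "query": {
    "geo_distance": {
      "distance": "100km",
      "CurrentGeoLocation": {
        "lat": -33.9461,
        "lon": 151.1772
      }
    }
  }
}
```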
Geospatial data is useful (and rather common) in several use cases. For example, logs containing public addresses will often contain (or can be enriched with) geo-location information for the corresponding host.
Analysts can use this context to understand whether connections to certain geographies are expected, or why application performance differs across regions: users located further away from compute infrastructure may naturally experience degraded performance on network-bound applications.
Geospatial context is also useful for extracting business insights. For example, grouping e-commerce purchases by suburb within a city helps analysts understand their customer demographics and purchasing preferences. This information is useful for stocking decisions, marketing recommendations, and product development cycles.
Maps on Kibana come with base layer maps, which are loaded from the Elastic Maps Service (EMS). EMS hosts tile and vector layers for maps (for various zoom levels). Base maps include administrative boundary maps for various countries, as well as road maps for the planet.
Important Note
On non-internet-connected Kibana instances, EMS can be hosted locally provided a valid Elastic license/subscription is configured on your Elasticsearch cluster. Alternatively, users may choose to use a third-party mapping service, such as OpenStreetMap or Web Map Service. EMS is free to use for all internet-connected Elastic Kibana instances.
Follow these instructions to create your first map on Kibana:
ServiceClass: "Economy"
Your map should look as follows:
The map in the preceding screenshot shows all departure airports in the dataset, as well as paths representing the routes traveled. The intensity of the circle markers and thickness of the path represent the frequency of trips for the given route.
The following examples show maps visualizing the trips dataset to understand common departure/arrival locations, route frequencies, countries/cities of travel, and price analysis by the geography of travel.
Maps can also be embedded in dashboards on Kibana and work seamlessly alongside all your other visualizations.
This Trips overview map is similar to the example we just created but uses a heat map to represent trip frequency. The more intense clusters show a larger trip frequency from the airport:
Trips by departure country uses a different base layer map than the first two examples. The map uses the World countries base map (as the granularity of analysis is at the country level). The Departure countries layer represents all countries with a departure event. The layer performs a terms join on the base map World Countries layer with the departure country field in the trips index. The last layer selects the top airline (by frequency) per departure airport.
The last example looks at Trip cost by destination. On top of a default road map base layer, the map uses a cluster/grid layer to visualize the average activity cost per destination city in the trips data. As expected, the average trip to Perth, Western Australia, has the highest average cost in the country.
Next, we will look at using Kibana alerting in response to changes in incoming data.
So far in this chapter, we've looked at different ways in which users can interact with various types of data in real time. Analysts can easily explore and interrogate data to find events of interest and understand the consequences they may have for their use case.
Once discovered through analysis, events of interest can recur in a system many times. Interactive, human-driven analysis workflows do not necessarily scale in these cases, and there is a need to automate the detection of these events. This is where alerting plays an important role.
Kibana alerting is an integrated platform feature across all solutions in Kibana. Security analysts, for example, can use alerting to apply threat detection logic and the appropriate response workflows to mitigate potential issues. Engineering teams may use alerts to find precursors to a potential outage and alert the on-call site reliability engineer to take necessary action. We will explore solution-specific alerting workflows in later chapters of the book.
Alerting can also be applied generally to non-solution-oriented workflows in Kibana. We explore some core alerting concepts in the following sections and dive into some examples with the trips dataset.
Alerts in Kibana are defined by a rule. A rule determines the logic behind an alert (condition), the interval at which the condition should be checked (schedule), and the response actions to be executed if the detection logic returns any resulting data.
Successful matches/detections returned by a rule are stored as a signal or alert (depending on the solution you're using). Analysts can work off a prioritized or triaged list of alerts (based on severity or importance) in their workflows.
The following diagram illustrates the core concepts behind alerting:
The rule is defined as follows:
Now that we understand some of the core alerting concepts, let's create some for the trips dataset.
Kibana supports a range of rule types for alerting. Rules are categorized as generic and solution-specific, allowing for a rich solution-specific context for security and observability use cases.
A list of all supported rule types can be found here:
https://www.elastic.co/guide/en/kibana/8.0/rule-types.html
First, we will look at creating a simple threshold-based alert to match a given field in the data. The rule looks for the number of trips flown on a non-preferred airline and alerts when the number of trips in the last year exceeds five. Follow these instructions to define the alert:
{
  "query": {
    "bool": {
      "must_not": [
        {
          "term": {
            "AirlineCode": {
              "value": "QF"
            }
          }
        }
      ]
    }
  }
}
{
  "alert_type": "non-preferred-airline",
  "alert_message": "The number of trips on non-preferred airlines has exceeded {{params.threshold}}",
  "rule_id": "{{rule.id}}",
  "rule_name": "{{rule.name}}"
}
POST trips/_doc
{
  "StartDate": "5/12/20",
  "AirlineCode": "VA",
  "StartAirport": "SYD",
  "EndAirport": "MEL"
}
After a few minutes (depending on your rule schedule), you should see the following alert in the Rules UI in Kibana:
If you check the trip-alerts index, you should see the document generated by the alert:
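If you don't see an alert, you can sanity-check the rule's filter directly in the Dev Tools console. The following query (assuming StartDate is the time field configured on the rule) returns the number of trips on non-preferred airlines over the last year, which should exceed the rule's threshold of five:

```
GET trips/_count
{
  "query": {
    "bool": {
      "must_not": [
        {
          "term": {
            "AirlineCode": {
              "value": "QF"
            }
          }
        }
      ],
      "filter": [
        {
          "range": {
            "StartDate": {
              "gte": "now-1y/d"
            }
          }
        }
      ]
    }
  }
}
```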
Next, we will look at creating a rule to alert on geospatial data.
The rule tracks the location of the StartAirport field in the trips data and alerts if StartAirport falls outside of the boundaries of Australia, stored as a GeoShape field in an Elasticsearch index.
The following figure shows the GeoShape field for the boundary around Australia:
Follow these instructions to create this alert:
PUT country-geoshapes
{
  "mappings": {
    "properties": {
      "country": {
        "type": "keyword"
      },
      "shape": {
        "type": "geo_shape"
      }
    }
  }
}
POST country-geoshapes/_doc/
{
  "country": "Australia",
  "shape": {
    "coordinates": [
      [
        [-247.460306, -10.2091625],
        [-205.716033, -10.6577981],
        [-205.6937224, -43.9986958],
        [-247.4925917, -44.17606],
        [-247.460306, -10.2091625]
      ]
    ],
    "type": "Polygon"
  }
}
PUT custom-trips
{
  "mappings": {
    "properties": {
      "StartDate": {
        "type": "date"
      },
      "TripID": {
        "type": "keyword"
      },
      "CurrentGeoLocation": {
        "type": "geo_point"
      }
    }
  }
}
{
  "alert_type": "trip-outside-australia",
  "alert_message": "A trip travelling outside the geo boundary for Australia was found",
  "rule_id": "{{rule.id}}",
  "rule_name": "{{rule.name}}"
}
Index the first document:
# Trip with location inside the boundary for Australia
POST custom-trips/_doc
{
  "StartDate": "2021-06-20T09:24:55.430Z",
  "TripID": "test-alert",
  "CurrentGeoLocation": "-37.673298,144.843013"
}
Index the second document:
# Trip with location outside the boundary for Australia
POST custom-trips/_doc
{
  "StartDate": "2021-06-20T09:24:55.430Z",
  "TripID": "test-alert",
  "CurrentGeoLocation": "-36.1248652,148.4837257"
}
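While waiting for the rule to run, you can reproduce its containment check manually in Dev Tools with a geo_shape query. The following is a sketch, where <geoshape-doc-id> is a placeholder for the _id Elasticsearch generated for the Australia boundary document you indexed into country-geoshapes; it returns the trip documents whose CurrentGeoLocation intersects the boundary shape, so documents not returned here are the ones the rule would alert on:

```
GET custom-trips/_search
{
  "query": {
    "geo_shape": {
      "CurrentGeoLocation": {
        "indexed_shape": {
          "index": "country-geoshapes",
          "id": "<geoshape-doc-id>",
          "path": "shape"
        },
        "relation": "intersects"
      }
    }
  }
}
```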
In a few minutes, you should see an alert as follows:
You should also see the corresponding alert document in your trip-alerts index:
The geo-containment alert should now be successfully configured. Now that we've looked at two different examples of alerts on Kibana, let's summarize the contents of this chapter.
In this chapter, we looked at how you can explore, analyze, and consume data on Elasticsearch using Kibana.
We started by learning how dashboards can be used to extract insights from large datasets. Then, we looked at how image-rich Canvas presentations, backed by live data, can be a powerful visualization tool. Next, we looked at how Kibana Maps can help when working with geospatial datasets. We finished by exploring the use of Kibana alerting and actions to respond to changes in datasets.
The next chapter explores the management and continuous onboarding of data using Elastic Agent and Fleet.