MicroStrategy 10 introduced a new connection to Splunk. Splunk is not very well known in the world of BI, and most people who have heard of it think that it is just a platform for processing logs. This is both true and false. The name Splunk is derived from spelunking, because searching for root causes in logs is like exploring a cave without light. Splunk solves this problem by indexing machine data from a tremendous number of data sources, such as applications, hardware, and sensors.
Splunk's goal is to make machine data accessible, usable, and valuable for everyone, and to turn machine data into business value. It can:
In the BI world, everyone knows what a data warehouse (DWH) is. The following screenshot compares the Splunk approach with the DWH approach:
For Splunk, the format of the data doesn't matter, because it applies a schema when the data is read (schema on read) rather than when it is loaded. This makes Splunk well suited to working with unstructured data. We can highlight the following use cases:
All these use cases have one thing in common: a large volume of unstructured data.
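To make the schema-on-read idea more concrete, here is a minimal Python sketch. It is not how Splunk is implemented; the sample log lines, the field names, and the `extract_fields` and `search` helpers are all hypothetical. The point is only that raw events are stored as plain text and fields are extracted when a query runs:

```python
import re

# Hypothetical raw events, stored exactly as they arrived (no schema at write time).
RAW_EVENTS = [
    "10.0.0.1 GET /cart action=purchase status=200",
    "10.0.0.2 GET /home action=view status=200",
    "10.0.0.1 GET /cart action=purchase status=503",
]

# key=value pairs are discovered at search time, not defined up front.
KV_PATTERN = re.compile(r"(\w+)=(\S+)")


def extract_fields(raw_event):
    """Apply a schema while reading: parse key=value pairs from one raw line."""
    return dict(KV_PATTERN.findall(raw_event))


def search(events, **criteria):
    """Return the parsed events whose extracted fields match all criteria."""
    results = []
    for raw in events:
        fields = extract_fields(raw)
        if all(fields.get(key) == value for key, value in criteria.items()):
            results.append(fields)
    return results


purchases = search(RAW_EVENTS, action="purchase")
failed = search(RAW_EVENTS, action="purchase", status="503")
```

Because nothing is parsed at load time, a new field that starts to appear in the logs becomes searchable without reloading or remodeling the data.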
Splunk consists of several elements:
Splunk can be scaled horizontally at every layer. At its core, Splunk relies on a MapReduce algorithm. There is a good document about it at the following URL:
https://www.splunk.com/web_assets/pdfs/secure/Splunk_and_MapReduce.pdf
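As a rough illustration of the MapReduce idea only (not Splunk's actual implementation), the following Python sketch counts events per status code. Each node maps over its own slice of events to produce partial counts, and a coordinator reduces the partial results into one answer. The data, the three-node split, and the function names are assumptions made for this example:

```python
from collections import Counter
from functools import reduce

# Hypothetical status codes, already split across three nodes.
NODE_SLICES = [
    ["200", "200", "404"],
    ["200", "503"],
    ["404", "200", "200"],
]


def map_phase(statuses):
    """Runs on each node: count the status codes in the local slice."""
    return Counter(statuses)


def reduce_phase(left, right):
    """Runs on the coordinator: merge two partial counts."""
    return left + right


# Each node works independently, so this step can run in parallel.
partial_counts = [map_phase(node_slice) for node_slice in NODE_SLICES]

# The coordinator only sees small partial results, never the raw events.
total_counts = reduce(reduce_phase, partial_counts, Counter())
```

Adding a node adds another slice and another partial result, which is why this pattern lends itself to horizontal scaling.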
Splunk complements traditional BI and DWH, as shown in the following diagram:
Usually, we use a DWH to analyze our transactional data from structured data sources, but there is also a lot of unstructured data that is valuable to us. Using Splunk, we can extract value from machine data and blend it with existing DWHs and business data. For example, suppose we run an online store. In the backend, we have an order processing system that feeds our DWH, so we know a lot about orders, prices, shipping, and so on. Using Splunk, we can see in the same way how our web servers, applications, and mobile apps are performing. If we see a drop in sales or an outage, we can simply drill down into the data and find the root cause. This is called Operational Intelligence.
Let's download and install Splunk in order to learn how we can use it as a data source for MicroStrategy:
Open http://localhost:8000/ and use the default credentials: admin/changeme.
As a result, Splunk has ingested the data. One of the good things about Splunk is that it compresses data by up to 40-50%. It is very good for license usage.
After indexing the data, we can start to search by clicking on Search Data. The following screenshot shows a search window with a default query against our new dataset:
Splunk allows us to write queries using the Splunk Search Processing Language (SPL), which is a very powerful language. In addition, we can build reports and dashboards in Splunk, which makes it a powerful analytics platform in its own right. Let's create reports in Splunk in order to use them as datasets (tables) in MicroStrategy:
We can create reports using SPL or we can just extract all fields:
index = "web" | table *
We prefer to build reports and then add them as data sources to the dashboard:
index = web | eval browser=useragent | replace *Firefox* with Firefox, *Chrome* with Chrome, *MSIE* with "Internet Explorer", *Version*Safari* with Safari, *Opera* with Opera in browser | top limit=5 useother=t browser
index = web | chart count AS views, count(eval(action="purchase")) AS purchases by categoryId | rename views as "Views", purchases AS "Purchases", categoryId AS "Category"
index = web action=purchase | transaction clientip maxspan=10m | chart count by duration span=log2
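For readers who are new to SPL, the second report above (views and purchases per category) can be read as a simple grouped aggregation. The Python sketch below mimics that logic on a few hypothetical events. The sample data and the `views_and_purchases` function are assumptions for illustration, not Splunk output, and the sketch assumes that events without a `categoryId` are left out of the result:

```python
from collections import defaultdict

# Hypothetical web events with the fields used by the report.
EVENTS = [
    {"categoryId": "GAMES", "action": "view"},
    {"categoryId": "GAMES", "action": "purchase"},
    {"categoryId": "BOOKS", "action": "view"},
    {"categoryId": "GAMES", "action": "addtocart"},
    {"action": "view"},  # no categoryId, skipped in this sketch
]


def views_and_purchases(events):
    """Group events by categoryId and count all events and purchase events."""
    rows = defaultdict(lambda: {"Views": 0, "Purchases": 0})
    for event in events:
        category = event.get("categoryId")
        if category is None:
            continue
        row = rows[category]
        # Mirrors `count AS views`: every event in the group is counted.
        row["Views"] += 1
        # Mirrors `count(eval(action="purchase")) AS purchases`.
        if event.get("action") == "purchase":
            row["Purchases"] += 1
    return dict(rows)


report = views_and_purchases(EVENTS)
```

Each key of `report` corresponds to one row of the report, with the renamed columns "Views" and "Purchases".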
When we create a report, Splunk asks us to set its permissions:
We should give permissions to everyone so that MicroStrategy doesn't have any problems connecting to Splunk.
MicroStrategy uses the Splunk ODBC driver to connect to Splunk. Let's download and install it:
Download and install the SplunkODBC64 driver. During installation, we can enter the connection details for the ODBC driver:
Now we are ready to build reports using MicroStrategy Desktop and Splunk. Let's do it:
Now we can build a dashboard using data from Splunk by dragging and dropping attributes and metrics: