Reducing time to insight

T. Carnahan    Microsoft, Redmond, WA, United States

Abstract

Insight production, preferably with data (I jest), is a big part of my role at Microsoft. And if you’re reading this chapter, I’ll assert you are also an Insight Producer, or well on your way to becoming one.

Since insight is the product, let’s start with a simple definition. The classical definition of insight is new understanding or human comprehension. That works well for the result of a science experiment and answering a research question. Industry, with its primordial goal of growing results, sharpens my definition to new actionable comprehension for improved results. This constrains the questions worth asking to subjects and behaviors known to contribute business value.

Keywords

Insight; Time to insight (TTI); Insight value chain; Key performance indicator (KPI)

What is Insight Anyway?

Insight production, preferably with data (I jest), is a big part of my role at Microsoft. And if you’re reading this chapter, I’ll assert you are also an Insight Producer, or well on your way to becoming one.

Since insight is the product, let’s start with a simple definition. The classical definition of insight is new understanding or human comprehension. That works well for the result of a science experiment and answering a research question. Industry, with its primordial goal of growing results, sharpens my definition to new actionable comprehension for improved results. This constrains the questions worth asking to subjects and behaviors known to contribute business value.

Let’s look at a more concrete example. As a purveyor and student of software engineering insights, I know large software organizations value code velocity (sometimes called speed or productivity), with quality as a control. Organizations also value engineer satisfaction, if you’re lucky. I seek to understand existing behaviors and their contribution to those areas. I am continuously hunting for new or different behaviors that can dramatically improve those results.

Time to Insight

There is immeasurable distance between late and too late.

Og Mandino (American Essayist and Psychologist)

An insight’s value can decay rapidly, and eventually it expires altogether as unactionable. History is full of war stories where intelligence arrived in time to change the outcome. Stock markets quickly sway and even crash based on insights. But an alert that my service is down, arriving an hour after a phone call from my best customer’s lawyer to handle damages from the outage, is both too late and uninsightful! The key point here is that speeding up data processing and information production creates new opportunities for insights that were previously blocked, by leaving time to take action on the new understanding. This also adds a competitive edge in almost any human endeavor.

Let’s start with a simple definition of time to insight (TTI): the elapsed calendar time between a question being asked and the answer being understood well enough for action to be taken.

This seems pretty straightforward until you start peeling it apart. There are a couple of big variables left:

1. Who is the customer? Who can ask the questions? For someone in Insight Production, this is often the person cutting the checks, and those responsible for making decisions and taking action. This matters immensely because they will need to be able to interface with you effectively to ask the questions, and to turn the answers into insights and take appropriate action. If they don’t know how to read a dendrogram, I recommend teaching them or giving them an alternative visualization.

2. Which questions will deliver the most lifetime value? With constrained resources, it’s worth prioritizing where you invest. A simple proxy I frequently use for this is how many times the question will be asked (see the sketch after this list). Generally, if you optimize for questions that have some value and are asked the most often, the results are good. It is more difficult to optimize for the highest-value questions, which often only need to be answered once.
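To make that proxy concrete, here is a minimal sketch in Python, assuming a hypothetical log of the questions asked of your insight platform; it simply tallies which questions recur most often, and those are the first candidates for automation.

from collections import Counter

# Hypothetical log of questions asked of the insight platform (e.g., harvested
# from a query log or a request tracker). The entries here are made up.
question_log = [
    "What is our pull request cycle time this week?",
    "How many engineers were active in the repo yesterday?",
    "What is our pull request cycle time this week?",
    "Which teams have the longest build times?",
    "What is our pull request cycle time this week?",
]

# Count how often each question is asked; the most frequent ones are the
# best candidates for investment in automation and a lower TTI.
for question, count in Counter(question_log).most_common(3):
    print(f"{count}x  {question}")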

A great TTI experience to showcase is searching the internet with Google or Bing. Much of the world’s population has been trained to ask their public domain questions in the little search box using keywords and semi-natural language. And they all expect a fast, subsecond response! They may not get exactly the answer or insight they were looking for on their first try. But the cost of more attempts is so low that curiosity overcomes the friction, and the result is an interactive question-and-answer session that often ends with some new or new-again knowledge (how many of you also forget the name of the actor in that movie and look it up on your phone?). If only my search engine were omniscient and I could ask it anything! An additional advantage of this search model is the feedback loop of what questions users are asking. The volumes and trends of what questions the world is asking turn out to be very valuable.

TTI is gaining traction, with Forbes and others [1,2] promoting it as the leading key performance indicator (KPI) for insight production and analytics teams. And I agree! In fact, I devote at least 50% of the insight production team I run at Microsoft to reducing TTI. These engineers are experts at instrumenting, collecting, modeling, curating, indexing, and storing data for fast query, in the formats, schemas, indexes, and query engines that are the most familiar and accessible to those asking the questions. These can include a mixture of natural language and full-text search, CSV, Excel, SQL, OLAP cubes, and even visualizations in dashboards and scorecards. The goal is to deliver the customer’s desired insight quickly, and in the most effective medium.

The Insight Value Chain

For industry and academia alike, the capability to produce insights fast is often a differentiating competitive advantage. One pro tip is to think of insight production as a value chain [3,4]. Identify and evaluate the primary activities between a question asked and an insightful response given. These often include, but are not limited to: collecting data, analyzing, transforming, storing, querying, and visualizing. And sometimes a person is also needed to help interpret the data for the customer before they have the ideal actionable insight.

[Figure: The insight value chain, spanning the primary activities from data collection through visualization.]

Where is the time spent? Is data collection still manual? Perhaps there are expensive computations and transformations that add hours of lag. Evaluate each activity and identify the best systemic ways to reduce TTI. In other words, perform a value chain analysis [5].
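As a minimal sketch of such an analysis, the Python below times each primary activity in a toy pipeline; the stage functions are hypothetical stand-ins (simulated with sleeps) for your real collection, transformation, storage, query, and visualization code.

import time

# Hypothetical stand-ins for the primary activities; in practice these would
# call your real pipeline code. The sleeps only simulate elapsed time.
def collect():   time.sleep(0.20)  # pull raw telemetry
def transform(): time.sleep(0.50)  # clean, join, aggregate
def store():     time.sleep(0.10)  # load into a query-friendly store
def query():     time.sleep(0.05)  # answer the customer's question
def visualize(): time.sleep(0.02)  # render a KPI or chart

# Time each activity in the value chain to see where TTI accumulates.
timings = {}
for stage in (collect, transform, store, query, visualize):
    start = time.perf_counter()
    stage()
    timings[stage.__name__] = time.perf_counter() - start

total = sum(timings.values())
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<10} {seconds:5.2f}s  ({seconds / total:4.0%} of total)")

The activity with the largest share of elapsed time is where a value chain analysis would focus first.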

What to Do

Here are a few tips from the trenches of insight production. This first set of tips helps you start with the end, or moment of insight, in mind. This is often a specific visualization, report, or scorecard with KPIs. It could also be a few fast ad-hoc queries to a well-modeled store to discover and validate the next big insight. But the result is valuable new understanding that is delivered in time to take that next positive action.

• Visualization matters! How long do people need to stare at the screen to have a “eureka” moment? A good example from industry is the simple key performance indicator, or KPI. There is little to no cognitive friction in a well-designed KPI: it’s red or green and a simple number (see the sketch after this list). Learning to read candlestick charts takes a bit longer, but they can still be the simplest way to communicate more complicated distributions and relationships.

• Assist human comprehension with the simplest models and easy access to experts and docs. The simplest model to explain typically yields the fastest comprehension. Simple statistical models are sometimes preferred over neural networks for this reason. And an insight may well require a subject matter expert to interpret and explain it, in addition to the visualization.

• The last mile on fast insights. Sometimes it is critical to get humans reacting to new information immediately. Consider firemen at the fire station responding to a call, with alarms blaring and lights flashing. Sometimes a web page isn’t going to reach recipients fast enough, and you need to push the insight via information radiators, pagers, text messages, or even alarm bells to ensure the information is received with enough time to react.
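As promised above, here is a minimal sketch in Python of that kind of low-friction KPI; the metric name, value, and target are hypothetical.

# A well-designed KPI is a simple number with a red or green status.
def kpi_status(value: float, target: float) -> str:
    return "green" if value >= target else "red"

# Hypothetical metric: weekly deployment success rate against a target.
value, target = 0.97, 0.99
print(f"Deployment success rate: {value:.0%} [{kpi_status(value, target)}]")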

Here are a few more technical tips.

• Automate everything! Data collection, cleansing, and curation. This might include saving or caching intermediate results that take too long to recompute (see the sketch after this list). No manual step or computation is too small to automate when you need to scale.

• Use the right data store or index. Know when to use in-memory caches, relational stores, multidimensional stores, text search stores, graph stores, stream processing, or MapReduce systems. Each is best for certain types of questions at different scales. At smaller scales, you might get away with one store. At cloud scales, you start specializing to get the incremental load and query latencies down to what you need for certain classes of questions. Fight the hype that one new store will rule them all.

• Push compute and quality upstream. Connect the people who understand the insight desired with the people instrumenting the process. Well-designed instrumentation and collection, coupled with standard or thoughtful data formats and schemas, often yield data more fit for insights and require fewer transforms, joins, cleansing passes, and sampling throughout the pipeline. That produces speed and lower costs! Also, fewer distinct collection systems tend to reduce the join complications later by coalescing data formats, schemas, and stores earlier.

• Shoot to deliver self-service [6,7]. Self-service means the person with the questions has access to all layers of data (raw to curated) and unfettered access to the instrumentation and transformation code. This builds transparency and trust in the insight, and it also helps the smart humans who can help themselves. Of course, do this within the safety of your data policies.
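And as a minimal sketch of caching an intermediate result, referenced in the automation tip above, the Python below computes an expensive rollup once and reuses the saved copy for later questions; the file name, metrics, and the stand-in computation are all hypothetical.

import json
import time
from pathlib import Path

# Hypothetical cache file for an expensive daily rollup.
CACHE_PATH = Path("daily_velocity_rollup.json")

def expensive_rollup() -> dict:
    # Stand-in for a transformation that takes too long to recompute on demand.
    time.sleep(2)  # pretend this is an hours-long aggregation
    return {"prs_merged": 1234, "median_cycle_time_hours": 18.5}

def load_rollup() -> dict:
    # Reuse the cached intermediate result if it exists; otherwise compute it
    # once and save it for every later question that needs it.
    if CACHE_PATH.exists():
        return json.loads(CACHE_PATH.read_text())
    result = expensive_rollup()
    CACHE_PATH.write_text(json.dumps(result))
    return result

print(load_rollup())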

A Warning on Waste

Simply put, large data platforms and analytical systems are usually expensive, in many dimensions: calendar time, hardware, software, and the data experts needed to build and run them. I have witnessed numerous projects invest too much, too early, to answer an interesting question exactly once. The advice I give is to always answer the first few questions, usually generated by a goal, question, metric (GQM) [8] exercise, manually and as cheaply as possible. Then let usage, or the questions that keep getting asked repeatedly, guide and justify funding investments to automate and lower TTI on the questions your audience actually has and will act upon.
