Chapter 11. Data Visualization Strategy

What is the difference between graphic design and data visualization? What distinguishes our actions when we design a website from when we design an executive dashboard? What separates somebody who creates a meaningful icon from another who creates an insightful bar chart?

While both graphic design and data visualization aim to create effective visual communication, data visualization is principally concerned with data analysis. Even though those of us who design dashboards and charts are motivated to create something aesthetically pleasing, we are more passionate about what the data can tell us about our world. This desire to explore our universe via data and then communicate our discoveries is the reason that we dedicate our time to learning how best to visualize it.

We start our journey by defining a series of strategies to create and share knowledge using data visualization. In parallel, we propose how we can effectively organize ourselves, our projects, and the applications we develop so that our whole business starts to use insightful visual analysis as quickly as possible. As we survey our data visualization strategy as a whole, we also review how we are going to implement it using what is arguably the best data exploration and discovery tool: QlikView.

Let's take a look at the following topics in this chapter:

  • Data exploration, visualization, and discovery
  • Data teams and roles
  • Agile development
  • QlikView Deployment Framework

Data exploration, visualization, and discovery

Data visualization is not something that is done at the end of a long, costly Business Intelligence (BI) project. It is not the cute dashboard that we create to justify the investment in a new data warehouse and several Online Analytical Processing (OLAP) cubes. Data visualization is an integral part of a data exploration process that begins on the first day that we start extracting raw data.

Anscombe's quartet highlights the importance and effectiveness of data visualization when we are exploring data. Each of the following scatterplots analyzes the correlation between two variables. Correlation can also be summarized numerically by means of the correlation coefficient, r. If we were to summarize each of the following scatterplots using r, we would discover that the number is the same for each one, approximately 0.816 (which corresponds to an R-squared of roughly 0.67). It is only by visualizing the data in a two-dimensional space that we notice how differently each relationship behaves:

[Figure: the four scatterplots of Anscombe's quartet]
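To make the numeric side of this comparison concrete, here is a minimal sketch that computes the correlation coefficient for each of the four datasets. The values are those published in Anscombe's 1973 paper; the function name `pearson_r` and the dictionary names are our own choices for illustration, and Python is used only as a neutral calculator here, not as part of the QlikView workflow.

```python
import math

# Anscombe's quartet (Anscombe, 1973). Datasets I-III share the same x values.
x_shared = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x_shared,
          [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x_shared,
           [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x_shared,
            [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# One correlation coefficient per dataset, keyed by dataset name.
correlations = {name: pearson_r(xs, ys) for name, (xs, ys) in quartet.items()}

for name, r in correlations.items():
    print(f"Dataset {name}: r = {r:.4f}, R-squared = {r * r:.4f}")
```

All four datasets produce nearly the same r and R-squared, so the summary number alone cannot tell them apart. Only a plot reveals that one relationship is roughly linear, one is curved, and two are dominated by a single outlying point.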

Some tools make it cumbersome to visualize data as soon as it is extracted. Most traditional BI solutions have separate tools for each phase of their implementation process. They have one tool that extracts data, another that creates the OLAP cubes, and yet another that constructs visualizations.

QlikView allows us to extract, transform, model, and visualize data within a single tool. Since we can visualize data from the moment it is extracted and throughout the rest of the extraction, transformation, and load (ETL) process, we are more likely to discover data anomalies at an earlier stage in the development process. We can also share our discoveries more quickly with business users, and they in turn can give us important feedback before we invest too much time developing analytical applications that don't provide them with real value. Although QlikView is considered BI software, it stands out among its peers due to its extraordinary ability to explore, visualize, and discover data.

In contrast, the implementation of a traditional BI tool first focuses on organizing data into data warehouses and cubes that are based on business requirements created at the beginning of the project. Once we organize the data and distribute the first reports defined by the business requirements, we start, for the first time, to explore the data using data visualization. However, the first time business users see their new reports, the most important discovery that they make is that we've spent a great amount of time and resources developing something that doesn't fulfill their real requirements.

We can blame the business user or the business requirements process for this failure, but nobody can know exactly what they need if they have nothing tangible to start from. In a data discovery tool like QlikView, we can easily create prototypes, or what we later explain as Minimally Viable Products (MVPs), to allow business users to visualize the data within a matter of days. They use the MVP to better describe their needs, discover data inadequacies, and, among other things, confirm the business value of the analysis with their executive sponsors. Only after making and sharing these first discoveries do we invest more of our resources in iteratively building a more mature data analysis and visualization.

We've established a general data visualization strategy to support our data exploration and discovery. Now, let's review the strategies that we assign to the teams who are tasked with not only exploring the data directly, but also making sure everyone else in the business can perform their own data exploration.
