Data Model Capabilities of Excel 2013
In this chapter, I’ll talk about one of the most power capabilities introduced in Excel: the data model.
Let me ask you this: what is the biggest pain faced by dashboard developers? No, it’s not running out of caffeine because the office espresso machine broke. Indeed, among other things, this book assumes you have ample access to caffeine. What, then, is the biggest pain? The answer doesn’t have to do with the nature of your data (although unclean and unkempt data is a pain in its own right). Even with caffeine and good data, another problem arises. That data is all over the place: in different tables, worksheets, or workbooks.
Disconnected data is an age-old problem. Ever since cavemen figured how to draw glyphs on inner walls, people have been asking questions like, “Is this the same symbol drawn on that cave 75 clicks south?” Those were the days before VLOOKUP, so our ancestors ran through forests in the heat to connect those symbols. Those were among the first data connections we humans created.
And yet we might be no better today either. We’ve replaced cave walls with databases, tables, workbooks, comma-separated value (CSV) files, and web sites. Instead of running through forests in the hot sun, we’ve employed a knotted web of VLOOKUPs and other crazy functions to ensure the data is connected. But when our boss asks, “Can you make a report combining sales data in SAP with customer data in our CRM and production data in ERP?” we might as well be running across that prehistoric forest for all the sweat, anxiety, and work required to fulfill this request using lookup functions and formulas.
If you’ve made it this far in the book, then you should know there is another class of problems dashboard developers must deal with. While the previous parts of the book dealt with creating interactive applications, there are applications that must synthesize datasets across a much broader expanse than has been presented in this book thus far. Even before you can create these dashboards, you must bring together various data sources—and those sources could come from many different systems. For these types of problems, VLOOKUP won’t cut it. More to the point, employing many, many lookups could require your spreadsheet to become slow and unruly. Remember, part of good development is avoiding to the extent possible the option to turn off automatic calculation.
Luckily, Microsoft has saved you from slow spreadsheets—and has given you back your Friday afternoons and weekends previously spent connecting disparate data sources by way of VLOOKUP.
Relationship Advice
Don’t worry. I am not turning this book in to relationship advice column. In this section, I’ll talk about data relationships and how they are important to the data model.
Let’s say you have data for sales, customer, and production. With the old method, you would have to use lookup functions to create meaningful connections between these datasets. But what if you could somehow connect everything without any formulas? Then you don’t have to write a heap of VLOOKUP, INDEX, and MATCH formulas. You could figure out questions like, “How many customers in the age group 25 to 40 bought wheat thins that were made in our factory in Nevada in July 2014?” without any formulas whatsoever.
This is where you use the data model feature of Excel 2013. This powerful feature helps you connect various datasets in Excel. Indeed, Excel has become even more powerful. Not only can you wrangle disparate data, but you can avoid the twisted lookup web entirely. In the next sections of this chapter, you will learn how to use the Excel 2013 data model feature.
Preparing Data for Relationships
In this section, I will explain the sample data and the setup process needed for using Excel’s data model capabilities.
Note For this chapter, you’ll be using Chapter21-2013Model.xlsx, which you can find in the source files for this book.
Imagine you are the sales manager at ACME Store and you are looking at sales transaction data for the month of December 2014. It looks like the data shown in Figure 21-1 .
Figure 21-1. Sales data
You have several questions.
Unfortunately, none of these questions can be answered with the data in Figure 21-1 alone. For example, looking at the first sales record, you can see that customer C0001 purchased one unit of product P025. But you don’t know any other details about C0001 or P025. You need more data. So, you make a few calls and connect to some additional data sources, and now you are looking at two more datasets, as shown in Figures 21-2 and 21-3 .
Figure 21-2. Customer data
Figure 21-3. Product data
Now that you know what C0001 and P025 mean, you can figure out the answers for your questions—but only after writing a gazillion VLOOKUPs. There must be a better way to set up the relationships between these different datasets. Enter the data model for Excel 2013.
Setting Up the Relationships
Now that you have all the data in one place, let’s go ahead and set up the relationships between them. This process is called data modeling.
Before anything else, you need to ensure the data is in the Excel table format. If your data is not encased within a table, you can follow these steps:
Figure 21-4. The Create Table dialog box
Figure 21-5. Defining the table name on the Design tab’s Properties panel
Once all the different datasets are in tables, you can connect them with each other using the process outlined in the next few steps:
Figure 21-6. The Relationships option on the Data tab
Figure 21-7. Manage Relationships dialog: Click New to add a new relationship
Figure 21-8. The Create Relationship dialog box
Figure 21-9. The relationships we’ve just created
At this point, nothing noticeable has taken place on your spreadsheet—at least, not anything visually different in your workbook. This is so unlike real life where new relationships almost always change our appearance, behavior, bank balance, and sleeping patterns.
But something must change, right? Well, your Excel workbook changes. Excel 2013 builds a data model once you create the relationships. This is a capability added to your workbook in the background. You can’t yet see the data model, but it is available in the workbook and waiting to be utilized to answer complex questions with ease. So, let’s deploy it, shall we?
Working with Relationships in the Excel Data Model
So far you have added some data and created relationships between them. In the next part of this chapter, let’s use these relationships in pivot tables so that you can create a quick, insightful, and interactive dashboard.
Exposing Relationships with Pivot Tables
Now that your relationships are ready, let’s use them. Select a cell in any of the tables and insert a pivot table from the Insert tab (see Figure 21-10). Once added, the Create PivotTable dialog box will appear (see Figure 21-11).
Figure 21-10. Inserting a pivot table
Figure 21-11. The Create PivotTable dialog box
Make sure you have selected the “Add this data to the Data Model” check box at the bottom before clicking OK (see Figure 21-11). This is a key step in the entire process.
Now, let’s take a look in the new pivot table worksheet; look at the field list area. Do you see the difference? In Figure 21-12, you can create a pivot table by synthesizing data from multiple tables. This is the result of the tables you added to your data model.
Figure 21-12. Pivot table field list showing multiple tables
Let’s savor the goodness of multiple tables in a pivot table.
Your First Multitable Pivot
Let’s answer the question, “What is the sales quantity breakdown by region of customer?” Select the Area check box in the Customers table and the Quantity check box in the sales table. And, bingo, our answer is ready. Take a look at Figure 21-13.
Figure 21-13. Example pivot table from multiple tables
This may look like a simple report, but the mechanism that drives it is truly powerful. Excel combined the data from the Customers and sales tables using the relationship defined just a while ago and derived the regional breakdown for you, all without writing a single VLOOKUP formula.
Let’s enjoy the spoils once more. This time, let’s answer the question, “How many large products are we selling to self-employed people?” Add Customers[Profession] to the row label area, products[Size] to the column label area, and sales[Quantity] to the values area. Your answer is waiting for you in Figure 21-14.
Figure 21-14. Understanding how multiple table pivot reports work
Not only do you get the total for “self-employed large product units,” but you get totals for every other possible combination of profession and product size. As you can see, once the relationships are in place, you can use them as per your imagination and get answers to most questions.
Slicing and Dicing with Relationships
Excel 2013’s data model and relationships work beautifully with filtering features such as slicers (a feature introduced in Excel 2010) and timelines (a feature added in Excel 2013). I’ll go through how to use each mechanism in this section.
Adding Slicers to the Pivot Report
Now that you have tied all the tables together, you can create a pivot table from two tables and “slice” it on the third table. For example, let’s find out the quantity breakdown by gender and profession for only the Jelly category of products.
To do this, you need to filter the pivot table by category = Jelly. This can be accomplished by the Report Filters function of the old Excel or the more exciting slicers feature of the new Excel (slicers are supported in Excel 2010 and newer).
To create a slicer on the product category, locate the products table in the PivotTable Fields area, right-click the Category field, and choose Add as Slicer (refer to Figure 21-15).
Figure 21-15. Adding a slicer on product category field using PivotTable Fields screen
Once you have a slicer, just click Jelly to show the pivot report for only the Jelly category. Figure 21-16 shows the results of this.
Figure 21-16. Quantity breakdown by gender and profession for the Jelly category of products
Note For more information about how to create, use, and customize slicers, refer to http://chandoo.org/wp/2015/06/24/introduction-to-slicers/.
One More Report, Using Timelines
Figure 21-17 shows another example report employing timelines. In it you can provide a quantity breakdown by gender, region, and product category for December 4 to 12, 2014.
Figure 21-17. Timelines and pivot reports with multiple tables
To insert a timeline, right-click the Time field in the sales table and choose Add as Timeline.
The relationships feature works seamlessly with multiple slicers, timelines, and pivot table report filters. Try them and see how easily you can answer even the most complex reporting questions. Please refer to Chapter 22 for an example dashboard constructed using multiple slicers.
As you might have guessed, relationships are at the heart of data models. They’re the connections that allow you to bring datasets together. Like all relationships, there are certain rules to follow.
Let’s understand them:
The Last Word
The relationship capability in Excel 2013 opens doors for many possibilities. There are many things you can do with a workbook-level data model. You can create really powerful reports and dashboards using relationships, slicers, pivot tables, and charts. In the next chapter, you will examine an interactive dashboard built using these features.
In the future versions of Excel, Microsoft is likely to build more powerful features into the data model and relationships area. We will just have to wait and see.