CHAPTER 21

image

Data Model Capabilities of Excel 2013

In this chapter, I’ll talk about one of the most power capabilities introduced in Excel: the data model.

Let me ask you this: what is the biggest pain faced by dashboard developers? No, it’s not running out of caffeine because the office espresso machine broke. Indeed, among other things, this book assumes you have ample access to caffeine. What, then, is the biggest pain? The answer doesn’t have to do with the nature of your data (although unclean and unkempt data is a pain in its own right). Even with caffeine and good data, another problem arises. That data is all over the place: in different tables, worksheets, or workbooks.

Disconnected data is an age-old problem. Ever since cavemen figured how to draw glyphs on inner walls, people have been asking questions like, “Is this the same symbol drawn on that cave 75 clicks south?” Those were the days before VLOOKUP, so our ancestors ran through forests in the heat to connect those symbols. Those were among the first data connections we humans created.

And yet we might be no better today either. We’ve replaced cave walls with databases, tables, workbooks, comma-separated value (CSV) files, and web sites. Instead of running through forests in the hot sun, we’ve employed a knotted web of VLOOKUPs and other crazy functions to ensure the data is connected. But when our boss asks, “Can you make a report combining sales data in SAP with customer data in our CRM and production data in ERP?” we might as well be running across that prehistoric forest for all the sweat, anxiety, and work required to fulfill this request using lookup functions and formulas.

If you’ve made it this far in the book, then you should know there is another class of problems dashboard developers must deal with. While the previous parts of the book dealt with creating interactive applications, there are applications that must synthesize datasets across a much broader expanse than has been presented in this book thus far. Even before you can create these dashboards, you must bring together various data sources—and those sources could come from many different systems. For these types of problems, VLOOKUP won’t cut it. More to the point, employing many, many lookups could require your spreadsheet to become slow and unruly. Remember, part of good development is avoiding to the extent possible the option to turn off automatic calculation.

Luckily, Microsoft has saved you from slow spreadsheets—and has given you back your Friday afternoons and weekends previously spent connecting disparate data sources by way of VLOOKUP.

Relationship Advice

Don’t worry. I am not turning this book in to relationship advice column. In this section, I’ll talk about data relationships and how they are important to the data model.

Let’s say you have data for sales, customer, and production. With the old method, you would have to use lookup functions to create meaningful connections between these datasets. But what if you could somehow connect everything without any formulas? Then you don’t have to write a heap of VLOOKUP, INDEX, and MATCH formulas. You could figure out questions like, “How many customers in the age group 25 to 40 bought wheat thins that were made in our factory in Nevada in July 2014?” without any formulas whatsoever.

This is where you use the data model feature of Excel 2013. This powerful feature helps you connect various datasets in Excel. Indeed, Excel has become even more powerful. Not only can you wrangle disparate data, but you can avoid the twisted lookup web entirely. In the next sections of this chapter, you will learn how to use the Excel 2013 data model feature.

Preparing Data for Relationships

In this section, I will explain the sample data and the setup process needed for using Excel’s data model capabilities.

Image Note  For this chapter, you’ll be using Chapter21-2013Model.xlsx, which you can find in the source files for this book.

Imagine you are the sales manager at ACME Store and you are looking at sales transaction data for the month of December 2014. It looks like the data shown in Figure 21-1 .

9781430249443_Fig21-01.jpg

Figure 21-1. Sales data

You have several questions.

  • Who bought more? Male or female customers?
  • What is the total revenue? What is the breakdown by product category?
  • What is the sales breakdown by region of customer?
  • How many large products are we selling to self-employed people?

Unfortunately, none of these questions can be answered with the data in Figure 21-1 alone. For example, looking at the first sales record, you can see that customer C0001 purchased one unit of product P025. But you don’t know any other details about C0001 or P025. You need more data. So, you make a few calls and connect to some additional data sources, and now you are looking at two more datasets, as shown in Figures 21-2 and 21-3 .

9781430249443_Fig21-02.jpg

Figure 21-2. Customer data

9781430249443_Fig21-03.jpg

Figure 21-3. Product data

Now that you know what C0001 and P025 mean, you can figure out the answers for your questions—but only after writing a gazillion VLOOKUPs. There must be a better way to set up the relationships between these different datasets. Enter the data model for Excel 2013.

Setting Up the Relationships

Now that you have all the data in one place, let’s go ahead and set up the relationships between them. This process is called data modeling.

Before anything else, you need to ensure the data is in the Excel table format. If your data is not encased within a table, you can follow these steps:

  1. Select the data you want to turn into a table.
  2. Press Ctrl+T.
  3. Make sure the “My table has headers” check box is selected (see Figure 21-4).

    9781430249443_Fig21-04.jpg

    Figure 21-4. The Create Table dialog box

  4. Click OK.
  5. Give your table a name in the Design tab’s Properties group (Figure 21-5). And that’s it.

9781430249443_Fig21-05.jpg

Figure 21-5. Defining the table name on the Design tab’s Properties panel

Once all the different datasets are in tables, you can connect them with each other using the process outlined in the next few steps:

  1. Click Relationships on the Data tab (Figure 21-6).

    9781430249443_Fig21-06.jpg

    Figure 21-6. The Relationships option on the Data tab

  2. This opens the Manage Relationships dialog box. Notice how empty and inviting it is! Let's add a few relationships. Click the New button (Figure 21-7). Where you would use VLOOKUP to look up IDs before, you can now create connections between different columns in your tables reflecting these relationships.

    9781430249443_Fig21-07.jpg

    Figure 21-7. Manage Relationships dialog: Click New to add a new relationship

  3. It doesn’t matter in what order you list your relationships. Excel 2013 is smart enough to figure out the order of relationships (more on this later). So, just select both tables and columns on which the relationship should be defined. In Figure 21-8, you can see the relationship defined between the Customers[Customer ID] and sales[Customer ID] columns.

    9781430249443_Fig21-08.jpg

    Figure 21-8. The Create Relationship dialog box

  4. Let’s add another relationship, this time between the Products and Sales tables.
  5. Because the dataset is fairly simple, those are all the relationships you have. At this stage, the Manage Relationships dialog box looks like Figure 21-9.

9781430249443_Fig21-09.jpg

Figure 21-9. The relationships we’ve just created

At this point, nothing noticeable has taken place on your spreadsheet—at least, not anything visually different in your workbook. This is so unlike real life where new relationships almost always change our appearance, behavior, bank balance, and sleeping patterns.

But something must change, right? Well, your Excel workbook changes. Excel 2013 builds a data model once you create the relationships. This is a capability added to your workbook in the background. You can’t yet see the data model, but it is available in the workbook and waiting to be utilized to answer complex questions with ease. So, let’s deploy it, shall we?

Working with Relationships in the Excel Data Model

So far you have added some data and created relationships between them. In the next part of this chapter, let’s use these relationships in pivot tables so that you can create a quick, insightful, and interactive dashboard.

Exposing Relationships with Pivot Tables

Now that your relationships are ready, let’s use them. Select a cell in any of the tables and insert a pivot table from the Insert tab (see Figure 21-10). Once added, the Create PivotTable dialog box will appear (see Figure 21-11).

9781430249443_Fig21-10.jpg

Figure 21-10. Inserting a pivot table

9781430249443_Fig21-11.jpg

Figure 21-11. The Create PivotTable dialog box

Make sure you have selected the “Add this data to the Data Model” check box at the bottom before clicking OK (see Figure 21-11). This is a key step in the entire process.

Now, let’s take a look in the new pivot table worksheet; look at the field list area. Do you see the difference? In Figure 21-12, you can create a pivot table by synthesizing data from multiple tables. This is the result of the tables you added to your data model.

9781430249443_Fig21-12.jpg

Figure 21-12. Pivot table field list showing multiple tables

Let’s savor the goodness of multiple tables in a pivot table.

Your First Multitable Pivot

Let’s answer the question, “What is the sales quantity breakdown by region of customer?” Select the Area check box in the Customers table and the Quantity check box in the sales table. And, bingo, our answer is ready. Take a look at Figure 21-13.

9781430249443_Fig21-13.jpg

Figure 21-13. Example pivot table from multiple tables

This may look like a simple report, but the mechanism that drives it is truly powerful. Excel combined the data from the Customers and sales tables using the relationship defined just a while ago and derived the regional breakdown for you, all without writing a single VLOOKUP formula.

Let’s enjoy the spoils once more. This time, let’s answer the question, “How many large products are we selling to self-employed people?” Add Customers[Profession] to the row label area, products[Size] to the column label area, and sales[Quantity] to the values area. Your answer is waiting for you in Figure 21-14.

9781430249443_Fig21-14.jpg

Figure 21-14. Understanding how multiple table pivot reports work

Not only do you get the total for “self-employed large product units,” but you get totals for every other possible combination of profession and product size. As you can see, once the relationships are in place, you can use them as per your imagination and get answers to most questions.

Slicing and Dicing with Relationships

Excel 2013’s data model and relationships work beautifully with filtering features such as slicers (a feature introduced in Excel 2010) and timelines (a feature added in Excel 2013). I’ll go through how to use each mechanism in this section.

Adding Slicers to the Pivot Report

Now that you have tied all the tables together, you can create a pivot table from two tables and “slice” it on the third table. For example, let’s find out the quantity breakdown by gender and profession for only the Jelly category of products.

To do this, you need to filter the pivot table by category = Jelly. This can be accomplished by the Report Filters function of the old Excel or the more exciting slicers feature of the new Excel (slicers are supported in Excel 2010 and newer).

To create a slicer on the product category, locate the products table in the PivotTable Fields area, right-click the Category field, and choose Add as Slicer (refer to Figure 21-15).

9781430249443_Fig21-15.jpg

Figure 21-15. Adding a slicer on product category field using PivotTable Fields screen

Once you have a slicer, just click Jelly to show the pivot report for only the Jelly category. Figure 21-16 shows the results of this.

9781430249443_Fig21-16.jpg

Figure 21-16. Quantity breakdown by gender and profession for the Jelly category of products

Image Note  For more information about how to create, use, and customize slicers, refer to http://chandoo.org/wp/2015/06/24/introduction-to-slicers/.

One More Report, Using Timelines

Figure 21-17 shows another example report employing timelines. In it you can provide a quantity breakdown by gender, region, and product category for December 4 to 12, 2014.

9781430249443_Fig21-17.jpg

Figure 21-17. Timelines and pivot reports with multiple tables

To insert a timeline, right-click the Time field in the sales table and choose Add as Timeline.

The relationships feature works seamlessly with multiple slicers, timelines, and pivot table report filters. Try them and see how easily you can answer even the most complex reporting questions. Please refer to Chapter 22 for an example dashboard constructed using multiple slicers.

Rules for Happy Relationships

As you might have guessed, relationships are at the heart of data models. They’re the connections that allow you to bring datasets together. Like all relationships, there are certain rules to follow.

Let’s understand them:

  • One table must have unique values in the relationship column: When you are connecting two tables of data, say A and B on column X, one of the tables must not have duplicate values in that column. In the database world, such a column is referred to as the primary key. Although Excel is not a database, the relationships feature helps you loosely mimic a database in your spreadsheet. So, the standard relationship rule of having one primary key (that is, one column with unique values) applies.
  • No many-to-many relationships: This is a consequence of the previous rule. In the Excel 2013 data model, you can’t have many-to-many relationships. You can add one-to-one or one-to-many relationships, though.
  • All data must be in tables (or connections that are part of the data model): All the data that is part of the data model and relationships should be in Excel tables or connections (that are loading data to tables or the data model automatically). If your data is not in a table, just press Ctrl+T to insert a table around it. Always give the table a meaningful name for easier referencing. Don’t just use the default names such as Table1, Table2, and so on.
  • Pivot tables should be added to the data model: To access the power of relationships, your pivot table must be added to the data model at the time of creation.

The Last Word

The relationship capability in Excel 2013 opens doors for many possibilities. There are many things you can do with a workbook-level data model. You can create really powerful reports and dashboards using relationships, slicers, pivot tables, and charts. In the next chapter, you will examine an interactive dashboard built using these features.

In the future versions of Excel, Microsoft is likely to build more powerful features into the data model and relationships area. We will just have to wait and see.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset