When analyzing data with Power Pivot, you'll often find that you need to expand your analysis to include data based on calculations that are not in your original data set. Power Pivot has a robust set of functions called data analysis expressions, or DAX functions, which allow you to perform mathematical operations, recursive calculations, data lookups, and much more.
This chapter introduces you to DAX functions, and it provides the ground rules for building your own calculations in Power Pivot data models.
Most of the examples in this chapter are available on this book's website at |
Calculated columns are columns that you create to enhance a Power Pivot table with your own formulas. Calculated columns are entered directly in the Power Pivot window, becoming part of the source data you use to feed your PivotTable. Calculated columns work at the row level; that is, the formulas that you create in a calculated column perform their operations based on the data in each individual row. For example, imagine you have a Revenue column and a Cost column in your Power Pivot table. You could create a new column that calculated [Revenue]
minus [Cost]
. This calculation is simple and is valid for each row in the data set.
Calculated measures are used to perform more complex calculations that work on an aggregation of data. These calculations are applied directly to your PivotTable, creating a sort of virtual column that can't be seen in the Power Pivot window. Calculated measures are needed when you need to calculate based on an aggregated grouping of rows, such as the sum of [Year2]
minus the sum of [Year1]
.
Creating a calculated column works very much like building formulas in an Excel table. Let's continue the walk-through with some sample data; follow these steps to create a calculated column:
=[UnitPrice]*[Quantity]
Each calculated column that you create will automatically be available in any PivotTable connected to the Power Pivot data model. You don't have to take any action to get your calculated columns into the PivotTable. Figure 37.2 illustrates the Total Revenue calculated column in the PivotTable Fields list. These calculated columns can be used just as you would any other field in your PivotTable.
See Chapter 35, “Introducing Power Pivot,” for a refresher on how to create a PivotTable from Power Pivot. |
You'll often need to change the formatting of your Power Pivot columns to match the data within them appropriately. For example, you might want to show numbers as currency, remove decimal places, or display dates in a certain way.
You are by no means limited to formatting just calculated columns. These steps can be used to format any column that you see in your Power Pivot window:
Like all calculations in Excel, Power Pivot allows you to reference a calculated column as a variable in another calculated column. Figure 37.4 illustrates this with a new calculated column called Gross Margin. Notice in the Formula bar that the calculation uses our previously created [Total Revenue] calculated column.
Because calculated columns can reference each other, you can imagine creating columns simply as helper columns for other calculations. You may not want your end users to see these columns in your client tools. In this context, the term client tools refers to PivotTables, Power View dashboards, and Power Map.
Similar to hiding columns on an Excel worksheet, Power Pivot allows you to hide any column (it doesn't have to be a calculated column). To hide columns, simply select the columns you want hidden, right-click the selection, and then choose the Hide from Client Tools option (illustrated in Figure 37.5).
When a column is hidden, it will not show as an available selection in your PivotTable Fields list. However, if the column you are hiding is already part of the PivotTable report, meaning that you already dragged it onto the PivotTable, hiding the column in the Power Pivot window will not automatically remove it from the report. Hiding merely affects the ability to see the column in the PivotTable Fields list.
You'll note in Figure 37.6 that Power Pivot recolors columns based on their attributes. Hidden columns are subdued and grayed out, while calculated columns that are not hidden have a darker (black) header.
Data Analysis Expression (DAX) is essentially the formula language Power Pivot uses to perform calculations within its own construct of tables and columns. The DAX formula language comes with its own set of functions. Some of these functions can be used in calculated columns for row-level calculations, while others are designed to be used in calculated measures for aggregate operations.
In this section, we'll touch on some of the DAX functions that can be leveraged in calculated columns.
In the previous section, you used the Formula bar within the Power Pivot window to enter calculations. Next to that Formula bar, you may have noticed the Insert Function button labeled fx. This is similar to the Insert Function button found in Excel. Clicking this button will activate the Insert Function dialog box shown in Figure 37.7. This dialog box allows you to browse, search for, and insert the available DAX functions.
As you look through the list of DAX functions, you'll notice that many of them look like the Excel functions with which you are already familiar. Make no mistake, however; these are not Excel functions. Where Excel functions work with cells and ranges, these DAX functions are designed to work at the table and column levels.
To understand what that means, start a new calculated column on the Invoice Details tab. Click the Formula bar, and enter a good old SUM
function: =SUM([Gross Margin])
. Figure 37.8 shows the result.
As you can see, the SUM
function sums the entire column. This is because Power Pivot and DAX are designed to work with tables and columns. Power Pivot has no construct for cells and ranges. It doesn't even have column letters on its grid. Where you would normally reference a range, as in an Excel SUM
function, DAX basically takes the entire column.
The bottom line is not all DAX functions can be used with calculated columns. Because a calculated column evaluates at the row level, only DAX functions that evaluate single data points can be used in a calculated column.
A good rule of thumb is if the function requires an array or a range of cells as an argument, then it not viable in a calculated column. Therefore, functions such as SUM
, MIN
, MAX
, AVERAGE
, and COUNT
don't work in calculated columns. Functions that require only single data point arguments work quite well in calculated columns, such as YEAR
, MONTH
, MID
, LEFT
, RIGHT
, IF
, and IFERROR
.
To demonstrate the usefulness of employing a DAX function to enhance calculated columns, let's return to our walk-through example. Go to the Power Pivot window, and click the InvoiceHeader tab. If you accidentally closed the Power Pivot window, you can activate it by clicking the Manage button on the Home tab of the Power Pivot Ribbon.
The InvoiceHeader table, shown in Figure 37.9, contains an InvoiceDate column. Although this column is valuable in the raw table, the individual dates aren't convenient when analyzing the data with a PivotTable. It would be beneficial to have a column for Month and a column for Year. This way, you could aggregate and analyze your data by month and year.
For this endeavor, you use the YEAR()
, MONTH()
, and FORMAT()
DAX functions to add some time dimensions to your data model. Follow these steps:
=YEAR([InvoiceDate])
, and then press Enter.Year
.=MONTH([InvoiceDate])
and then press Enter.Month
.=FORMAT([InvoiceDate],"mmm")
and then press Enter.Month Name
.After going through these steps, you should have three new calculated columns similar to the ones shown in Figure 37.10.
As mentioned, creating calculated columns will automatically make them available through your PivotTable Fields list (see Figure 37.11).
One of the more annoying things about Power Pivot is that it doesn't inherently know how to sort months. Unlike standard Excel, Power Pivot doesn't use the built-in custom lists that define the order of month names. When you create a calculated column like [Month Name]
and place it into your PivotTable, Power Pivot will put those months in alphabetical order (see Figure 37.12).
The fix for this is fairly easy. Activate the Power Pivot window and select the Home tab. There you will find the Sort by Column button. Upon clicking the button, the Sort by Column dialog box shown in Figure 37.13 opens.
The idea is to select the column you want sorted and then select the column by which you want to sort. In this scenario, we want to sort Month Name by month.
Once you confirm the change, it will initially appear as though nothing happened. This is because the sort order that you defined is not for the Power Pivot window. The sort order is applied to your PivotTable. You can flip over to Excel and see the result in the PivotTable (see Figure 37.14).
Sometimes the operation you are trying to perform with a calculated column requires you to utilize fields from other tables within the Power Pivot data model. For instance, you may need to account for a customer-specific discount amount from the Customers table (see Figure 37.15) when creating a calculated column in the InvoiceDetails table.
To accomplish this, you can use a DAX function called RELATED
. Similar to VLOOKUP
in standard Excel, the RELATED
function allows you to look up values from one table to use them in another.
Take a moment to follow these steps to create a new calculated column that displays a discounted revenue for each transaction in the InvoiceDetails table:
=RELATED(
As soon as you enter the open parenthesis, a menu of available fields (shown in Figure 37.16) will be displayed. Note that the items in the list represent the table name followed by the field name in brackets. In this case, we're interested in the Customers[Discount Amount]
field.
Discount%
.=[UnitPrice]*[Quantity]*(1-[Discount%])
and then press Enter.Discounted Revenue
.The reward for your efforts will be a new column that uses the discount percent from the Customers table to calculate discounted revenue for each transaction. Figure 37.17 illustrates the new calculated column.
In the previous example, you first created a Discount% column using the RELATED
function and then used that column in another calculated column to calculate the discounted revenue.
It's important to note that you don't necessarily have to create multiple calculated columns to accomplish a task like this. You could, instead, nest the RELATED
function into the discounted revenue calculation. This would be the syntax for the nested calculation:
=[UnitPrice]*[Quantity]*(1-RELATED(Customers[Discount Amount]))
As you can see, nesting simply means embedding desired functions within a calculation. In this case, instead of using the RELATED
function in a separate Discount% field, you can embed it directly in your discounted revenue calculation.
Nesting functions can definitely save time and even improve performance in larger data models. On the other hand, complicated nested functions can be harder to read and understand.
You can enhance the functionality of your Power Pivot reports with another kind of calculation called a calculated measure. Calculated measures are used to perform more complex calculations that work on an aggregation of data. These calculations are not applied to the Power Pivot window like calculated columns. Instead, they are applied directly to your PivotTable, creating a sort of virtual column that can't be seen in the Power Pivot window. You use calculated measures when you need to calculate based on an aggregated grouping of rows.
Imagine you wanted to show the difference in unit costs between the years 2007 and 2006 for each of your customers. Think about what technically has to be done to achieve this calculation. You'd have to figure out the sum of unit costs for 2007, then you'd have to get the sum of unit costs for 2006, and finally you'd have to subtract the sum of 2006 from the sum of 2007. This is a calculation that simply can't be done using calculated columns. Using calculated measures is the only way to get the cost variance between 2007 and 2006.
Follow these steps to create a calculated measure:
In this example, we are using the following DAX formula:
=CALCULATE(
SUM(InvoiceDetails[UnitCost]),
YEAR(InvoiceHeader[InvoiceDate])=2007
)
This formula uses the CALCULATE
function to sum the UnitCost column from the InvoiceDetails table, where the Year column in the InvoiceHeader is equal to 2007.
=CALCULATE(
SUM(InvoiceDetails[UnitCost]),
YEAR(InvoiceHeader[InvoiceDate])=2006
)
We will also need a measure to calculate the variance:
=[2007 Revenue]-[2006 Revenue]
Figure 37.19 illustrates the newly created calculated measures. The calculated measures are applied to each customer, displaying the variance between their 2007 and 2006 costs. As you can see, each calculated measure is available for selection in the PivotTable Fields list.
You may find that you need either to edit or to delete a calculated measure. You can do so by following these steps:
Clicking the Edit button will activate the Measure dialog box where you can make changes to the calculation setting.
Clicking the Delete button will activate a message box asking you to confirm that you want to remove the measure. After confirming with Yes, the calculated measure will be removed.
Cube functions are Excel functions that can be used to access the data in a Power Pivot data model outside the constraints of a PivotTable. Although Cube functions are technically not used to create calculations themselves, they can be used to free Power Pivot data so that it can be used with formulas that you may have in other parts of your Excel spreadsheet.
One of the easiest ways to start exploring Cube functions is to allow Excel to convert your PivotTable into Cube functions. The idea is to tell Excel to replace all cells in the PivotTable with a formula that connects back to the Power Pivot data model.
Follow these steps to create your first set of Cube functions:
If your PivotTable contains a report filter field, the dialog box shown in Figure 37.23 activates. This dialog box gives you the option of converting your filter drop-down selectors to Cube formulas. If you select this option, the drop-down selectors are removed, leaving a static formula.
If you need to have your filter drop-down selectors intact so that you can continue to change the selections in the filter field interactively, be sure to leave the Convert Report Filters option unchecked when clicking the Convert button.
Now that the values you see are no longer part of a PivotTable object, you can insert rows and columns, you can add your own calculations, or you can combine the data with other formulas in your spreadsheet. Cube functions give you the flexibility to free your Power Pivot data from the confines of a PivotTable and use it in all sorts of ways by simply moving formulas around.