Chapter 4
Creating Calculations to Enhance Data

Each new paradigm gives us the opportunity to “see” phenomena that were before as invisible to us as the colors of the sunset to the fog.

—Benjamin and Rosamund Zander1

Tableau provides different methods for making calculations to enhance your data through the creation of new fields that don’t exist in your data source. You can also turn single-purpose dashboards and views into multi-purpose analysis environments though the use of parameter controls. Parameters are formula variables that can be used to provide filter-like controls that allow users to change the measures and dimensions used in a dashboard or worksheet. Starting with Tableau V9.0, an entirely new class of functions was added that eases control over the granularity of the data expressed in your views. Level of detail (LOD) expressions lower the technical bar for achieving effects that formerly required advanced SQL-scripting skills.

In this chapter, you will learn how to use Calculated Fields and Table Calculations to derive facts and dimensions that don’t exist in your source data. Tableau’s formula editing window will be explained as well as the Quick Table Calculation menu and how to modify the calculation defaults to address your specific needs.

You will learn about parameter controls—basic and advanced—so that you can make views that address different needs using the same basic visual design. Level of detail (LOD) expressions are covered at the end of the chapter.

Tableau makes formula creation as easy as it can possibly be, but it helps to understand the concept of aggregation and the functions and operators that are available to use before you start making formulas. For those of you who want to dive deeply into Tableau’s functions, Appendix E provides in-depth coverage of every Tableau function and level of detail expressions, with basic, intermediate, and advanced examples.

What Is Aggregation?

Aggregation defines how values are expressed. Most Tableau functions are calculated at the database server with only the results being sent to Tableau. If you are familiar with SQL, you will find many of the functions in Tableau are an extension of SQL. Tableau uses the Sum aggregation by default. If the default aggregation isn’t what you want, you can change it per view or change the default aggregation for the field. To change it in a view, point at the pill of the measure you’ve placed into the view—right-click, and select a more appropriate aggregation. To change the default settings, right-click the field in the Measures pane and select Defaults Properties ⇒ Aggregation ⇒ {select a new default format}. Supported aggregation types include:

  • Sum
  • Average
  • Median
  • Count
  • Count Distinct
  • Minimum
  • Maximum
  • Percentiles
  • Standard Deviation
  • Standard Deviation of a Population
  • Variance
  • Variance of a Population

These are defined in Tableau’s online manual. Search the help menu to read more about each of them. Note that in older versions of Tableau, Desktop Count Distinct and Median aggregation types were not supported when using a direct connection to Excel, Access, or text files. The data had to be extracted into Tableau’s data engine. Beginning in Tableau V9.0, extracts are no longer required.

Figure 4-1 shows a text tables in the Sample – Superstore Sales dataset, displaying all of the different aggregations available for the sales field in the dataset.

c04f001.tif

Figure 4-1: Different aggregation of sales

Notice that the bottom four rows are expressing Count Distinct values for different dimensions. By dragging each of those dimension fields into the text table using the right mouse button (Option+drag a Mac), the Count Distinct aggregation can be expressed for each dimension. As you can see, the dataset includes 6,455 different orders, 1,424 cities, 49 states, and 4 regions. This view was created using a direct connection to an Excel spreadsheet.

Dimension versus Attribute

Aggregation behavior can be changed by altering the default method by which Tableau expresses dimensions. Figure 4-2 shows a text table containing sales by product category and subcategory. A table calculation is being used to display the percent of total sales that each row represents within each product category pane.

By default, Tableau partitions the result by the category dimension. Subtotals have been added by using the main menu option Analysis ⇒ Totals and then showing subtotal and column totals. The amount of sales and percent of sales are totaled within each category pane. But, if the category dimension is changed to an attribute, the category dimension will become a label only and no longer cause the data to be partitioned. Figure 4-3 shows the same dataset but with the category field changed to an attribute.

c04f002.tif

Figure 4-2: Product category as a dimension

c04f003.tif

Figure 4-3: Product category as an attribute

The view still shows the light gray boundary lines between each category, but because the category dimension has been changed to an attribute, it no longer partitions the view. The sales total reflects the total for the entire text table and the percent of total sales is now expressing the percentage of total sales, not the sales within each category. This may appear to be trivial, but as your skills advance and you begin to employ more advanced table calculations, you will need to understand how attributes change Tableau’s behavior.

What Are Calculated Fields and Table Calculations?

Calculated Fields and Table Calculations allow you to add new data to your Tableau workbook, but the way you add the data, and where the calculations occur, is different for each method.

Calculated Fields are defined by entering a formula into Tableau’s formula editing dialog box. For example, if you have gross margin dollars and sales dollars in your source data, you may want to add a new field called Gross Margin Percent by creating a calculated value. The formula to create the gross margin percent is sum([gross margin dollars])/sum([sales dollars]).

The Sum aggregation function in front of each field name tells the source database what to return to Tableau. Table calculations are created in a different way—using your data visualization as the source for the formula.

Predefined Quick Table Calculations remove the need for you to create the formula manually, but these are always processed locally because they rely on the data presented in your view to derive the result.

Calculated Fields can also include Table Calculation functions. These are functions you use in Calculated Fields that are processed locally just like Quick Table calculations.

How Do Calculated Fields Work?

Calculated Fields are normally (but not always) executed at the database level, where the heavy-lifting happens is dependent on the type of functions utilized in the formula. Calculated Fields can be used to generate numbers, dates, date-times, strings, or Boolean (true/false) conditions. Formulas need these elements:

  • Functions: Including aggregate, number, string, date, type conversion, logical, user, and table calculation types
  • Fields: Selected from the data source
  • Operators: For math and comparison of values, dates, and text
  • Optional elements: Can be added within the formula dialog box including:
    • Parameters: For creating formula variables that are accessible to information consumers
    • Comments: For documenting formula syntax and notes within the formula dialog box

There are two ways to create a Calculated Field: by initiating a formula dialog or via ad hoc calculations. People experienced at writing SQL script or creating spreadsheet formulas normally have very little difficultly learning how to write formulas in Tableau. Those with very little experience writing formulas may need more help. Tableau provides assistance via a real-time Calculation Editor and a help window in the formula editing window, as well as an online manual that is accessible from the editing window.

Creating Calculated Fields with the Calculation Editor

Start the formula dialog box via the main menu using the Analysis ⇒ Created Calculated Field or by right-clicking on a field. In the formula dialog box, you enter the functions, operators, and parameters to create the logic for your formula. Figure 4-4 shows the required menu options.

c04f004.tif

Figure 4-4: Menu for creating Calculated Fields

Alternatively, right-clicking a field in the Dimensions or Measures shelves opens the formula dialog box you see in Figure 4-5.

This alternative saves a little time because Tableau will automatically insert the field into the window.

c04f005.tif

Figure 4-5: Right-click to create a Calculated Field.

Performing Ad hoc Calculations

Ad hoc calculations are started by double-clicking within a shelf and typing as you see in Figure 4-6.

c04f006.tif

Figure 4-6: Ad hoc calculation

Initiating the process of writing a formula this way is similar to writing a formula in a spreadsheet. This new Tableau V9.0 facility keeps you in the flow of your analysis.

How Do Table Calculations Work?

Table calculations are derived from the structure of the data included in your visualization, so table calculations are dependent on the source worksheet view contained in your workbook. That means these calculations are always derived locally using your personal computer’s processor to return the result.

Understanding exactly how table calculations work takes a little time because table calculations can change as your visualization is altered. As with any new concept, after you create some table calculations, you’ll get comfortable with how they behave in different situations. Tableau’s online manual has a large number of examples that you can view that provide a good basic introduction.

Creating a Table Calculation requires that you have a worksheet with a visualization. A good way to create them is to right-click a measure pill used in the view to expose the Quick Table Calculation menu. Quick Table Calculations are provided for:

  • Running total
  • Difference
  • Percent difference
  • Percent of total
  • Rank
  • Percentile
  • Moving average
  • YTD total
  • Compound growth rate
  • Year over year growth
  • YTD growth

Depending on the view of the data included in your worksheet, some of these may be unavailable because your worksheet view doesn’t support the calculation. Unavailable calculations will be visible in the menu but will appear grayed out.

A Word on Calculations and Cubes

Tableau connects to relational databases, spreadsheets, columnar-analytic databases, data services, and data cubes (multi-dimensional data sources). Data cubes are different from regular database files because they pre-aggregate data and define hierarchies of dimensions in specific ways.

If you need to access pre-aggregated data that is stored in a multi-dimensional data source, you can still perform calculations using Tableau formulas or create formulas using the standard query language of multi-dimensional databases, Multidimensional Expressions (MDX). The syntax is a bit more complex, but MDX also provides the ability to create more complex formulas. If you want to learn more about options for creating calculations when accessing data cubes, refer to Tableau Software’s quick-start guide “Creating Calculated Fields-Cubes.” Tableau’s behavior when you connect it do a data cube is different because the cube controls aggregation. For example, date fields behave differently because the cube controls date aggregation in specific ways.

Using the Calculation Editor to Build Calculated Fields

Calculated Fields require that you enter fields, functions, and operators. Tableau strives to make formula creation fast and easy, so it is possible to write formulas with minimal typing. Once you’ve connected to a data source, you can create a Calculated Field from the main menu by selecting Analysis/Create Calculated Field. This example uses the Superstore for TYD2 spreadsheet. Figure 4-7 shows the Calculation Editor.

c04f007.tif

Figure 4-7: Calculation Editor

The figure shows a calculation for Profit Ratio that uses two fields from the Superstore file to derive the result. The Name field at the top of Figure 4-7 is where you type the name of your Calculated Field as you want it to appear in the date pane of the worksheet. The Formula box is used to write the script for the formula. After creating a Calculated Field, right-click it and set the default properties. In this example, you want the calculated ratio to be treated as a percentage, so select Number Format ⇒ Percentage ⇒ OK.

This Calculation Editor marks a significant departure from earlier versions of Tableau that included separate dialog boxes for fields, parameters, and functions. Tableau’s new autofill feature eliminates the need for the dialog boxes. You simply start to type the name of the desired element and Tableau’s autofill presents the available options. Or, drag fields from the data pane or a shelf into the view Calculation Editor. This method for building Calculated Fields more closely resembles how you build visualizations in Tableau Desktop by keeping you in the flow of your build.

Observe that Tableau color-encodes different elements of formulas so that they are easy to separate visually. Fields are orange, parameters are purple, and functions are blue. Notice the example in Figure 4-7 includes comments at the top, color-encoded in green. Comments are useful for documenting sections of complex formulas or for adding basic descriptive information to other analysts that may use your formula in their work. You can add comments anywhere in the Calculation Editor by typing two forward slashes (//) in front of the text.

If your formula syntax is correct, you will see gray text at the bottom of the window that confirms that the calculation is valid, as you see in Figure 4-8. If you make a mistake, the Calculation Editor will help you find the cause.

c04f008.tif

Figure 4-8: Detecting formula errors

Hover over the squiggly red line in the formula or click the drop-down box at the bottom to see the details. Forgetting to add an operator is a common mistake. Figure 4-8 shows how Tableau helps you identify the missing syntax from your formula.

Ad hoc Calculated Fields

Creating formulas using Tableau’s ad hoc formula editor is similar to building formulas in a spreadsheet. To build a formula this way, double-click the Rows shelf, Columns shelf, or Marks card. Figure 4-9 shows an ad hoc formula for profit ratio.

As you become more familiar with Tableau functions, you will probably gravitate toward ad hoc calculations because it’s the fastest way to create a formula. If you decide that you want to reuse an ad hoc calculation, drag its pill to the data pane and name it, as you see in Figure 4-10.

Dropping the pill on the data pane allows you to save the new field permanently, name the field specifically, and edit the default format or data type for the field. Figure 4-10 shows the Rename Field dialog box. Edit the datatype or default number format as you would any field in the data pane.

c04f009.tif

Figure 4-9: Ad hoc Calculation Editor

c04f010.tif

Figure 4-10: Naming an ad-hoc calculation

As you are adding new Calculated Fields, using the ad hoc calculations, you can see the result of your work before saving it by hovering your cursor over the pill. This will make the small Apply window you see in Figure 4-11 appear.

Selecting the Apply option allows you to preview the result of the Calculated Field in the view as you see in the view area of Figure 4-11. Ad hoc calculations accelerate your ability to create and test new formulas.

c04f011.tif

Figure 4-11: Ad hoc calculation preview

Building Formulas Using Table Calculations

In contrast to Calculated Fields, Table Calculations use the data in your visualization to create a formula. Before you can use Table Calculations, you must first create a view. Using Superstore for TYD2, Figure 4-12 displays a time series of monthly sales on top. The bottom half employs a quick table calculation to derive the running total of sales for each year. Notice the second SUM(Sales) pill contains a small triangle. This icon denotes the pill is a table calculation.

c04f012.tif

Figure 4-12: Time series using a running sum

The following are the steps required to build the charts in Figure 4-12:

  1. Add Order Date (discrete month) to the Columns shelf.
  2. Add sales to the Rows shelf.
  3. Filter the order date for the year(s) 2014 and 2015.
  4. Add the order date to the Color Marks button.
  5. Format the lower axis selecting Dates: Abbreviation.

The data from the time-series chart will serve as the data source for the table calculation that will be used to create the chart in the bottom half of Figure 4-12. That chart displays the running sum of sales for each month within the displayed years. The following are the steps required to add that portion of the view:

  1. Duplicate the sales pill on the Rows shelf to create a duplicate chart (Ctrl+drag and drop).
  2. Right-click (or pick the drop-down arrow on the right side of the second sales pill).
  3. Select Quick Table Calculation—Running Total.
  4. On the Marks card, click the bottom Sum(Sales) to modify that specific chart.
  5. Turn on field labels for the Line Ends and un-check label start of line.

Figure 4-13 shows how right-clicking the duplicate sales pill exposes the Quick Table Calculation menu.

c04f013.tif

Figure 4-13: Creating the Quick Table Calculation

Selecting Running Total generates the table calculation that is plotted in the lower half of the view in Figure 4-13. The number format for the mark label was also changed to display the results in thousands in the top chart and millions in the lower chart with the decimal places set to zero for both. To format the numbers, point at the number, right-click, and select the desired number format from the formatting menu that appears. Only 30 seconds were required to build this chart.

Editing Table Calculations to Suit Your Purpose

You can also see in Figure 4-13 that there are many other Quick Table Calculation options available. There is also a menu option called Edit Table Calculation. You can also customize table calculations by selecting the Edit Table Calculation menu as well.

Understanding how Table Calculations work takes a little time—playing with the options and looking at the results. Take a close look at the Table Calculation dialog box displayed in Figure 4-14.

c04f014.tif

Figure 4-14: The Table Calculation dialog

Table Calculations require selections of following options:

  • Calculation Type: As shown in Figure 4-13
  • Summarize values using: Sum, average, median (these will change depending on the content of your source)
  • Running along: Defines the direction and the scope of the table calculation (Table Across, Table Down, and so on)

Modifying the Month (Order Date) field to show time as discrete quarters and months creates quarterly partitions, as shown in Figure 4-15.

c04f015.tif

Figure 4-15: Using discrete quarter and month

The bottom time series showing the Running Sum of Sales still uses Table Across to calculate the total. Notice the labels at the end of each quarter reflect the running total sales for the entire table. Right-clicking on the table calculation pill (identified by a small triangle on the right side of the pill) and selecting the Edit Table Calculation menu exposes the Table Calculation control. Figure 4-16 shows the Table Calculation editing menu for Compute Using Pane (Across) and includes more options. Adding the partition for quarter creates quarterly panes that revise the scope of the table calculation used in the bottom half of the view.

Changing the scope of the calculation to Pane Across causes the running sum calculation to reset every quarter (pane). The bottom half of Figure 4-16 reflects the revised scope. As you see, the running totals restart at the beginning of each quarter.

c04f016.tif

Figure 4-16: Changing table calculation scope

Understanding Table Calculation Behavior

Learning exactly how Table Calculations behave in different visualizations takes a little time. The best way to learn is to build a text table report and then start playing with different tableau calculation options to see the results. Tableau’s online manual provides many different examples. Figure 4-17 shows the percent of total table calculations using all of the different direction and scope options.

Notice that the example for the Table scope returns the same result as the Table Down Then Across scope. Also, the Cell scope is calculating the mark value of itself, resulting in 100 percent in every cell. Depending on the structure of your view, it is not uncommon for different scope options to return the same values. In general, adding more dimensions to your view will increase the number of available options provided by table calculations. Experiment with different visualizations and test your results. With practice, you’ll be able to anticipate how they behave.

Reusing and Customizing Table Calculations

Quick table calculations don’t automatically result in any new fields appearing in the data pane. What if you want to use the result returned by a Quick Table Calculation on another worksheet? Is this possible? It is. The method for doing this has changed. Prior to V9.0, the table calculation editing window contained a Customize button. Clicking that created a new field and generated script containing table calculation functions.

Starting in V9.0, the Calculation Editor no longer includes this button because the new ad hoc calculation facility allows you to drag and drop table calculations from your view to the data pane. When you do that, Tableau will create a new Calculated Field that you can rename.

c04f017.tif

Figure 4-17: Comparison of different scope options

Refer to Figure 4-12. In that view a quick table calculation was used to make a Running Sum of Sales chart at the bottom half of the view. Dragging that Table Calculation from the row shelf to the data pane, as you see in Figure 4-18, allows you to add the field to the Measures pane. Notice the field is now named Running Sum Table Calc.

Once you’ve created the new field (Running Sum of Sales), you can use it in another worksheet. Figure 4-19 shows one way that it could be deployed. This new view contains year and month date granularity. Note that the table calculation in the view resets at the beginning of each year. To make this view requires that the Calculated Field—Running Sum of Sales—be revised to compute the value using Pane Across versus Table Across. Figure 4-19 shows you how.

c04f018.tif

Figure 4-18: Creating a Calculated Field from a table calculation

c04f019.tif

Figure 4-19: Using the custom table calculation in another worksheet

To build this view, place the order date on the Columns shelf, and then click the plus sign in the Year (order date) pill. After that, modify the Quarter (order date) pill to display the month by selecting the drop-down and picking the Discrete month option. Next, place Sum (Sales) and the running sum Calculated Field on the Rows shelf. Customize the running sum to compute using the Pane Across option by selecting the drop-down arrow in the pill (or right-clicking) and picking the desired option indicated in Figure 4-19. Format the bottom axis by right-clicking it and selecting Format ⇒ Header ⇒ Dates: ⇒ First letter.

Secondary Table Calculations

Secondary table calculations allow you to pass the result of an initial table calculation to a second table calculation to derive a result. In the next example, you use the World Indicators dataset that ships with Tableau desktop as your data source. That file includes country Gross Domestic Product (GDP) data. You’ll see how a secondary table calculation can be used to enhance analysis of GDP by country.

An initial view of the data shows the GDP information in a bar chart, sorted by descending GDP. You can see the result in Figure 4-20.

c04f020.tif

Figure 4-20: GDP by country

Looking at Figure 4-20, you get a quick sense of the top countries in terms of GDP. Notice that the view is filtered for the year 2012 and that the dataset includes 185 records. The total reported GDP for all reporting countries is $72,188.08B ($72 trillion). How could this be enhanced using table calculations?

  • Create a table calculation that computes the running total GDP by country and another that computes the running cumulative percent (%) of global GDP.
  • Add an INDEX table calculation function to show a ranking of the row.
  • Make a quick filter that utilizes the INDEX function to filter the view.
  • Apply these three items provided to the details for the view you see in Figure 4-21.
c04f021.tif

Figure 4-21: GDP by Country enhanced

Note the cumulative percentage values on the right side of Figure 4-21. That bar chart was created using as secondary table calculation. The ranking number on the left side of the few was provided by an Index Table Calculation function, and the quick filter also utilizes the index function.

To create the Cumulative Percentage bar chart, start by dragging GDP from the Data Shelf Measures area to the Columns shelf, and then edit the aggregation to be SUM by selecting the drop-down arrow in the pill. Then edit the pill to be a Quick Table Calculation to show the running total GDP. Next, edit the Table Calculation by clicking the drop-down again and selecting the Edit Table Calculation menu option you see in Figure 4-22.

c04f022.tif

Figure 4-22: Creating the secondary table calculation

Clicking the check box to perform a secondary calculation exposes the lower section of the table calculation window. Define the secondary table calculation for the percent of table there. Notice both table calculations define a table (down) summary. Try other summarize options to see what is available. Computer Using ⇒ Country would also work in this example.

If you create more advanced views that involve multiple dimensions, you can use the advanced options to define not only the type of calculation but how it should restart. See Chapter 7 for an example of this being used in a Pareto chart.

After defining the secondary table calculation, your chart should look like Figure 4-21 but will not include the ranking numbers on the left. A table calculation function can be used to add that data.

Using Table Calculation Functions

The index function used in Figure 4-21 is a table calculation function that counts the position of a row or column in a view. To add this measure to the view, you can type the formula on the Rows shelf or you can go to the Analysis menu to build the formula. If you are familiar with Tableau formula-writing, the ad hoc method is faster. The menu option provides more help. Initiate the calculation by going to the Analysis ⇒ Create Calculated Field menu to expose the calculation window you see in Figure 4-23.

c04f023.tif

Figure 4-23: Using the index table calculation function

Name the field Index of Row. You can see that the formula syntax is very simple. Using the Enter Text to Search dialog box at the right of Figure 4-23, you can search for a specific function or filter the list for a specific class of functions. The list has been filtered for table calculation functions. Selecting the index function displays a brief explanation of the function. If you want to get more detailed explanations, refer to the Tableau online manual. In addition, Appendix E provides details on the function syntax along with code samples. Double-clicking on the function will insert it into the formula dialog box. The index function requires no additional arguments. Clicking OK creates a new measure to appear in the Data pane Measures shelf.

Add the measure to the shelf. When you do that, your view will change. Don’t panic. Point at the Index of Row pill, select the drop-down, and change the measure type from Continuous to Discrete, as you see in Figure 4-24. Reposition the pill to the left of Country on the Rows shelf.

Now your view should display the country ranking. Figure 4-21 also includes a quick filter that uses the same calculated value. To add that quick filter to the view, drag the Index of Row pill while holding the Control button (Command+Index of pill on a Mac). This will copy the field into the Filters shelf. Once the pill is in the Filters shelf, change it back to a continuous measure. The pill should be green, as you see in Figure 4-25.

c04f024.tif

Figure 4-24: Convert the Index of Row to Discrete.

c04f025.tif

Figure 4-25: Copy the Index of Row to the Filters shelf

You can see that the Index of Row calculated value has been converted to a continuous measure in Figure 4-25. To add it to the view as a quick filter, click the pill’s drop-down menu and select the Show Quick Filter option. Play with the filter to control the number of countries being displayed in the view. In Figure 4-21, the top 35 countries for 2012 are being displayed.

Adding Flexibility to Calculations with Parameters

Parameters empower information consumers to change the content that appears in worksheets and dashboards. Designers have two different ways to enable parameter controls. Basic parameters are helpful when creating flexible top-down filters or adding dynamic reference lines, or in histograms to facilitate variable bin sizes. Advanced parameters provide the ability to address unique use cases and offer more flexibility. But, they do require more planning to create.

What Are Basic Parameters?

Basic parameters are variables that are provided in specific situations that reduce the number of steps required to create a parameter control. Basic parameters are available in histograms to specify the size of each bin. They are also available for creating flexible top or bottom filters, and they can be a way to make reference lines user-selectable. Figure 4-26 shows examples for these three use cases.

c04f026.tif

Figure 4-26: Basic parameter controls

The histogram on the top of Figure 4-26 displays order counts by the size of orders. The Sales Bin Parameter allows end users to change the size of each bin. The Parameter Size Range is from $500 to $10,000. The parameter is currently displaying bin sizes of 1,500. The bullet graph in the lower left of Figure 4-26 compares sales (bars) to prior year sales (black reference lines) for every product name. The dataset includes over 1,000 product names. The Parameter–Top N Filter enables the user to change the number of products displayed in the bar chart. The top 15 products are being displayed in Figure 4-26. The scatter plot in the lower right includes a reference line called Profit Threshold that allows users to change the threshold value affecting the position of the reference line and the corresponding color of the shading below the reference line. It also displays the profit threshold value currently being used to draw the reference line –5,300.

All of these are basic parameters that are selectable options from menus associated with dimension fields for the histogram and the bullet graph (sales bin and product name). The reference line parameter used in the scatter plot was defined when the reference line was added to the view.

To edit the existing histogram’s bin size right-click the bin field name that appears in the Dimensions shelf or select the drop-down arrow in the Sales Bin Parameter and select Edit Parameter. To set up the flexible filter in the bullet graph, place Product Name on the Filters shelf, select the top tab, check the By field, expand the value drop-down, and select Create a New Parameter in the filter dialog box.

The Reference Line parameter is accessed when adding the reference line by clicking the Value drop-down selector and picking the Create a parameter option. Figure 4-27 shows the Edit menus for all three parameter controls used to create the controls displayed in Figure 4-26.

c04f027.eps

Figure 4-27: Parameter definition windows

Basic parameters are easy to create but they are also limited to the specific use cases you see in Figure 4-26: bin sizing, top or bottom filters, or flexible reference lines. If you want to create more advanced parameters, these require a little more effort.

What Are Advanced Parameters?

Advanced parameter controls are limited only by your imagination. You can create multiple parameter controls. Parameter controls can be chained together to create linked parameters. An entire book could be written on parameter controls because they provide programming-like functionality to visualizations. Creating advanced parameter controls requires three or four steps:

  1. Create the parameter control.
  2. Expose the parameter control on the desktop.
  3. Use the parameter in a Calculated Field (optional).
  4. Use the Calculated Field in the view.

If the parameter is being directly placed in the visualization, it may be unnecessary to create a Calculated Field. The key point is that whatever the parameter is being used to change (typically a formula variable), that item must be used somehow in the visualization for the parameter control to work.

One of the most popular use cases for advanced parameter controls is to enable users to change measures or dimensions being displayed in a single view. The technique in either case is the same. Figure 4-28 shows time series charts in which a parameter is being used to change the measure being plotted.

The parameter control appears in the lower right of the Time Series charts shown in Figure 4-28. Notice that the title of the worksheet includes the parameter, and the left axis label also changes depending on the measure in view.

c04f028.tif

Figure 4-28: Parameterizing the measure displayed in a view

You add a Parameter Description to the title bar by double-clicking the title bar and selecting the parameter used in the view. To add the parameter name to an axis, drag the parameter from the parameters shelf to the axis. Then edit the axis and erase the static title. To erase the axis label, point at the white space in the left axis, right-click, and then erase the contents of the title editing area. Note that this technique works only if the parameter names are text.

This example also rotated the parameter label so that it appears vertically oriented. When a new selection is made from the parameter control, the time series chart will change along with the headings and reference line to reflect the selected parameter value.

Creating the Parameter Control

This is done directly in the formula editing window or by right-clicking blank space in the dimension, measures, or parameter panes and selecting Create ⇒ Parameter. Doing that exposes the dialog box that is used to define the parameter as you see in Figure 4-29.

c04f029.tif

Figure 4-29: Defining a parameter control

Enter the name of the parameter as you want it to appear in the control that is placed on the desktop, and then define the data type. Parameters can be numbers (floating decimal point or integers), strings, Boolean (true/false), and date or date and time values.

In the List of Values section, you define the variables that will be contained in the parameter control. Figure 4-29 shows a list of measure names defined. While it isn’t always desirable, I suggest that for this type of parameter you copy the field names of the measures exactly. This will make formula creation easier in the next step. However, if you find that the performance of your parameter is not good, use numbers in a series (1, 2, 3 . . .) as your value names in the parameter definition. It makes creating the formula in the next step a little more difficult; using numbers in the parameter definition will generally result in a more responsible parameter control. This is especially noticeable with larger datasets.

Notice that there is a Display As option. This is used to create a name alias that will appear instead of the actual field name. The options to the right of the List of Values section are not applicable to this example but are useful for cases where you might be using values from another parameter control or adding members of a particularly large set. To complete the formula definition, click OK, and the parameter will appear in the parameter pane.

Expose the Parameter in the Workspace

In order for users to access the Parameter Control, it needs to be placed on the desktop. To do this, right-click the parameter name appearing in the Parameters shelf and select the Show Parameter Control menu option.

If you access the parameter now, nothing will happen because you haven’t used the control in a formula or in any other way in the visualization. The next step is to use this parameter as a variable in a formula.

Create a Formula That Uses the Parameter Control

In Figure 4-28, the Parameter Control is used to change the measure being plotted in the time series chart. This requires a formula that will link the string values defined in the parameter to measure field names in the data source. You can see the formula definition in Figure 4-30.

The formula logic associates the selected parameter string with the related field name. This is why it is a good idea to define the parameter string names to exactly match the field names you want to associate. It makes writing the formula easier if you are in learning mode. Keep in mind that if performance degrades, using sequentially ordered numeric values in the parameter definition will result in the best performance. Note that parameters are denoted by the purple color in the Calculation Editor. Clicking OK adds the Calculated Field to the Measures shelf.

c04f030.tif

Figure 4-30: Using a parameter in a formula

Giving your parameter the same name as the related calculation makes it easier to retrace your work at a later date if you need to modify the Parameter Control to add or delete items.

Use the Calculated Field in the View

Dragging the Select Measures calculated value to the Rows shelf will activate the Parameter Control. Each selection made in the parameter will trigger changes in the select measure formula and will change the measure being displayed in the time series. Refer to Figure 4-28 to see three instances of the view, each displaying a different measure.

There are many different ways that you can use advanced parameters. The limit is your imagination. For more examples, go to Tableau Software’s website and search for parameters. You should find many different forum posts, training videos, and sample workbooks related to parameters.

Why You Should Learn Level of Detail Expressions

Normally the granularity of the aggregation of measures expressed by Tableau is controlled by the dimensions used to create your visualization. Dropping dimensions on the Rows shelf, Columns shelf, or the Marks card exposes greater detail. Table calculations and reference lines have always provided a way for summarizing views. But they require that the view be structured to support the desired calculation.

Dashboards provide another way to expose different levels of detail. But what if you have limited space or very particular and complicated needs? Sometimes it is useful to be able to express one level of detail in a view but use a different level of detail for calculations to add more information. Introduced in Desktop V9.0, Tableau is providing a new class of functions for controlling the granularity of calculations. Level of detail expressions require a little practice to learn. But the syntax is much easier than the SQL scripting you would have to use if they didn’t exist.

Level of Detail Expression Syntax Explained

Level of Detail expressions are not difficult to write, but they do have specific syntax requirements. Understanding exactly what Tableau does with the LOD expression also requires some understanding of your dataset. Figure 4-31 shows a Level of Detail Expression.

c04f031.eps

Figure 4-31: INCLUDE LOD expression

The INCLUDE expression at the start of the formula in Figure 4-31 sums the sales in the view for product name or higher-level aggregates included in the view. Note the curly brackets enclosing the expression. LOD expressions must be wrapped in curly brackets.

You can add more formula logic outside of the curly brackets to specify additional calculations. Placing additional dimensions in the expression is allowed, but you must separate the field names with commas. The best way to get a handle on these expressions is to look at examples using each type of LOD operator—FIXED, EXCLUDE, and INCLUDE.

The FIXED Expression

Applying a FIXED level of detail expression allows you to define the level of detail of the formula contained within the LOD expression. It will not vary regardless of the dimensions placed in your view. Figure 4-32 shows an example.

c04f032.tif

Figure 4-32: Fixed LOD expressions

The visualization in Figure 4-32 shows four different values. The left bar is plotting total sales for the State or Province and Product Category. (Notice these fields on the Rows shelf.) Then the Number of Records for the combination of State and Category are displayed. The third column shows the result of the FIXED LOD expression (8,951,931) showing the sales value for the entire data set. Even though the view includes the state and product category dimensions, the FIXED expression always returns the total sales for every record in the dataset. The result would be altered only if you were to apply a data extract that filtered out data, applied a file to the data source connection, or you added a context filter in the view.

The chart on the far right of Figure 4-32 shows the FIXED LOD expression being used in another calculated value to derive the percent of sales of total sales that the row represents. The California/Furniture sales of $517,378 represent 5.78 percent of all Superstore for TYD2 sales. Notice that the view has been filtered to include four states (Alabama, Arizona, California, and Georgia), but the FIXED LOD expression still includes all sales from everywhere. This is because dimension filters, measure filters, and table calculation filters are applied after the FIXED level of detail expression is calculated and returned.

FIXED LOD expressions are useful for computing the percent of total details. While you can create visualizations and dashboards to provide these details, the LOD expression makes this very easy to do in a single chart. Rather than showing the calculation in the view, you could include it in the tooltip as text by moving the FIXED Expression and the % Line to Fixed Expression calculations to the tooltip button on the Marks card.

The EXCLUDE Expression

Using the EXCLUDE level of detail expression allows you to define what to omit from the calculation. This expression can be used in low-level calculations using fine-grain details where there is nothing to omit. It provides a way to express the view level or any level of detail above the view level. Changing what appears in your view can change the values being calculated. Figure 4-33 provides an example.

c04f033.tif

Figure 4-33: EXCLUDE LOD expression

The visualization is similar to the view displayed in Figure 4-32, but because the EXCLUDE LOD expression is being used, the result is different. The sum of sales displayed for the combination of state and product category chart on the left side of the view is unchanged, with the exception of the sort order of the product categories within each state. The third column bar chart showing the EXCLUDE expression now displays total sales for each product category in the four states filtered in the view (Alabama, Arizona, California, and Georgia). The % Line to EXCLUDE Sales at the far right is now calculating the percentage of total product category sales that the specific state represents within the group of states filtered for in the view. Figure 4-34 provides summary data for comparison.

c04f034.tif

Figure 4-34: Total Category Sales and Excluded Category Sales

The left text table in Figure 4-34 summarizes all sales records in the Superstore for TYD2 dataset by product category. The right text table shows the total sales by product category for only those states included in the view in Figure 4-33 (Alabama, Arizona, California, and Georgia). Compare these figures to the amounts returned in Figure 4-34.

The first column in Figure 4-34 displays California furniture sales only. Similarly, the 223 units displayed in the Number of Records column represents the row count of the California-only furniture sales. The three columns containing the EXCLUDE Expression $653,343 is the total furniture sales for the states displayed in the view (Alabama, Arizona, California, and Georgia). The far-right column, which shows the % Line to EXCLUDE Sales, calculates the percentage of sales that California represents of the same four states in the view ($517,378/$653,343 = 79.2%).

Changing the filter to include more states will change the amounts returned in the third and fourth columns that are fed by the LOD expressions, unlike the FIXED expression used to create Figure 4-32. Try adding the FIXED expression to the view in Figure 4-34 to compare the results returned by the FIXED and EXCLUDE LOD expressions.

The INCLUDE Expression

The INCLUDE LOD expression allows you to create formulas that consider more granular dimensions than the visualization displays. Similar to the EXCLUDE expression, amounts calculated using INCLUDE can change depending on what dimensions are exposed in the view. Figure 4-35 includes a bar chart of sales by state and the postal code with the maximum sales for that state. To the right of the bar chart uses the INCLUDE expression for the bar chart displaying sales by postal code for the state of Georgia. Both visualizations are filtered for the year 2015.

c04f035.tif

Figure 4-35: INCLUDE Max Sales Postal Code

The goal for the bar chart is to display the total sales for the state and the sales value for the postal code with the highest sales within that state. Even though the level of detail displayed in the bar charts is Region (west/south), the INCLUDE LOD expression provides a way to display the postal code that has the most sales for each state displayed. Georgia is highlighted. The total sales in 2015 for Georgia are $74,019. The postal code maximum sales in 2015 for Georgia are $17,043. This is confirmed by the text table display at the right with postal code 30318 returning the top sales value for the year 2015.

It will take you a while to get a feel for the behavior of LOD expressions. They provide precise control over the result you require without requiring that you display the level of detail in the view that you want to return. In the next section, you’ll learn how different filters can affect LOD expressions.

How Filters Affect Level of Detail Expressions

Filtering views can affect the results calculated by LOD expressions. The precise outcome is the result of the type of LOD expression and the type of filter being used. Tableau executes filters at the data source, through temporary tables, and locally based on the contents of the visualization. Table 4-1 shows the order of precedence.

Table 4-1 Order of Precedence for Filters and LOD Expressions

Filter TypeFilter Applied AtLOD Expression
Extract FiltersData sourceFilter presides
Data Source FiltersData sourceFilter presides
Context FiltersLocal temp tableFilter presides
Dimension FiltersLocallyFIXED dominant
Dimension FiltersLocallyINCLUDE/EXCLUDE subordinate
Measure FiltersLocallyAll LOD expressions dominant
Table Calc FiltersLocallyAll LOC expressions dominant

Limitations of LOD Expressions

Level of detail expressions are not supported by every data source that Tableau can connect to. As of May 2015, the following data sources do not support LOD expressions:

  • Cubes
  • Google Big Query
  • DataStax
  • Informatica Data Services
  • Microsoft Jet
  • Splunk
  • Actian Vectorwise

Other data sources are supported only through more recent releases. Search for “Data Source Constraints for Level of Detail Expressions” in Tableau Software’s online manual for the latest information.

Additional Resources on Level of Detail Expressions

For additional details and examples of Level of Detail (LOD) expressions, see the notes at the end of this chapter.

Using the Function Reference Appendix

Tableau provides online documentation of functions. The user forum on Tableau’s website is also quite good. However, many novice users have asked for a more detailed reference for Tableau functions that provides examples and explains the formula syntax in more detail. That is what you will find in Appendix E.

Functions are listed alphabetically by function type. Each function reference entry provides a short description of the function, typical use cases, and basic, intermediate, and advanced examples. I hope you’ll find the function reference a useful addition to your toolset.

In the next chapter, you learn how Tableau creates geospatial data for mapping. If your data includes country, state, or other standard geographic dimensions, you can easily plot your data in maps.

Notes

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset