The aggregate functions

Aggregation functions in Tableau Public are performed typically on numeric fields. In this section, we will show you how to use the default aggregations on visualizations as well as how to use them in calculated fields.

The following are the aggregate functions that are available for you to apply on a field are also available in the calculated field dialog box. Tableau Public has arranged them in order from the greatest to the least commonly used functions for the visualization as follows:

  • SUM: This adds up the values within a partition
  • Average: This sums the measure and divides it by the number of dimension members in the partition
  • Median: This provides the measure value that's halfway between the greatest and least values
  • COUNT and COUNTD: These count the number of dimension members and the number of distinct dimension members respectively; they are typically performed on dimensions rather than measures
  • Minimum and Maximum: These take the least and greatest values in the partition respectively
  • Percentile: This provides the numeric value at which a percentage of the partition that you specify falls beneath (we'll use this for several examples)
  • Standard Deviation: This is the square root of the variance for a data set, and it is the unit of measurement of distance from the mean within a data set; it is commonly represented by the Greek letter, sigma
  • Variance: This is the average distance from the mean value, and it's used to measure the distribution of a data set

    Note

    The variance and standard deviation are two of the basic measurements in statistics. Advanced statistics in Tableau Public are beyond the scope of this book, but if you understand the concepts, you will find that there are several capabilities within Tableau Public that provide rich statistical functionality.

The most basic aggregation is a sum, and it's the default for numeric fields in Tableau. The second and third most basic aggregations are counts and averages. When you add a measure to a visualization, Tableau Public automatically sums it. In the following example, which uses the data on global floods that we referenced earlier, we do the following three things to create a map:

  1. Double-click on Country in the Dimensions pane to create a symbol map.
  2. Double-click on SUM(Affected sq km) in the Measures pane to change the map into a filled map.
  3. Right-click on the Context menu for the color legend. Change it to a reversed orange-blue diverging color spectrum so that the higher numbers represent greater flood damage, as shown in the following screenshot:
    The aggregate functions

Note that on the Marks card, Affected sq km is summed. Therefore, you see the total area affected for each country in the whole data set. There is a major flaw with this method of aggregation, it is not normalized. It would be more useful to have a rate, such as the average area affected per flood, or to add context by creating an aggregation that tells you the percentage of the total area that was affected. For instance, we know that in Pakistan, there were catastrophic floods in 2010, and China traditionally also has had severe floods. However, these countries are in blue, which seemingly indicates that their floods were less severe.

Let's start changing aggregations, and thus add context and reduce the likelihood that someone will misinterpret our visualization, by changing the Measure function to an Average function. The easiest way to do this is by performing the following steps:

  • Click on your aggregation context menu.
  • Click on Measure(SUM).
  • Click on Average.

Changing this measure to an Average will average the square kilometers of all of the rows related to each country. This means that if there were three floods in the US and they affected 1 million, 2 million, and 3 million square kilometers respectively, then the average would be 2 million.

Tip

If you would like to set the default aggregation for a field, you can right-click on it in the Measures pane, click on Default Properties, select Aggregation, and select the appropriate aggregation for your needs.

The result is a visualization that shows much more appropriate context. From this, we can gauge the average severity in terms of the total area affected by each flood in every country. In the next chapter, we will rank the countries, which adds even more context and relates them to each other for the user.

We added the number of records as well as the sum of the affected square miles to the tooltip so that when you roll over a country, you see the total number of floods, the total area affected, and the average area affected by each flood, as shown in the following screenshot. Using a tooltip is a great way of adding context, which is the art of relating two data points to each other and to the consumer:

The aggregate functions
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset