Clustering in Tableau

Tableau's strength has always been its user-focused flexibility: it works with the user to reach insights at the speed of thought. Tableau's clustering functionality continues that tradition of putting the user at the center of the analytics process. For example, Tableau lets us quickly define custom groupings of geographical areas, which in turn can reveal new insights and patterns within those groups.

Tableau 10.0 comes with k-means clustering built in. K-means is a popular clustering algorithm that is easy to implement and can be faster than some other clustering methods, particularly on large datasets.

With k-means, we can see the data being grouped, or clustered, around centers. The algorithm starts by choosing the cluster centers randomly. It then finds the nearest cluster center for each data point and assigns the point to that cluster.

K-means then computes the actual center of each cluster and reassigns the data points to the nearest new center. These steps are repeated until the assignments no longer change. The filled shapes represent the center of each cluster.
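
The loop described above can be sketched in a few lines of plain Python. This is a minimal illustration of the textbook k-means procedure, not Tableau's internal implementation; the function names (`kmeans`, `squared_distance`, `mean_point`), the `seed` and `max_iter` parameters, and the sample points are all assumptions made for the example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise average of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, max_iter=100, seed=0):
    """Return (centers, assignment) for a list of points and k clusters."""
    rng = random.Random(seed)
    # Step 1: choose k of the data points at random as starting centers.
    centers = rng.sample(points, k)
    assignment = None
    for _ in range(max_iter):
        # Step 2: assign every point to its nearest center.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # Stop once the assignments no longer change.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 3: move each center to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = mean_point(members)
    return centers, assignment


# Hypothetical sample data: two well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, assignment = kmeans(points, k=2)
print(centers, assignment)
```

Because the starting centers are random, the cluster labels (0 or 1) can differ between runs with different seeds, even when the grouping itself is the same.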

How does k-means work?

The k-means procedure splits the data into K segments. Each segment has a centroid that corresponds to the average value of the marks in that segment. The objective of the k-means procedure is to place the centroids so that the total distance between the centroids and the marks in their respective segments is as small as possible.
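
Written as a formula, the objective in its standard textbook form looks like the following. The symbols are assumptions introduced here for illustration: $S_1, \dots, S_K$ are the segments, $x$ is a mark, and $\mu_j$ is the centroid of segment $S_j$. The standard formulation measures distance as squared Euclidean distance.

```latex
\min_{S_1,\dots,S_K} \; \sum_{j=1}^{K} \sum_{x \in S_j} \lVert x - \mu_j \rVert^2,
\qquad
\mu_j = \frac{1}{|S_j|} \sum_{x \in S_j} x
```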

How to do Clustering in Tableau

To create clusters in Tableau, follow these steps:

  • In Tableau, go to the Analytics pane on the left-hand side
  • Drag Cluster from the Analytics pane onto the current canvas view

You can explore your data by dragging Cluster in and out of the view and comparing the results.

The clustering feature has a Describe dialog that gives you summary statistics for each cluster, which helps you understand how Tableau arrived at the clustering results.

Creating Clusters

To find clusters in a view in Tableau, follow these steps:

  1. Create a view.
  2. Drag Cluster from the Analytics pane into the view, and drop it on the target area in the view:

    You can also double-click Cluster to find clusters in the view.

    Two things happen when you drop or double-click Cluster:

    • Tableau adds Clusters on Color, and colors the marks in your view by cluster
    • Tableau displays a Clusters dialog box, where you can customize the cluster:
  3. Customize the cluster results by doing either of the following in the Clusters dialog box:
    • Drag new fields from the Data pane into the Variables area of the Clusters dialog box

      When you add variables, measures are aggregated using the default aggregation for the field; dimensions are aggregated using ATTR, which is the standard way that Tableau aggregates dimensions.

    • Specify the number of clusters. If you do not specify a value, Tableau determines the number of clusters automatically, trying up to 25 clusters. If you specify a value, you can choose any number between 2 and 50.
  4. When you finish customizing the cluster results, click the X in the upper-right corner of the Clusters dialog box to close it:

Note

You can move the cluster field from Color to another shelf in the view. However, you cannot move the cluster field from the Filters shelf to the Data pane.

To edit Clusters you have previously created, right-click (Control-click on a Mac) the Clusters field on Color and choose Edit clusters.

For an example showing the process of creating clusters with sample data (world economic indicators), see Example - Create Clusters from World Economic Indicators Data here: http://onlinehelp.tableau.com/v10.0/pro/desktop/en-us/clustering_example.html.

If you aren't sure how many clusters to use, there's no need to worry. Just as Tableau makes visualization easier by proposing a suitable chart type for your data, it makes analytics easier by recommending the number of clusters. This is particularly helpful if you are simply exploring the data.
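
One common way to compare candidate numbers of clusters is the Calinski-Harabasz index, which Tableau's documentation names as the criterion it uses to assess cluster quality. The sketch below is not Tableau's implementation; it only illustrates the idea. The function names, the sample points, and the two hand-written labelings are assumptions made for the example. A higher score means clusters that are tighter and further apart.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise average of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def calinski_harabasz(points, labels):
    """Ratio of between-cluster to within-cluster variance.

    Requires at least 2 clusters, fewer clusters than points,
    and at least one cluster with some spread (non-zero within sum).
    """
    n = len(points)
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    k = len(clusters)
    overall = mean_point(points)
    # Spread of the cluster centers around the overall mean.
    between = sum(
        len(members) * squared_distance(mean_point(members), overall)
        for members in clusters.values()
    )
    # Spread of the points around their own cluster center.
    within = sum(
        squared_distance(p, mean_point(members))
        for members in clusters.values()
        for p in members
    )
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical sample data: two well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
two_clusters = [0, 0, 0, 1, 1, 1]    # one label per natural group
three_clusters = [0, 0, 2, 1, 1, 1]  # splits one tight group in two

score_2 = calinski_harabasz(points, two_clusters)
score_3 = calinski_harabasz(points, three_clusters)
best_k = 2 if score_2 > score_3 else 3
print(score_2, score_3, best_k)
```

For this sample data the two-cluster labeling scores higher, which matches the intuition that splitting a tight group adds a cluster without improving the separation much.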

Tableau is flexible, and it lets you specify your own clustering settings. For example, you can stipulate the number of clusters that you would like to create. This process is similar to creating bins in Tableau. Bins have to be of equal size, whereas clusters can vary in size. Setting the number yourself may be the preferred option if you have business reasons for wanting to combine things a little differently. Either way, Tableau lets you choose between specifying the number of clusters and having it find that number for you.

Once you identify the clusters, you can assign them more intuitive names based on the summaries for each cluster (which, in this case, can be seen in the scatter plots). In this example, we have three clusters corresponding to developed, developing, and underdeveloped countries based on the four metrics used as the clustering criteria.

You can use clustering results as dimensions in other visualizations. You can even manually override cluster assignments if you have external domain knowledge that you want to incorporate into the results.
