Tableau's strength has always been its user-focused flexibility: it works with the user to deliver insights at the speed of thought. Its clustering functionality continues that tradition of putting the user at the center of the analytics process. For example, Tableau lets us quickly customize geographical areas, which in turn can reveal new insights and patterns within the groups.
Tableau 10.0 includes k-means clustering as a built-in feature. K-means is a popular clustering algorithm that is useful and easy to implement, and it can be faster than some other clustering methods, particularly on large datasets.
We can see the data being grouped, or clustered, around centers. The algorithm starts by choosing the cluster centers randomly. It then finds the nearest cluster center for each data point and assigns the point to that center.
K-means then recalculates the actual center of each cluster and reassigns the data points to the new cluster centers. These steps are repeated until the data points stop changing clusters. The filled shapes represent the center of each cluster.
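The loop described above can be sketched in a few lines of Python. This is a minimal illustration of the general k-means idea for two-dimensional points, not Tableau's internal implementation; the function name kmeans, its parameters, and the sample points are all assumptions made for the example.

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Cluster 2-D points with a plain k-means loop (illustrative only)."""
    rng = random.Random(seed)
    # Step 1: choose k of the data points at random as the initial centers.
    centers = rng.sample(points, k)
    assignment = None
    for _ in range(max_iter):
        # Step 2: assign every point to its nearest center
        # (squared Euclidean distance; ties go to the lower index).
        new_assignment = [
            min(
                range(k),
                key=lambda c: (p[0] - centers[c][0]) ** 2 + (p[1] - centers[c][1]) ** 2,
            )
            for p in points
        ]
        # Stop once no point changes cluster.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 3: move each center to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centers, assignment

# Hypothetical sample data: two loose groups of points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centers, labels = kmeans(points, k=2)
```

Because the starting centers are random, different seeds can lead to different results on less clearly separated data, which is one reason clustering tools often try several starting configurations.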
The k-means procedure splits the data into K segments. Each segment has a centroid that corresponds to the average value for the marks in that segment. The objective of the procedure is to place the centroids so that the total distance between the centroids and the marks in their respective segments is as small as possible.
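Written as a formula, this objective is commonly stated as follows. This is the standard textbook form, which assumes squared Euclidean distance; the symbols S_k (the set of marks in segment k), mu_k (its centroid), and x_i (an individual mark) are introduced here for illustration only.

```latex
J = \sum_{k=1}^{K} \sum_{x_i \in S_k} \lVert x_i - \mu_k \rVert^2,
\qquad
\mu_k = \frac{1}{|S_k|} \sum_{x_i \in S_k} x_i
```

The smaller J is, the more tightly the marks sit around their centroids.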
When you create clusters in Tableau, keep the following in mind:
You can explore your data by dragging Cluster in and out of the view so that you can compare the results.
The clustering feature has a describe dialog that gives you summary statistics for each cluster, to help you understand how Tableau arrived at the clustering results.
To find clusters in a view in Tableau, follow these steps:
You can also double-click Cluster to find clusters in the view.
Two things happen when you drop or double-click Cluster:
When you add variables, measures are aggregated using the default aggregation for the field; dimensions are aggregated using ATTR, which is the standard way that Tableau aggregates dimensions.
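The behavior of ATTR can be pictured with a small Python sketch: it returns the value when every row in the group shares the same one, and an asterisk otherwise. The function name attr is hypothetical, and the sketch ignores details such as null handling.

```python
def attr(values):
    """Mimic the idea behind ATTR: the single shared value, or "*" if values differ."""
    distinct = set(values)
    # Exactly one distinct value means the dimension is constant for this mark.
    return next(iter(distinct)) if len(distinct) == 1 else "*"

same = attr(["Europe", "Europe", "Europe"])   # all rows agree
mixed = attr(["Europe", "Asia"])              # rows disagree
```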
To edit clusters you have previously created, right-click (Control-click on a Mac) the Clusters field on Color and choose Edit clusters.
For an example that walks through creating clusters with sample data (world economic indicators), see Example - Create Clusters from World Economic Indicators Data here: http://onlinehelp.tableau.com/v10.0/pro/desktop/en-us/clustering_example.html.
If you aren't sure how many clusters to use, there's no need to worry. Tableau already makes things easy by suggesting an appropriate visualization for your data, and it makes analytics easier too, by recommending the number of clusters for you. This is particularly helpful if you are simply exploring the data.
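One common way to score how well a given number of clusters fits the data is the Calinski-Harabasz index, which Tableau's documentation names as the criterion it uses to assess cluster quality. The sketch below only illustrates that idea in plain Python; the function name calinski_harabasz and the sample data are assumptions, and it is not Tableau's code. It expects at least two clusters, more points than clusters, and some spread within the clusters.

```python
def calinski_harabasz(points, labels):
    """Ratio of between-cluster to within-cluster variance (higher is better)."""
    n = len(points)
    dims = len(points[0])
    clusters = sorted(set(labels))
    k = len(clusters)
    overall = [sum(p[d] for p in points) / n for d in range(dims)]
    ssb = 0.0  # between-cluster sum of squares
    ssw = 0.0  # within-cluster sum of squares
    for c in clusters:
        members = [p for p, lab in zip(points, labels) if lab == c]
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dims)]
        ssb += len(members) * sum((centroid[d] - overall[d]) ** 2 for d in range(dims))
        ssw += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dims))
    return (ssb / (k - 1)) / (ssw / (n - k))

# Hypothetical sample data: two loose groups of points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
good = calinski_harabasz(points, [0, 0, 0, 1, 1, 1])  # labels follow the two groups
poor = calinski_harabasz(points, [0, 1, 0, 1, 0, 1])  # labels cut across the groups
```

A labeling that follows the natural groups scores higher than one that cuts across them, so comparing the score across several candidate cluster counts gives a way to pick one.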
Tableau is flexible and lets you specify your own clustering settings. For example, you can stipulate the number of clusters that you would like to create. The process is similar to creating bins in Tableau; however, bins have to be of equal size, whereas clusters can vary in size. Specifying the number yourself may be the preferred option if you have business reasons for wanting to combine things a little differently; otherwise, Tableau can find the number for you.
Once you identify the clusters, you can assign them more intuitive names based on the summaries for each cluster (which, in this case, can be seen in the scatter plots). In this example, we have three clusters corresponding to developed, developing, and underdeveloped countries based on the four metrics used as the clustering criteria.
You can use clustering results as dimensions in other visualizations. You can even manually override cluster assignments if you have external domain knowledge that you want to incorporate into the results.