Categories of clustering algorithms

There are numerous clustering algorithms available off the shelf in R. However, each of these algorithms falls into one of two categories:

  • Flat or partitioning algorithms: These algorithms rely on an input parameter that defines the number of clusters to be identified in the dataset. This parameter is sometimes supplied directly by the business, or it can be estimated through statistical methods such as the Elbow method.
  • Hierarchical algorithms: In these algorithms, the clusters are not identified in a single step. They involve multiple steps that range between a single cluster containing all the data points and n clusters each containing a single data point. Hierarchical algorithms can be further divided into the following two types:
    • Divisive type: A top-down clustering method where all points are initially assigned to a single cluster. In the next step, the cluster is split into the two clusters that are least similar to each other. Splitting continues recursively until each point has its own cluster. An example is the DIvisive ANAlysis (DIANA) clustering algorithm.
    • Agglomerative type: A bottom-up approach where, in the initial run, each point in the dataset is assigned to its own cluster, giving n clusters, where n is the number of observations in the dataset. In each subsequent iteration, the most similar clusters are merged (based on the distance between the clusters). Merging continues recursively until we are left with just one cluster. An example is the agglomerative nesting (AGNES) algorithm.
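
The two hierarchical types described above can be sketched in a few lines of R. This is only an illustration under stated assumptions: it uses diana() and agnes() from the cluster package, the built-in iris measurements stand in for real data, and the average linkage and the value k = 3 are arbitrary choices made for the example.

```r
# Sketch only: the built-in iris measurements stand in for a real dataset.
library(cluster)  # provides diana() and agnes()

# Standardize the columns so that no single variable dominates the distances.
x <- scale(iris[, 1:4])

# Divisive (top-down): start from one cluster and split repeatedly.
div <- diana(x, metric = "euclidean")

# Agglomerative (bottom-up): start from n single-point clusters and merge repeatedly.
# Average linkage is an assumption here; other linkage methods are available.
agg <- agnes(x, metric = "euclidean", method = "average")

# Both objects describe a full hierarchy. Cutting the tree yields k flat groups.
k <- 3  # hypothetical choice, for illustration only
div_groups <- cutree(as.hclust(div), k = k)
agg_groups <- cutree(as.hclust(agg), k = k)

# Cluster sizes produced by each approach.
table(div_groups)
table(agg_groups)
```

Because both methods build the whole hierarchy, the number of groups is chosen only when the tree is cut, which is the main practical difference from a flat algorithm.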

As discussed earlier, there are numerous clustering algorithms available, and we will focus on implementing projects using one algorithm for each type of clustering. We will implement a project with k-means, which is a flat or partitioning clustering algorithm. We will then do customer segmentation with DIANA and AGNES, which are divisive and agglomerative, respectively.
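
As a rough preview of the flat approach, the Elbow method mentioned earlier can be sketched with base R's kmeans(). The assumptions here are that the iris measurements again stand in for real data, that candidate values of k from 1 to 8 are enough to show the pattern, and that k = 3 is picked by eye for the final fit.

```r
# Sketch only: the built-in iris measurements stand in for a real dataset.
set.seed(42)  # k-means starts from random centers, so fix the seed
x <- scale(iris[, 1:4])

# Total within-cluster sum of squares for each candidate number of clusters.
k_values <- 1:8
wss <- sapply(k_values, function(k) {
  kmeans(x, centers = k, nstart = 25)$tot.withinss
})

# The "elbow" is the point after which adding clusters stops paying off.
plot(k_values, wss, type = "b",
     xlab = "Number of clusters (k)",
     ylab = "Total within-cluster sum of squares")

# Fit the final model with the chosen k (hypothetical choice for illustration).
fit <- kmeans(x, centers = 3, nstart = 25)
table(fit$cluster)
```

Reading the elbow from the plot is a judgment call, which is why the number of clusters may equally come from the business, as noted above.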
