In this section, we discuss distribution-based clustering and its computational challenges. To make the technique concrete, we then walk through an example that uses Gaussian mixture models (GMMs) with Spark MLlib.