We want to compose feature extraction, data preparation, training, testing, and prediction into a single workflow, while tuning the parameters to obtain the best-performing model.
The following tweet perfectly captures, in five lines of code, a powerful machine learning pipeline implemented in Spark MLlib:
The Spark ML pipeline is inspired by Python's scikit-learn. It expresses the successive transformations applied to the data as a succinct, declarative statement, so that a tunable model can be delivered quickly.
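To make the pipeline idea concrete, here is a minimal, library-free sketch of the pattern in plain Python. It is not the Spark MLlib API: the classes `Tokenizer`, `KeywordClassifier`, `KeywordModel`, `Pipeline`, and `PipelineModel`, the `top_k` parameter, and the toy data are all hypothetical names chosen for illustration, and the keyword classifier is a deliberately trivial stand-in for a real learning algorithm. The sketch only mirrors the general design in which transformers expose `transform`, estimators expose `fit` and return a fitted model, and a tuning parameter is selected by scoring each candidate on held-out data.

```python
from collections import Counter


class Tokenizer:
    """Transformer (hypothetical): adds a 'words' field from the 'text' field."""
    def transform(self, rows):
        return [dict(r, words=r["text"].lower().split()) for r in rows]


class KeywordModel:
    """Fitted model: predicts 1 if any learned keyword appears in 'words'."""
    def __init__(self, keywords):
        self.keywords = keywords

    def transform(self, rows):
        return [dict(r, prediction=int(any(w in self.keywords for w in r["words"])))
                for r in rows]


class KeywordClassifier:
    """Estimator (toy stand-in): keeps up to top_k words that favor label 1."""
    def __init__(self, top_k=1):
        self.top_k = top_k

    def fit(self, rows):
        counts = Counter()
        for r in rows:
            for w in sorted(set(r["words"])):  # sorted for deterministic ties
                counts[w] += 1 if r["label"] == 1 else -1
        keywords = {w for w, c in counts.most_common(self.top_k) if c > 0}
        return KeywordModel(keywords)


class PipelineModel:
    """Fitted pipeline: applies every fitted stage in order."""
    def __init__(self, stages):
        self.stages = stages

    def transform(self, rows):
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows


class Pipeline:
    """Declarative list of stages; fit() trains estimators in sequence."""
    def __init__(self, stages):
        self.stages = stages

    def fit(self, rows):
        fitted = []
        for stage in self.stages:
            if hasattr(stage, "fit"):      # estimator: replace with fitted model
                stage = stage.fit(rows)
            rows = stage.transform(rows)   # feed output to the next stage
            fitted.append(stage)
        return PipelineModel(fitted)


def accuracy(rows):
    return sum(r["prediction"] == r["label"] for r in rows) / len(rows)


# Toy data, invented for this illustration only.
train = [
    {"text": "Spark is great", "label": 1},
    {"text": "Spark ML pipeline", "label": 1},
    {"text": "I love Spark", "label": 1},
    {"text": "bad weather today", "label": 0},
    {"text": "weather is bad", "label": 0},
]
held_out = [
    {"text": "Spark pipeline rocks", "label": 1},
    {"text": "great weather", "label": 0},
    {"text": "bad day", "label": 0},
    {"text": "ML with Spark", "label": 1},
]

# Simple grid search over the tuning parameter top_k.
best_k, best_model, best_acc = None, None, -1.0
for k in (1, 2, 3):
    model = Pipeline([Tokenizer(), KeywordClassifier(top_k=k)]).fit(train)
    acc = accuracy(model.transform(held_out))
    if acc > best_acc:
        best_k, best_model, best_acc = k, model, acc

print(best_k, best_acc)
```

The point of the design is that the whole workflow is declared once as a list of stages, so trying a different parameter value means rebuilding and refitting the same declaration rather than rewriting the processing steps by hand.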