Summary

In this chapter, we surveyed Spark MLlib's ever-expanding library of algorithms. We discussed supervised and unsupervised learning, recommender systems, optimization, and feature extraction algorithms. We then ran the data harvested from Twitter through the machine learning workflow, applying algorithms and evaluating the results, to derive insights from it. Specifically, we clustered the Twitter dataset with K-means in both Python Scikit-Learn and Spark MLlib in order to separate out the tweets relevant to Apache Spark. We also evaluated the performance of the model.

This gets us ready for the next chapter, which will cover Streaming Analytics using Spark. Let's jump right in.
