Summary

This ends our rather quick tour of machine learning and the best practices that need to be followed. Although we have tried to cover the most basic points, the main one to remember is that suitable data often beats a better algorithm. Most importantly, designing good features from your data might take a long time, but it will pay off. With a large-scale dataset, however, the particular classification, clustering, or regression algorithm you choose may matter less for predictive performance.

Therefore, it is wise to choose a machine learning algorithm that meets your requirements for speed, memory usage, throughput, scalability, or usability. Beyond what we said in the preceding sections, if accuracy is your main concern, you should try a group of different classifiers and pick the best one using cross-validation, or use an ensemble method to combine them all.
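To make the idea of comparing classifiers by cross-validation concrete, here is a minimal sketch in plain Python rather than Spark. Everything in it is an illustrative assumption and not taken from this chapter: the toy dataset, the two stand-in classifiers (MajorityClassifier and NearestCentroidClassifier), and the helper names. In a Spark application you would more likely rely on the library's own tuning utilities instead of hand-rolled code like this.

```python
from collections import Counter
from statistics import mean


def k_fold_indices(n, k):
    """Split the indices 0..n-1 into k interleaved folds."""
    return [list(range(i, n, k)) for i in range(k)]


class MajorityClassifier:
    """Hypothetical baseline: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, x):
        return self.label


class NearestCentroidClassifier:
    """Hypothetical stand-in: predicts the label whose mean point is closest."""

    def fit(self, X, y):
        groups = {}
        for features, label in zip(X, y):
            groups.setdefault(label, []).append(features)
        # One centroid per label: the column-wise mean of that label's rows.
        self.centroids = {
            label: [mean(column) for column in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict(self, x):
        def squared_distance(centroid):
            return sum((a - b) ** 2 for a, b in zip(x, centroid))

        return min(self.centroids, key=lambda lbl: squared_distance(self.centroids[lbl]))


def cross_val_accuracy(make_model, X, y, k=4):
    """Mean accuracy over k folds; each fold is held out exactly once."""
    fold_scores = []
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = make_model().fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(model.predict(X[i]) == y[i] for i in test_idx)
        fold_scores.append(correct / len(test_idx))
    return mean(fold_scores)


# Toy data (an assumption for illustration): two well-separated clusters.
X = [[1.0, 1.2], [0.8, 1.0], [1.1, 0.9], [0.9, 1.1],
     [3.0, 3.2], [3.1, 2.9], [2.9, 3.0], [3.2, 3.1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

candidates = {
    "majority": MajorityClassifier,
    "nearest_centroid": NearestCentroidClassifier,
}
scores = {name: cross_val_accuracy(cls, X, y) for name, cls in candidates.items()}
best = max(scores, key=scores.get)
print(scores, "best:", best)
```

The point of the design is that every candidate is scored on exactly the same held-out folds, so the comparison is fair. An ensemble would instead combine the candidates' predictions, for example by majority vote, rather than keeping only the winner.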

You can also draw motivation and take a lesson from the Netflix Prize PLUS. We spoke at length about the Spark machine learning APIs, widely used best practices in ML application development, machine learning tasks and classes, and so on. However, we have not analyzed the machine learning techniques in depth. We intend to discuss them in more detail in Chapter 4, Extracting Knowledge through Feature Engineering.

In the next chapter, we will cover in detail the DataFrame, Dataset, and Resilient Distributed Dataset (RDD) APIs for working with structured data, with the aim of providing a basic understanding of machine learning problems with the available data. By the end of that chapter, you will be able to apply data manipulation techniques, from basic to complex, with ease.
