This chapter illustrated popular machine learning algorithms with examples. It began with a brief introduction to linear and logistic regression, using the college acceptance data for linear regression and the Titanic survivors data for logistic regression, and showed how the statsmodels.formula.api, pandas, and sklearn.linear_model packages support these regression methods. In both examples, matplotlib was used for visualization.
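As a reminder of the basic pattern, a linear regression fit with sklearn.linear_model looks like the following minimal sketch. The exam-score and GPA numbers here are invented for illustration and are not the chapter's college acceptance data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data (made up for illustration): exam score vs. GPA.
scores = np.array([55, 62, 70, 78, 85, 91]).reshape(-1, 1)
gpa = np.array([2.1, 2.4, 2.9, 3.2, 3.6, 3.9])

# Fit an ordinary least-squares line gpa = a * score + b.
model = LinearRegression()
model.fit(scores, gpa)

# Predict the GPA for a new score; plotting the line with
# matplotlib would follow the same pattern as in the chapter.
predicted = model.predict(np.array([[75]]))
print(model.coef_[0], model.intercept_, predicted[0])
```

The same fit can be expressed as an R-style formula (`gpa ~ score`) with statsmodels.formula.api, which additionally reports p-values and confidence intervals.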
You learned about decision trees. Using the sports example (golf and tennis), we built a decision tree with the sklearn and pydot packages. We then discussed Bayes' theorem and the Naïve Bayes classifier and, using the TextBlob package with the movie reviews data from the nltk corpora, visualized the results as a word cloud with the wordcloud package.
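The Naïve Bayes idea can be sketched with scikit-learn's MultinomialNB in place of TextBlob (an assumption made here for brevity; the chapter itself used TextBlob and the nltk movie reviews corpus, while the sentences below are invented):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny invented review set standing in for the movie reviews corpus.
reviews = [
    "a wonderful, moving film", "great acting and a great story",
    "wonderful direction", "a dull, boring mess",
    "boring plot and terrible acting", "terrible, dull film",
]
labels = ["pos", "pos", "pos", "neg", "neg", "neg"]

# Bag-of-words counts feed the multinomial Naive Bayes model,
# which applies Bayes' theorem with a word-independence assumption.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(reviews)
clf = MultinomialNB().fit(X, labels)

print(clf.predict(vectorizer.transform(["a wonderful story"])))
```

TextBlob's NaiveBayesClassifier wraps the same idea behind a simpler train-on-(text, label)-pairs interface.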
You learned about the k-nearest neighbors algorithm through an example that classified fruits by their weight and shape, separating the classes visually by color in the plot.
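A minimal k-NN sketch in the same spirit, with invented weight and roundness values (the chapter's actual fruit data may differ):

```python
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical fruit features: [weight in grams, roundness score 0-1].
X = [[150, 0.90], [170, 0.95], [160, 0.92],   # apples: heavier, rounder
     [120, 0.40], [110, 0.35], [130, 0.45]]   # bananas: lighter, elongated
y = ["apple", "apple", "apple", "banana", "banana", "banana"]

# Classify a new fruit by majority vote among its 3 nearest neighbors.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, y)

print(knn.predict([[155, 0.88]]))  # neighbors are all apples
```

Because distance is computed over raw features, the weight column dominates here; in practice the features would be standardized first.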
We also looked at SVM in its simplest form, generating sample data, classifying it with the sklearn.svm package, and plotting the results with the matplotlib library. You learned about PCA, how to detect redundancy, and how to eliminate some of the variables; we used the iris example with the sklearn.preprocessing library to see how to visualize the results. Finally, we looked at k-means clustering on randomly generated points using sklearn.cluster, as that is the simplest way to achieve clustering (with minimal code). In the next chapter, we will discuss various examples from bioinformatics, genetics, and networks.