What this book covers

Here is a chapter-by-chapter list of changes compared with the second edition.

Chapter 1, Preparing and Understanding Data, covers the loading of data and demonstrates how to obtain an understanding of its structure and dimensions, as well as how to install the necessary packages.

Chapter 2, Linear Regression, contains improved code and better charts; otherwise, it remains relatively close to the original.

Chapter 3, Logistic Regression, contains improved and streamlined code. One of my favorite techniques, multivariate adaptive regression splines, has been added. This technique performs well, handles non-linearity, and is easy to explain. I use it as my baseline model.

Chapter 4, Advanced Feature Selection in Linear Models, includes techniques not only for regression but also for classification problems.

Chapter 5, K-Nearest Neighbors and Support Vector Machines, includes streamlined and simplified code.

Chapter 6, Tree-Based Classification, is augmented by the very popular techniques provided by the XGBoost package. Using a random forest as a feature selection tool is also covered.

Chapter 7, Neural Networks and Deep Learning, has been updated with additional information on deep learning methods and includes improved code for the H2O package, including hyperparameter search.

Chapter 8, Creating Ensembles and Multiclass Methods, has completely new content that draws on several great packages.

Chapter 9, Cluster Analysis, adds the methodology for unsupervised learning with random forests.

Chapter 10, Principal Component Analysis, uses a different dataset and adds an out-of-sample prediction.

Chapter 11, Association Analysis, explains association analysis, which applies not only to making recommendations, product placement, and promotional pricing but also to manufacturing, web usage, and healthcare.

Chapter 12, Time Series and Causality, includes a couple of additional years of climate data, along with a demonstration of different causality test methods.

Chapter 13, Text Mining, includes additional data and improved code.

Appendix, Creating a Package, includes additional data packages.
