Part 2. Modeling methods

In part 1, we discussed the initial stages of a data science project. After you’ve defined more precisely the questions you want to answer and the scope of the problem you want to solve, it’s time to analyze the data and find the answers. In part 2, we work with powerful modeling methods from statistics and machine learning.

Chapter 5 covers how to identify appropriate modeling methods to address your specific business problem. It also discusses how to evaluate the quality and effectiveness of models that you or others have discovered. The remaining chapters in part 2 cover specific modeling techniques.

Chapter 6 covers what we call memorization-based techniques. These methods make predictions based primarily on summary statistics of your data. We cover lookup tables, nearest-neighbor methods, Naive Bayes classification, and decision trees. Chapter 7 covers methods that fit simple functions with additive functional structure: linear and logistic regression. These two methods not only make predictions, but also provide you with information about the relationship between the input variables and the outcome.

Chapter 8 covers unsupervised methods: clustering and association rule mining. Unsupervised methods don’t make explicit outcome predictions; they discover relationships and hidden structure in the data. Chapter 9 touches on some more advanced modeling algorithms. We discuss bagged decision trees and random forests, generalized additive models, kernels, and support vector machines.

We work through every method that we cover using a specific data science problem and a nontrivial dataset. In each chapter, we also discuss additional model evaluation procedures that are specific to the methods that we cover.

On completing part 2, you’ll be familiar with the most popular modeling methods, and you’ll have a sense of which methods are most appropriate for answering different types of questions.
