Summary

This is the author speaking here. Did you solve the mystery of the revenue drop? You did not, I guess. Nevertheless, you took some relevant steps on your journey to learning how to use R for data mining.

In this chapter, you learned both concepts and practical techniques, and you now possess medium-level skills in defining and measuring the performance of data mining models.

Andy first explained to you what we mean by model performance and how this concept relates to model interpretability and to the purposes for which the model was estimated.

You then learned what the main model metrics are for both regression and classification problems.

First, you were introduced to the relevant concepts of error, mean squared error, and R-squared.

Regarding this last statistic, you also carefully analyzed its meaning and the common misconceptions about it. I strongly advise you to keep these misconceptions firmly in mind, since R-squared is often employed in everyday professional practice in the data mining field. Knowing its exact meaning will help you avoid potentially painful errors.

From an operational point of view, you learned:

How to compute the mean squared error in R from the residuals object returned by every lm() and lm-like function
Where to find the R-squared statistic and the adjusted R-squared, which is computed to make R-squared values comparable across models with different numbers of explanatory variables
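
As a quick reminder of these two operations, here is a minimal sketch. The built-in mtcars data set and the formula are assumptions chosen only for illustration; they are not the data used in the chapter.

```r
# Fit a linear model on the built-in mtcars data
# (data set and formula are illustrative assumptions).
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Mean squared error: the average of the squared residuals,
# taken from the residuals element of the lm object.
mse <- mean(fit$residuals^2)

# R-squared and adjusted R-squared are stored in the model summary.
fit_summary   <- summary(fit)
r_squared     <- fit_summary$r.squared
adj_r_squared <- fit_summary$adj.r.squared

print(c(mse = mse, r_squared = r_squared, adj_r_squared = adj_r_squared))
```

The adjusted value is never larger than the plain R-squared, because it penalizes each additional explanatory variable.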

Regarding classification problems, you learned what a confusion matrix is and how to compute it with the table() function. Concepts such as true positives and true negatives follow naturally from the discussion of this matrix.
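
The sketch below recalls how table() can build such a matrix. The observed and predicted vectors are hypothetical values made up for illustration, and the level names "yes" and "no" are assumptions as well.

```r
# Hypothetical observed and predicted classes for ten cases
# (values invented for illustration only).
observed <- factor(
  c("yes", "yes", "yes", "yes", "no", "no", "no", "no", "no", "yes"),
  levels = c("no", "yes")
)
predicted <- factor(
  c("yes", "yes", "no", "yes", "no", "no", "yes", "no", "no", "yes"),
  levels = c("no", "yes")
)

# Cross-tabulate predictions (rows) against observations (columns).
confusion <- table(predicted = predicted, observed = observed)
print(confusion)

# Read the four cells by name: row = predicted, column = observed.
true_positive  <- confusion["yes", "yes"]
true_negative  <- confusion["no", "no"]
false_positive <- confusion["yes", "no"]
false_negative <- confusion["no", "yes"]
```

With these made-up vectors the matrix holds four true positives, four true negatives, one false positive, and one false negative.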

You then looked at some of the performance statistics that can be drawn from the confusion matrix:

Accuracy, which measures the overall performance of the model as the share of its predictions that were right
Sensitivity, which measures how good the model is at predicting a positive outcome, that is, the share of actual positives it identifies
Specificity, which measures how good the model is at predicting a negative outcome, that is, the share of actual negatives it identifies
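
These three statistics can be sketched from the four cells of a two-class confusion matrix. The counts below are hypothetical, and the variable names are assumptions introduced only for this example.

```r
# Hypothetical cell counts of a two-class confusion matrix.
tp <- 40  # true positives: positives predicted as positive
tn <- 45  # true negatives: negatives predicted as negative
fp <- 5   # false positives: negatives predicted as positive
fn <- 10  # false negatives: positives predicted as negative

# Accuracy: share of all predictions that were right.
accuracy <- (tp + tn) / (tp + tn + fp + fn)

# Sensitivity: share of actual positives correctly predicted.
sensitivity <- tp / (tp + fn)

# Specificity: share of actual negatives correctly predicted.
specificity <- tn / (tn + fp)

print(c(accuracy = accuracy,
        sensitivity = sensitivity,
        specificity = specificity))
```

With these counts accuracy is 0.85, sensitivity is 0.8, and specificity is 0.9, which shows how a single accuracy figure can hide a difference between the two classes.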
