Importance of predictors

As we have already said, interpretability is a relevant topic when talking about data mining models. Decision trees are highly interpretable, since you can describe them as a sequence of decisions on the different explanatory variables that leads to a final prediction. What about random forests? How would you describe the combined effect of 400 random decision trees on the final prediction? To address this concern, two measures were defined to evaluate the importance of every explanatory variable:

  • The mean decrease in accuracy
  • The mean decrease in the Gini index
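
To make the two measures more concrete, here is a minimal sketch in plain Python. It is not how any particular random forest library computes them: a real forest averages these quantities over all of its trees and, for the accuracy measure, typically works on out-of-bag observations. The toy data, the `toy_model` function, and all helper names are hypothetical and only serve the illustration.

```python
import random
from collections import Counter


def gini(labels):
    """Gini index (impurity) of a list of class labels: 1 - sum of p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(rows, labels, column):
    """Drop in Gini index when a node is split on a binary column (0 vs 1)."""
    left = [label for row, label in zip(rows, labels) if row[column] == 0]
    right = [label for row, label in zip(rows, labels) if row[column] == 1]
    n = len(labels)
    children = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - children


def accuracy(model, rows, labels):
    """Share of rows whose predicted label matches the true label."""
    hits = sum(model(row) == label for row, label in zip(rows, labels))
    return hits / len(rows)


def accuracy_decrease(model, rows, labels, column, repeats=20, seed=0):
    """Mean drop in accuracy after randomly shuffling one column."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    drops = []
    for _ in range(repeats):
        shuffled = [row[column] for row in rows]
        rng.shuffle(shuffled)
        permuted = [row[:column] + [value] + row[column + 1:]
                    for row, value in zip(rows, shuffled)]
        drops.append(baseline - accuracy(model, permuted, labels))
    return sum(drops) / repeats


# Hypothetical toy data: column 0 determines the label, column 1 is noise.
rows = [[i % 2, (i // 2) % 2] for i in range(40)]
labels = ["yes" if row[0] == 1 else "no" for row in rows]


def toy_model(row):
    """Hypothetical stand-in for a fitted model; it only looks at column 0."""
    return "yes" if row[0] == 1 else "no"


gini_col0 = gini_decrease(rows, labels, column=0)
gini_col1 = gini_decrease(rows, labels, column=1)
acc_col0 = accuracy_decrease(toy_model, rows, labels, column=0)
acc_col1 = accuracy_decrease(toy_model, rows, labels, column=1)

print("Gini decrease:    ", gini_col0, gini_col1)
print("Accuracy decrease:", acc_col0, acc_col1)
```

Under both measures the informative column scores clearly higher than the noise column: splitting on it removes all impurity, and shuffling it destroys the model's accuracy, while the noise column contributes nothing either way.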

As you know, starting from a confusion matrix you can derive a great variety of metrics, such as sensitivity, accuracy, and so on. But how do we establish whether a model is a good one? What counts as a good value for each of these metrics? There is no universal answer; it always depends on the specific problem we are addressing.
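
As a reminder of how such metrics follow from the four cells of a confusion matrix, here is a small sketch. The counts are invented for illustration and do not come from the model discussed in this chapter; the function name is likewise hypothetical.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Derive common metrics from the four cells of a binary confusion matrix.

    tp: true positives, fp: false positives,
    fn: false negatives, tn: true negatives.
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,    # share of correct predictions
        "sensitivity": tp / (tp + fn),    # share of actual positives found
        "specificity": tn / (tn + fp),    # share of actual negatives found
        "precision": tp / (tp + fp),      # share of predicted positives that are right
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = confusion_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The same matrix yields an accuracy of 0.7 but a sensitivity of only about 0.667, which is why no single number can tell you whether the model is good for your problem.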

For instance, for the problem we are facing, since we are employing different models to solve the same problem, one relevant point could be comparing the performance of all models in terms of classification. We will get back to this when dealing with ensemble learning.
