Summary

In this chapter, we looked at using probabilistic linear models to predict a qualitative response with two generalized linear model methods: logistic regression and multivariate adaptive regression splines. We explored using weight of evidence and information value as a technique for univariate feature selection. We covered the concept of finding the probability threshold that minimizes classification error. Additionally, we began using performance metrics such as AUC and log-loss, along with ROC charts, to compare models both visually and statistically. These metrics proved to be more informative than accuracy alone, especially when class labels are highly imbalanced. In the next chapter, we'll cover regularization methods for feature selection and how they can be used in training your algorithms. We'll also create a dataset, look at ridge regression, and dive deeper into feature selection.
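
As a brief recap of the threshold and metric ideas, here is a minimal sketch in plain Python. The labels and predicted probabilities are made-up toy values, and the helper names (`log_loss`, `error_rate`, `best_threshold`) are hypothetical ones chosen for this illustration rather than functions from any package used in the chapter.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Average negative log-likelihood of the true labels."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def error_rate(y_true, y_prob, threshold):
    """Share of observations misclassified when predicting 1 if p >= threshold."""
    wrong = sum(1 for y, p in zip(y_true, y_prob) if (p >= threshold) != (y == 1))
    return wrong / len(y_true)

def best_threshold(y_true, y_prob, candidates):
    """Candidate threshold with the lowest error (first one wins on ties)."""
    return min(candidates, key=lambda t: error_rate(y_true, y_prob, t))

# Toy, made-up data: 2 positives out of 10, so classes are imbalanced.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_prob = [0.05, 0.10, 0.10, 0.15, 0.20, 0.20, 0.25, 0.30, 0.40, 0.60]

candidates = [i / 100 for i in range(5, 100, 5)]  # 0.05, 0.10, ..., 0.95
chosen = best_threshold(y_true, y_prob, candidates)

print("error at 0.50:", error_rate(y_true, y_prob, 0.50))
print("chosen threshold:", chosen, "error:", error_rate(y_true, y_prob, chosen))
print("log-loss:", round(log_loss(y_true, y_prob), 4))
```

On this toy data, the default 0.5 cutoff misses one of the two positives, while a lower cutoff separates the classes cleanly. Note also that always predicting the majority class would already be 80% accurate here, which is why accuracy alone can mislead when labels are imbalanced.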
