This concludes the description and implementation of linear and logistic regression, and of regularization as a way to reduce overfitting. Your first analytical projects using machine learning will (or did) likely involve a regression model of some type. Regression models, along with Naïve Bayes classification, are the techniques most readily understood by those without a deep knowledge of statistics or machine learning.
At the completion of this chapter, you should have a grasp of the following:
Logistic regression is also the foundation of the conditional random fields introduced in the next chapter and of the artificial neural networks covered in Chapter 9, Artificial Neural Networks.
Unlike the Naïve Bayes models (refer to Chapter 5, Naïve Bayes Classifiers), least squares and logistic regression do not require the features to be independent. However, regression models do not take into account the sequential nature of a time series such as asset prices. The next chapter, Chapter 7, Sequential Data Models, describes two classifiers that take into account the time dependency in a time series.