This concludes the description and implementation of linear and logistic regression, and of regularization as a technique to reduce overfitting. Your first analytical projects using machine learning will (or did) likely involve a regression model of some kind. Regression models, along with Naïve Bayes classifiers, are the most widely understood techniques among practitioners without a deep background in statistics or machine learning.
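As a compact reminder of how these pieces fit together, here is a minimal sketch (plain Python rather than the chapter's implementation; all names and hyperparameter values are illustrative) of a logistic regression trained by batch gradient descent with an L2 ridge penalty, the regularization scheme discussed in this chapter:

```python
import math

def train_logistic(xs, ys, lam=0.1, eta=0.5, epochs=500):
    """Fit a logistic regression with an L2 (ridge) penalty by
    batch gradient descent. xs: lists of features, ys: 0/1 labels."""
    n, d = len(xs), len(xs[0])
    w = [0.0] * (d + 1)                 # w[0] is the bias term
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for x, y in zip(xs, ys):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
            err = 1.0 / (1.0 + math.exp(-z)) - y    # sigmoid(z) - y
            grad[0] += err
            for j in range(d):
                grad[j + 1] += err * x[j]
        for j in range(1, d + 1):       # L2 term shrinks the weights to
            grad[j] += lam * w[j]       # reduce overfitting (bias excluded)
        for j in range(d + 1):
            w[j] -= eta * grad[j] / n
    return w

def predict(w, x):
    """Classify x as 1 when the predicted probability reaches 0.5."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1 if 1.0 / (1.0 + math.exp(-z)) >= 0.5 else 0

# Toy usage: two well-separated clusters of points
xs = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 0.8]]
ys = [0, 0, 1, 1]
w = train_logistic(xs, ys)
```

The penalty factor `lam` plays the role of the regularization weight: larger values shrink the weights more aggressively, trading some training accuracy for robustness on unseen data.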
After completing this chapter, you should have a grasp of linear regression, logistic regression, and regularization as a means of curbing overfitting.
Logistic regression is also the foundation of conditional random fields, described in the Conditional random fields section of Chapter 7, Sequential Data Models, and of multilayer perceptrons, introduced in The multilayer perceptron section of Chapter 9, Artificial Neural Networks.
Contrary to the Naïve Bayes models (refer to Chapter 5, Naïve Bayes Classifiers), least squares and logistic regression do not require the features to be independent. However, regression models do not take into account the sequential nature of a time series such as asset pricing. The next chapter, dedicated to models for sequential data, introduces two classifiers that account for the time dependency in a time series: the hidden Markov model and conditional random fields.