Dimensionality reduction

Dimensionality reduction is the process of converting a set of data with many variables into data with lesser dimensions but ensuring similar information. It can help improve model accuracy and performance, improve interpretability, and prevent overfitting. The Statistics and Machine Learning Toolbox includes many algorithms and functions for reducing the dimensionality of our datasets. It can be divided into feature selection and feature extraction. Feature selection approaches try to find a subset of the original variables. Feature extraction reduces the dimensionality in the data by transforming data into new features.

As already mentioned, feature selection finds only the subset of measured features (predictor variables) that give the best predictive performance in modeling the data. The Statistics and Machine Learning Toolbox includes many feature selection methods, as follows:

  • Stepwise regression: Adds or removes features until there is no improvement in prediction accuracy. Especially suited for linear regression or generalized linear regression algorithms.
  • Sequential feature selection: Equivalent to stepwise regression, this can be applied with any supervised learning algorithm.
  • Selecting features for classifying high-dimensional data.
  • Boosted and bagged decision trees: Calculate the variable's importance from out-of-bag errors.
  • Regularization: Remove redundant features by reducing their weights to zero.

Otherwise, feature extraction transforms existing features into new features (predictor variables) where less-descriptive features can be ignored.

The Statistics and Machine Learning Toolbox includes many feature extraction methods, as follows:

  • PCA: This can be applied to summarize data in fewer dimensions by projection onto a unique orthogonal basis
  • Non-negative matrix factorization: This can be applied when model terms must represent non-negative quantities
  • Factor analysis: This can be applied to build explanatory models of data correlations

The following are step-wise regression example charts:

Figure 1.20: Step-wise regression example
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset