Linear discriminant analysis

Strictly speaking, linear discriminant analysis (LDA) is a classifier (a classical statistical method developed by Ronald Fisher, the father of modern statistics), but it is often used for dimensionality reduction. Like many statistical methods, it doesn't scale well to larger datasets, but it is worth trying because it can sometimes bring better results than other classification methods such as logistic regression. Since it's a supervised approach, it requires the label set to optimize the reduction step. LDA outputs linear combinations of the input features, chosen to best discriminate between the classes; as a consequence, it can produce at most one component fewer than the number of classes. Compared to PCA, which ignores the labels, the output dataset obtained with LDA shows a neater separation between the classes. However, it cannot be used in regression problems, since it is derived from a classification process.
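Because LDA is both a classifier and a reducer, the same fitted object can predict labels and project the data. The following is a minimal sketch of that dual role, assuming a recent scikit-learn release where the class lives in sklearn.discriminant_analysis; the 70/30 split and the random_state value are arbitrary choices made for illustration, not something prescribed by the method:

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

iris = load_iris()

# Hold out 30% of the samples (an arbitrary, illustrative split)
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0,
    stratify=iris.target)

# Fit once; the labels are required because LDA is supervised
lda = LinearDiscriminantAnalysis()
lda.fit(X_train, y_train)

# Use as a classifier: mean accuracy on the held-out samples
accuracy = lda.score(X_test, y_test)

# Use as a reducer: with 3 classes, at most 2 components are available
X_test_reduced = lda.transform(X_test)

print("Accuracy:", accuracy)
print("Reduced shape:", X_test_reduced.shape)
```

Leaving n_components unset lets the estimator keep every available discriminant axis, which for the three Iris classes means two columns.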

Here's the application of LDA on the Iris dataset:

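In: import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
iris = load_iris()
# Project the four Iris features onto two discriminant components
lda_2c = LDA(n_components=2)
X_lda_2c = lda_2c.fit_transform(iris.data, iris.target)
# Color each point by its class label
plt.scatter(X_lda_2c[:,0], X_lda_2c[:,1],
            c=iris.target, alpha=0.8, edgecolors='none')
plt.show()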

The resulting scatterplot shows the first two components generated by LDA, with each point colored by its class.
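The claim that LDA separates the classes more neatly than PCA can also be checked numerically. The sketch below projects Iris to two dimensions with both methods and compares the ratio of between-class to within-class scatter, the quantity that Fisher's criterion maximizes. The fisher_ratio helper is a hypothetical function written for this illustration, not part of scikit-learn:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis


def fisher_ratio(Z, y):
    """Hypothetical helper: trace(S_w^-1 S_b) of the projected data Z.

    Larger values mean the class means are far apart relative to
    the spread of the points inside each class.
    """
    overall_mean = Z.mean(axis=0)
    n_dims = Z.shape[1]
    S_w = np.zeros((n_dims, n_dims))  # within-class scatter
    S_b = np.zeros((n_dims, n_dims))  # between-class scatter
    for label in np.unique(y):
        Z_c = Z[y == label]
        mean_c = Z_c.mean(axis=0)
        centered = Z_c - mean_c
        S_w += centered.T @ centered
        diff = (mean_c - overall_mean).reshape(-1, 1)
        S_b += len(Z_c) * (diff @ diff.T)
    return np.trace(np.linalg.solve(S_w, S_b))


iris = load_iris()

# PCA ignores the labels; LDA uses them to choose its axes
X_pca = PCA(n_components=2).fit_transform(iris.data)
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(
    iris.data, iris.target)

pca_ratio = fisher_ratio(X_pca, iris.target)
lda_ratio = fisher_ratio(X_lda, iris.target)
print("PCA separation:", pca_ratio)
print("LDA separation:", lda_ratio)
```

A higher value for the LDA projection is expected here, since this ratio is what LDA optimizes while PCA only preserves overall variance.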
