Limitations of PCA and how LDA can help

Being a linear method, PCA reaches its limits when the data has nonlinear relationships. We won't go into details here, but suffice it to say that there are extensions of PCA, for example, Kernel PCA, that introduce nonlinear transformations so that we can still use the PCA approach.

Another interesting weakness of PCA shows up when it is applied to certain classification problems. If we replace good = (x1 > 5) | (x2 > 5) with good = x1 > x2 to simulate such a case, the problem quickly becomes apparent, as the following diagram shows:

Here, the classes are separated not along the axis with the highest variance, but along the axis with the second highest variance. Clearly, PCA falls flat on its face. As we don't provide PCA with any cues regarding class labels, it cannot do any better.
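The effect can also be checked numerically. The following is a minimal sketch, not the original example: the construction of x1, x2, and X (two strongly correlated features with Gaussian noise) and the fixed random seed are assumptions, since this section only states how good is defined.

```python
import numpy as np
from sklearn.decomposition import PCA

# Assumed toy data: two strongly correlated features (not given in this section).
rng = np.random.default_rng(0)
x1 = np.arange(0, 10, 0.2)
x2 = x1 + rng.normal(loc=0, scale=1, size=len(x1))
X = np.c_[x1, x2]
good = x1 > x2  # the class depends only on the sign of x1 - x2

# Keep only the direction of highest variance, ignoring the labels.
pca = PCA(n_components=1)
Xtrans = pca.fit_transform(X)

# Measure how closely the first principal axis follows the diagonal (1, 1).
diagonal = np.array([1.0, 1.0]) / np.sqrt(2)
alignment = abs(pca.components_[0] @ diagonal)

print("variance explained by first component:", pca.explained_variance_ratio_[0])
print("alignment with the diagonal:", alignment)
# The classes differ along (1, -1), so their projected means should be close.
print("projected class means:", Xtrans[good].mean(), Xtrans[~good].mean())
```

If the first axis lies close to the diagonal, the projection discards exactly the direction in which x1 > x2 is decided, so the two classes overlap in Xtrans.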

Linear discriminant analysis (LDA) comes to the rescue here. It's a method that tries to maximize the distance between points belonging to different classes while minimizing the distance between points of the same class. We won't give any more details regarding how exactly the underlying theory works, just a quick tutorial on how to use it, as shown in the following code:

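>>> from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
>>> # With two classes, LDA yields at most one discriminant axis
>>> lda_inst = LinearDiscriminantAnalysis(n_components=1)
>>> # Unlike PCA, fitting needs the class labels (good) in addition to X
>>> Xtrans = lda_inst.fit_transform(X, good)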

That's all. Note that, in contrast to the previous PCA example, we provide class labels to the fit_transform() method. Hence, PCA is an unsupervised feature projection method, whereas LDA is a supervised one. The result looks as expected:
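One way to check the result without a plot is the following self-contained sketch. The data construction and the random seed are assumptions made for illustration (this section only defines good = x1 > x2), and the class is imported from sklearn.discriminant_analysis, which is where current scikit-learn releases provide LDA.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Assumed toy data: two strongly correlated features (not given in this section).
rng = np.random.default_rng(0)
x1 = np.arange(0, 10, 0.2)
x2 = x1 + rng.normal(loc=0, scale=1, size=len(x1))
X = np.c_[x1, x2]
good = x1 > x2

# Supervised projection: the labels are passed alongside the features.
lda_inst = LinearDiscriminantAnalysis(n_components=1)
Xtrans = lda_inst.fit_transform(X, good)

# Training accuracy of the fitted LDA model used as a classifier.
accuracy = lda_inst.score(X, good)
# Distance between the projected class means along the single LDA axis.
gap = abs(Xtrans[good].mean() - Xtrans[~good].mean())

print("training accuracy:", accuracy)
print("gap between projected class means:", gap)
```

A high training accuracy here only shows that the single LDA axis keeps the classes apart on this toy data; it says nothing about how well the projection holds up on unseen samples.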

So, why consider PCA at all? Why not simply use LDA? Well, it's not that simple. As the number of classes grows and the number of samples per class shrinks, LDA does not look that good anymore. Also, PCA seems to be less sensitive to different training sets than LDA. So, when we have to advise which method to use, we can only say that it depends.
