Principal component analysis

Principal component analysis (PCA) is often the first method to try when you want to cut down the number of features and do not know which feature projection method to use. PCA is limited in that it is a linear method, but chances are that it already goes far enough for your model to learn well. Add to this its strong mathematical properties, the speed at which it finds the transformed feature space, and the speed at which it can later map between the original and transformed features, and we can almost guarantee that it will become one of your frequently used machine learning tools.

To summarize, given the original feature space, PCA finds a linear projection of it onto a lower-dimensional space that has the following properties (both are demonstrated in the sketch after this list):

  • The conserved variance is maximized
  • The final reconstruction error (when trying to go back from transformed features to the original ones) is minimized
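Here is a minimal sketch of both properties using scikit-learn's PCA; the Iris data, the standardization step, and the choice of two components are illustrative assumptions rather than part of the discussion above.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)
X = StandardScaler().fit_transform(X)   # PCA is sensitive to feature scale

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)        # project onto the lower-dimensional space

# Property 1: how much of the original variance the projection conserves
print("conserved variance:", pca.explained_variance_ratio_.sum())

# Property 2: reconstruction error when mapping back to the original space
X_restored = pca.inverse_transform(X_reduced)
print("mean squared reconstruction error:", np.mean((X - X_restored) ** 2))
```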

As PCA simply transforms the input data, it can be applied to both classification and regression problems. In this section, we will use a classification task to discuss the method.
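Because the projection is just a preprocessing step, it slots directly into a supervised pipeline. The sketch below pairs PCA with a classifier; the breast cancer dataset, the logistic regression model, and the component count are assumptions for illustration, not the example worked through in this section.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scale, project down to 10 of the 30 original features, then classify
clf = make_pipeline(StandardScaler(),
                    PCA(n_components=10),
                    LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)
print("test accuracy with 10 components:", clf.score(X_test, y_test))
```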
