Decomposition to classify with DictionaryLearning

In this recipe, we'll show how a decomposition method can actually be used for classification. DictionaryLearning attempts to take a dataset and transform it into a sparse representation.

Getting ready

With DictionaryLearning, the idea is that the learned features (the dictionary atoms) form a basis for the resulting dataset. To keep this recipe short, I'll assume you have iris_data and iris_target ready to go.
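
If you don't have those arrays handy, here is a minimal sketch for creating them. It assumes that iris_data and iris_target are simply the feature matrix and the label vector returned by scikit-learn's load_iris; the recipe itself doesn't show how they were built.

```python
from sklearn.datasets import load_iris

# Assumption: the recipe's arrays are the standard iris features and labels.
iris = load_iris()
iris_data = iris.data      # one row per flower, one column per measurement
iris_target = iris.target  # integer species label for each row

print(iris_data.shape, iris_target.shape)
```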

How to do it...

First, import DictionaryLearning:

>>> from sklearn.decomposition import DictionaryLearning

Next, use three components to represent the three species of iris:

>>> dl = DictionaryLearning(3)

Then fit and transform every other data point (the even-indexed rows), which leaves the remaining points for testing the classifier once the learner is trained:

>>> transformed = dl.fit_transform(iris_data[::2])
>>> transformed[:5]
array([[ 0.        ,  6.34476574,  0.        ], 
       [ 0.        ,  5.83576461,  0.        ], 
       [ 0.        ,  6.32038375,  0.        ], 
       [ 0.        ,  5.89318572,  0.        ], 
       [ 0.        ,  5.45222715,  0.        ]])

We can visualize the output. Notice how each point lies on the x, y, or z axis: one coordinate is nonzero and the other two are 0. This is called sparseness.
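
As a rough check that doesn't need a plot, we can count the nonzero entries in each transformed row. This is only a sketch under a few assumptions: iris_data comes from load_iris, the transform settings (transform_algorithm="omp" and transform_n_nonzero_coefs=1) are spelled out explicitly instead of being left at their defaults, random_state is fixed purely for repeatability, and names such as codes and active_atom are illustrative. It calls fit and then transform separately so that the same transform settings produce the codes.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import DictionaryLearning

iris_data = load_iris().data  # assumption: same data as in the recipe
train = iris_data[::2]        # every other row, as in the recipe

# Ask for at most one active component per sample at transform time.
dl = DictionaryLearning(
    n_components=3,
    transform_algorithm="omp",
    transform_n_nonzero_coefs=1,
    random_state=0,  # assumption: fixed only so reruns match
)
dl.fit(train)
codes = dl.transform(train)

# How many coordinates are nonzero in each row, and which one is it?
nonzero_per_row = np.count_nonzero(codes, axis=1)
active_atom = np.abs(codes).argmax(axis=1)

print(nonzero_per_row[:5])
print(active_atom[:5])
```

If every count is at most 1, each point sits on a single axis, which is the sparseness described above.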

If you look closely, you can see there was some training error. One of the samples was misclassified. Only being wrong once isn't a big deal, though.

Next, let's transform (not fit_transform) the testing set:

>>> transformed = dl.transform(iris_data[1::2])

The following screenshot shows its performance:

Notice again that there was some error in the classification. If you remember some of the other visualizations, the blue and green classes were the two classes that often appeared close together.
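
The recipe judges the classifier by eye from the plots. One way to put a number on it, offered here as a sketch and not as the recipe's own method, is to treat the index of the active component as a cluster label, map each component to the most common species among its training points, and then score the held-out half. The explicit transform settings, the fixed random_state, and the helper names (atom_to_label, predicted, accuracy) are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import DictionaryLearning

iris = load_iris()
iris_data, iris_target = iris.data, iris.target  # assumption: standard iris arrays

X_train, y_train = iris_data[::2], iris_target[::2]
X_test, y_test = iris_data[1::2], iris_target[1::2]

dl = DictionaryLearning(
    n_components=3,
    transform_algorithm="omp",
    transform_n_nonzero_coefs=1,
    random_state=0,  # assumption: fixed only so reruns match
)
dl.fit(X_train)

# The active component (largest absolute coefficient) acts as a cluster label.
train_atoms = np.abs(dl.transform(X_train)).argmax(axis=1)
test_atoms = np.abs(dl.transform(X_test)).argmax(axis=1)

# Map each component to the most common species among its training points.
atom_to_label = {}
for atom in range(3):
    labels = y_train[train_atoms == atom]
    if labels.size:
        atom_to_label[atom] = int(np.bincount(labels, minlength=3).argmax())
    else:
        atom_to_label[atom] = -1  # component never active on the training half

predicted = np.array([atom_to_label[int(a)] for a in test_atoms])
accuracy = float((predicted == y_test).mean())
print(f"held-out accuracy: {accuracy:.3f}")
```

The majority-vote mapping is needed because the order of the learned components is arbitrary and has no built-in link to the species labels.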

How it works...

DictionaryLearning has a background in signal processing and neurology. The idea is that only a few features can be active at any given time. Therefore, DictionaryLearning attempts to find a suitable representation for the underlying data, given the constraint that most of the features should be 0.
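
To make the idea of a "representation" concrete, the sketch below rebuilds each sample from its sparse code and the learned dictionary (the components_ attribute), then reports how much of the code is zero. It assumes the standard iris data, explicit transform settings, and a fixed random_state; the variable names are illustrative only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import DictionaryLearning

X = load_iris().data[::2]  # assumption: the recipe's training half

dl = DictionaryLearning(
    n_components=3,
    transform_algorithm="omp",
    transform_n_nonzero_coefs=1,
    random_state=0,  # assumption: fixed only so reruns match
)
dl.fit(X)
codes = dl.transform(X)  # sparse codes, shape (n_samples, 3)

# Each sample is approximated as its code times the dictionary.
reconstruction = codes @ dl.components_
relative_error = np.linalg.norm(X - reconstruction) / np.linalg.norm(X)
zero_fraction = float(np.mean(codes == 0))

print(f"relative reconstruction error: {relative_error:.3f}")
print(f"fraction of zero coefficients: {zero_fraction:.3f}")
```

With one active component allowed out of three, at least two thirds of the coefficients are zero, so each sample is described by a single dictionary atom and one scale factor.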
