Analysis of transfer values

In this section, we analyze the transfer values that we just computed for the training images. The goal is to check whether these transfer values carry enough information to classify the CIFAR-10 images.

We have 2,048 transfer values for each input image. To plot them and analyze them further, we can use a dimensionality reduction technique such as Principal Component Analysis (PCA) from scikit-learn. We'll reduce the transfer values from 2,048 dimensions to 2 so that we can visualize them and see whether they are good features for discriminating between the CIFAR-10 categories:

from sklearn.decomposition import PCA

Next, we create a PCA object with the number of components set to 2:

pca_obj = PCA(n_components=2)

Reducing the transfer values from 2,048 to 2 takes a lot of time, so we'll use only the first 3,000 of the 5,000 images that we have transfer values for:

subset_transferValues = transfer_values_training[0:3000]

We also need the class numbers of these same training images, so the labels must come from the training set, not the test set:

cls_integers = training_cls_integers[0:3000]

We can double-check our subsetting by printing the shape of the transfer values:

subset_transferValues.shape

Output:

(3000, 2048)

Next up, we use our PCA object to reduce the transfer values from 2,048 to just 2:

reduced_transferValues = pca_obj.fit_transform(subset_transferValues)

Now, let's see the output of the PCA reduction process:

reduced_transferValues.shape

Output:

(3000, 2)
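
A two-component projection inevitably discards information. As a rough sketch of how one might check how much, the snippet below fits the same kind of PCA object on synthetic stand-in data (random values, not the real transfer values) and reads scikit-learn's explained_variance_ratio_ attribute. The names fake_values, pca_demo, and reduced_demo, as well as the array size, are assumptions made for illustration only.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for transfer values (assumption: 300 samples, 64 features).
rng = np.random.default_rng(0)
fake_values = rng.normal(size=(300, 64))

pca_demo = PCA(n_components=2)
reduced_demo = pca_demo.fit_transform(fake_values)

# Fraction of the total variance kept by each of the two components.
retained = pca_demo.explained_variance_ratio_
print(reduced_demo.shape)
print(retained.sum())
```

On the real data, reading the same attribute from pca_obj after calling fit_transform would indicate how much of the variance the 2-D plot actually represents.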

After reducing the dimensionality of the transfer values to only 2, let's plot these values:

# Importing NumPy, pyplot, and the color map for plotting each class with a different color.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as color_map

def plot_reduced_transferValues(transferValues, cls_integers):

    # Create a color map with a different color for each class.
    # num_classes is assumed to be defined earlier (10 for CIFAR-10).
    c_map = color_map.rainbow(np.linspace(0.0, 1.0, num_classes))

    # Getting the color for each sample.
    colors = c_map[cls_integers]

    # Getting the x and y values.
    x_val = transferValues[:, 0]
    y_val = transferValues[:, 1]

    # Plot the transfer values in a scatter plot.
    plt.scatter(x_val, y_val, color=colors)
    plt.show()

Here, we plot the reduced transfer values of the subset from the training set. CIFAR-10 has 10 classes, so we plot the transfer values of each class in a different color. As you can see from the following graph, the transfer values are grouped according to their class. The groups overlap because PCA is a linear projection, and two components can't fully separate the 2,048-dimensional transfer values:

plot_reduced_transferValues(reduced_transferValues, cls_integers)

Figure 10.9: Transfer values reduced using PCA
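
The per-sample coloring in plot_reduced_transferValues relies on NumPy integer-array indexing: the color map is an array with one RGBA row per class, and indexing it with the array of class numbers returns one row per sample. The following minimal sketch shows the shapes involved; num_classes_demo and labels_demo are hypothetical values chosen for illustration.

```python
import numpy as np
import matplotlib.cm as color_map

num_classes_demo = 10                      # CIFAR-10 has 10 classes
labels_demo = np.array([0, 3, 3, 9, 1])    # hypothetical class numbers

# One RGBA row per class: shape (10, 4).
c_map = color_map.rainbow(np.linspace(0.0, 1.0, num_classes_demo))

# Integer-array indexing picks one RGBA row per sample: shape (5, 4).
colors = c_map[labels_demo]
print(c_map.shape, colors.shape)
```

This only works if the class numbers are stored in a NumPy integer array; a plain Python list of labels would need to be converted with np.array first.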

We can analyze the transfer values further using a different dimensionality reduction method called t-SNE:

from sklearn.manifold import TSNE

Again, we use PCA to reduce the dimensionality of the transfer values from 2,048, but this time to 50 values rather than 2, because t-SNE is slow on very high-dimensional input:

pca_obj = PCA(n_components=50)
transferValues_50d = pca_obj.fit_transform(subset_transferValues)

Next, we create the second dimensionality reduction object, t-SNE, which will receive the output of the PCA step:

tsne_obj = TSNE(n_components=2)

Finally, we apply t-SNE to the reduced values from PCA:

reduced_transferValues = tsne_obj.fit_transform(transferValues_50d) 

And double-check that the result has the correct shape:

reduced_transferValues.shape

Output:

(3000, 2)
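
The two-stage reduction above can be tried end to end on a small example. This is a minimal sketch on synthetic clustered data generated with scikit-learn's make_blobs, not on the real transfer values; the sample count, feature count, number of clusters, and the names points, points_50d, and points_2d are all assumptions for illustration.

```python
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Synthetic stand-in data (assumption): 200 samples, 100 features, 4 clusters.
points, labels = make_blobs(n_samples=200, n_features=100,
                            centers=4, random_state=0)

# Step 1: PCA down to 50 dimensions.
points_50d = PCA(n_components=50).fit_transform(points)

# Step 2: t-SNE down to 2 dimensions.
# random_state fixes the otherwise random layout between runs.
points_2d = TSNE(n_components=2, random_state=0).fit_transform(points_50d)
print(points_2d.shape)
```

t-SNE is stochastic, so without a fixed random_state the resulting plot can look different from one run to the next even though the grouping is similar.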

Let's plot the transfer values reduced by t-SNE. As you can see in the next image, t-SNE separates the groups of transfer values better than PCA does.

The takeaway from this analysis is that the transfer values we extracted by feeding our input images to the pre-trained Inception model can be used to separate the training images into the 10 classes. This separation won't be 100% accurate because of the small overlap in the following graph, but we can reduce this overlap by fine-tuning our pre-trained model:

plot_reduced_transferValues(reduced_transferValues, cls_integers)

Figure 10.10: Transfer values reduced using t-SNE

Now we have transfer values extracted from our training images, and we know that they can, to some extent, distinguish between the different CIFAR-10 classes. Next, we need to build a linear classifier and feed these transfer values to it to do the actual classification.
