How it works...

In the Getting ready section, we imported our required libraries. Note that we've imported the TensorFlow library. We can directly access the datasets by importing the tf.keras.datasets module. This module comes with various built-in datasets, including the following:

  • boston_housing: Boston housing price regression dataset
  • cifar10: CIFAR10 small images classification dataset
  • fashion_mnist: Fashion-MNIST dataset
  • imdb: IMDB sentiment classification dataset
  • mnist: MNIST handwritten digits dataset
  • reuters: Reuters topic classification dataset

We used the fashion_mnist dataset from this module. We loaded the pre-shuffled train and test data and checked the shape of the train and test subsets.

We noticed, in the Getting ready section, that the shape of the training subset is (60000, 28, 28), which means that we have 60,000 images that are 28 x 28 pixels in size.

We checked the distinct levels in the target variable with the unique() method and saw that there were 10 classes, labeled 0 to 9.
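This check can be sketched with NumPy; the label array here is a small hypothetical sample, not the real Fashion-MNIST labels:

```python
import numpy as np

# Hypothetical stand-in for y_train; the real labels come from fashion_mnist
y_train = np.array([9, 0, 0, 3, 0, 2, 7, 2, 5, 5, 1, 4, 6, 8])

# unique() returns the sorted distinct class labels
classes = np.unique(y_train)
print(classes)  # [0 1 2 3 4 5 6 7 8 9]
```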

We also took a quick look at some of the images. We defined the number of columns and rows that we required. Running an iteration, we plotted the images with matplotlib.pyplot.imshow() in grayscale. We also printed the actual class labels against each of the images using matplotlib.pyplot.title().
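The plotting loop can be sketched as follows. Random arrays stand in for the Fashion-MNIST images here, and the grid dimensions are illustrative choices:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Random 28 x 28 arrays as stand-ins for the Fashion-MNIST images
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(6, 28, 28))
labels = rng.integers(0, 10, size=6)

n_rows, n_cols = 2, 3
fig = plt.figure(figsize=(n_cols * 2, n_rows * 2))
for i in range(n_rows * n_cols):
    ax = fig.add_subplot(n_rows, n_cols, i + 1)
    ax.imshow(images[i], cmap="gray")  # grayscale rendering
    ax.set_title(str(labels[i]))       # actual class label as the title
    ax.axis("off")
```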

In the How to do it... section, in Step 1, we created multiple homogeneous models using the tf.keras module. In each iteration, we used the resample() method to create bootstrap samples. We passed replace=True to the resample() method to ensure that we have samples with replacement.
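The bootstrap sampling step can be sketched with scikit-learn's resample(); the toy arrays here stand in for the real training data:

```python
import numpy as np
from sklearn.utils import resample

X = np.arange(10)  # toy stand-in for the training examples
y = X % 2          # matching toy labels

# replace=True draws with replacement, so some rows repeat and some are left out
X_boot, y_boot = resample(X, y, replace=True, n_samples=len(X), random_state=42)
print(X_boot)
```

Passing both arrays to resample() keeps the features and labels aligned in the bootstrap sample.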

In this step, we also defined the model architecture. We added layers to the model using tf.keras.layers. In each layer, we defined the number of units.

"Model architecture" refers to the overall neural network structure, which includes groups of units called layers. These layers are arranged in a chain-like structure. Each layer is a function of its previous layer. Determining the model architecture is key to neural networks.
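A minimal sketch of such a chain-like architecture in tf.keras follows; the hidden-layer sizes (128 and 64) are illustrative choices, not the book's exact values:

```python
import tensorflow as tf

# A chain of layers: each layer is a function of the layer before it
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),    # 28 x 28 image -> 784 vector
    tf.keras.layers.Dense(128, activation="relu"),    # hidden layer
    tf.keras.layers.Dense(64, activation="relu"),     # hidden layer
    tf.keras.layers.Dense(10, activation="softmax"),  # one output unit per class
])
print(model.output_shape)  # (None, 10)
```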

We ran through a few iterations in our example. We set the number of iterations. In each iteration, we compiled the model and fit it to our training data. We made predictions on our test data and captured the following metrics in a DataFrame:

  • Accuracy
  • Precision
  • Recall
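Capturing these metrics for one iteration can be sketched with scikit-learn; the labels here are toy values, and the weighted average is one reasonable choice for multi-class metrics:

```python
import pandas as pd
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Toy true labels and one iteration's predictions (illustrative values)
y_test = [0, 1, 2, 2, 1]
y_pred = [0, 1, 2, 0, 1]

# One row per iteration; precision and recall are averaged across classes
metrics = pd.DataFrame([{
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, average="weighted"),
    "recall": recall_score(y_test, y_pred, average="weighted"),
}])
print(metrics)
```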

We've used the Rectified Linear Unit (ReLU) as the activation function for the hidden layers. ReLU is defined as f(x) = max(0, x). In neural networks, ReLU is recommended as the default activation function for hidden layers.
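The ReLU definition translates directly into code:

```python
import numpy as np

def relu(x):
    """f(x) = max(0, x), applied element-wise."""
    return np.maximum(0, x)

# Negative inputs are clamped to zero; positive inputs pass through unchanged
print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))  # [0.  0.  0.  1.5]
```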

Note that, in the last layer of the model architecture, we've used softmax as the activation function. The softmax function can be considered a generalization of the sigmoid function. While the sigmoid function is used to represent a probability distribution of a dichotomous variable, the softmax function is used to represent a probability distribution of a target variable with more than two classes. When the softmax function is used for multi-class classification, it returns a probability value between 0 and 1 for each class. The sum of all probabilities will be equal to one.
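A small numerical sketch of softmax makes these properties concrete; the logits here are illustrative raw scores:

```python
import numpy as np

def softmax(z):
    # Subtracting the max is a standard numerical-stability trick;
    # it does not change the result
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])  # illustrative raw scores for 3 classes
probs = softmax(logits)
print(probs.sum())  # 1.0
```

Each output lies between 0 and 1, and the outputs sum to exactly one, which is why softmax suits the final layer of a multi-class classifier.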

In Step 2, we checked the structure of the accuracy DataFrame that we created in Step 1. We noticed that we had three columns for accuracy, precision, and recall, and that the metrics for each iteration were captured. In Step 3, we converted the datatypes in the DataFrame to integers.

In Step 4, we performed max-voting using stats.mode() for each observation. Since we ran seven iterations, we had seven predictions for each observation. stats.mode() returned the prediction with the maximum occurrence.
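Max-voting across the seven predictions can be sketched with plain NumPy; it mirrors what scipy.stats.mode computes along axis 1, and the prediction values here are toy examples:

```python
import numpy as np

# Rows = observations, columns = the 7 per-iteration predictions (toy values)
predictions = np.array([
    [3, 3, 2, 3, 3, 1, 3],
    [0, 1, 1, 1, 0, 1, 1],
])

# Max-voting: keep the most frequent prediction in each row,
# equivalent to scipy.stats.mode(predictions, axis=1)
max_voted = np.array([np.bincount(row).argmax() for row in predictions])
print(max_voted)  # [3 1]
```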

In Step 5, we checked the accuracy of the model with the max-voted predictions. In Step 6 and Step 7, we generated the confusion matrix to visualize the correct predictions. The diagonal elements in the plot were the correct predictions, while the off-diagonal elements were the misclassifications. We saw that there was a higher number of correct classifications compared to misclassifications.
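The diagonal/off-diagonal split can be sketched with scikit-learn's confusion_matrix; the labels below are toy values:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Toy labels; rows of the matrix are true classes, columns are predictions
y_test = [0, 1, 2, 2, 1, 0]
y_pred = [0, 1, 2, 0, 1, 0]

cm = confusion_matrix(y_test, y_pred)
correct = np.trace(cm)               # diagonal = correct predictions
misclassified = cm.sum() - correct   # off-diagonal = misclassifications
print(correct, misclassified)        # 5 1
```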

In Step 8 and Step 9, we proceeded to create a structure to hold the performance metrics (accuracy, precision, and recall), along with the labels for each iteration and the ensemble. We used this structure to plot our charts for the performance metrics.

In Step 10, we plotted the accuracy for each iteration and the max-voted predictions. Similarly, in Step 11, we plotted the precision and recall for each iteration and the max-voted predictions.

From the plots we generated in Step 10 and Step 11, we noticed how the accuracy, precision, and recall improved for the max-voted predictions.
