How to do it...

We'll now move on to training our models:

  1. In the following code block, we'll create multiple homogeneous models over a few iterations using tf.keras:
accuracy = pd.DataFrame(columns=["Accuracy", "Precision", "Recall"])
predictions = np.zeros(shape=(10000, 7))
row_index = 0
for i in range(7):
    # Bootstrap sampling: draw 40,000 training pairs with replacement
    x_boot, y_boot = resample(x_train, y_train, replace=True, n_samples=40000, random_state=None)

    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(256, activation=tf.nn.relu),
        tf.keras.layers.Dense(128, activation=tf.nn.relu),
        tf.keras.layers.Dense(128, activation=tf.nn.relu),
        tf.keras.layers.Dense(128, activation=tf.nn.relu),
        tf.keras.layers.Dense(128, activation=tf.nn.relu),
        tf.keras.layers.Dense(128, activation=tf.nn.relu),
        tf.keras.layers.Dense(128, activation=tf.nn.relu),
        tf.keras.layers.Dense(128, activation=tf.nn.relu),
        tf.keras.layers.Dense(128, activation=tf.nn.relu),
        tf.keras.layers.Dense(128, activation=tf.nn.relu),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

    # Compile the model
    model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

    # Train the model on the bootstrap sample
    model.fit(x_boot, y_boot, epochs=10, batch_size=64)

    # Evaluate accuracy on the test data and record it
    score = model.evaluate(x_test, y_test, batch_size=64)
    accuracy.loc[row_index, 'Accuracy'] = score[1]

    # Make predictions
    model_pred = model.predict(x_test)
    pred_classes = model_pred.argmax(axis=-1)
    accuracy.loc[row_index, 'Precision'] = precision_score(y_test, pred_classes, average='weighted')
    accuracy.loc[row_index, 'Recall'] = recall_score(y_test, pred_classes, average='weighted')

    # Save predictions to the predictions array
    predictions[:, i] = pred_classes

    print("Iteration " + str(i+1) + " Accuracy : " + "{0}".format(score[1]))
    row_index += 1
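
The bootstrap step draws rows with replacement, so each model sees a slightly different training set. The following is a minimal sketch of that idea on a toy array, assuming NumPy is available. `bootstrap_sample` is a hypothetical helper written for illustration; it is not the `resample` call that the recipe uses:

```python
import numpy as np

def bootstrap_sample(x, y, n_samples, seed=None):
    """Draw n_samples rows from x and y with replacement, keeping pairs aligned."""
    rng = np.random.default_rng(seed)
    # Sample row indices with replacement; the same index may appear several times
    idx = rng.integers(0, len(x), size=n_samples)
    return x[idx], y[idx]

# Toy data: row i of x_toy is [2*i, 2*i + 1] and its label is i
x_toy = np.arange(20).reshape(10, 2)
y_toy = np.arange(10)

x_boot, y_boot = bootstrap_sample(x_toy, y_toy, n_samples=8, seed=0)
print(x_boot.shape, y_boot.shape)
```

Because the same indices are applied to both arrays, every sampled input stays paired with its own label, which is the property the training step relies on.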

We run seven iterations and train each model for 10 epochs. In the following screenshot, we can see the progress as the model gets trained:

  1. With the code in Step 1, we collate the accuracy, precision, and recall for every iteration on the test data:

In the following screenshot, we can see how the preceding three metrics change in each iteration:

  1. We'll form a DataFrame with the predictions that are returned by all of the models in each iteration:
# Create a DataFrame with one row per iteration and one column per test observation
df_iteration = pd.DataFrame([predictions[:, i] for i in range(7)])
  1. We convert the type into an integer:
df_iteration = df_iteration.astype('int64')
  1. We perform max-voting to identify the most predicted class for each observation. We simply use mode to find out which class was predicted the most times for an observation:
# find the mode for result
mode = stats.mode(df_iteration)
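
To make the max-voting idea concrete, here is a minimal sketch in plain Python on a toy set of predictions. `max_vote` is a hypothetical helper, not part of the recipe; it resolves ties in favor of the smallest class label, which is an assumption chosen for this illustration:

```python
from collections import Counter

def max_vote(votes_per_model):
    """Return the most frequently predicted class for each observation.

    votes_per_model holds one list of predictions per model, all the same length.
    """
    ensemble = []
    for votes in zip(*votes_per_model):
        counts = Counter(votes)
        # max() keeps the first maximum it sees, so iterating the classes in
        # ascending order makes ties resolve to the smallest class label
        ensemble.append(max(sorted(counts), key=counts.get))
    return ensemble

# Three toy models predicting classes for four observations
model_1 = [0, 1, 2, 3]
model_2 = [0, 1, 1, 4]
model_3 = [5, 1, 2, 5]

voted = max_vote([model_1, model_2, model_3])
print(voted)
```

The last observation receives three different votes, so the tie rule decides the outcome. With seven models and ten classes, such ties are possible but less frequent.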
  1. We calculate the accuracy of the test data:
# calculate the accuracy for test dataset
print(accuracy_score( y_test, mode[0].T))
  1. We generate the confusion matrix with the required labels:
# confusion matrix
cm = confusion_matrix(y_test, mode[0].T, labels=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
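
As a rough illustration of what the confusion matrix counts, the sketch below builds one by hand with NumPy for ten classes, matching the ten output units of the model in Step 1. `confusion_counts` is a hypothetical helper and the toy labels are made up for the example:

```python
import numpy as np

def confusion_counts(y_true, y_pred, n_classes):
    """Count (true class, predicted class) pairs into an n_classes x n_classes grid."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    # Rows index the true class, columns index the predicted class
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

y_true_toy = np.array([0, 0, 1, 2, 2, 9])
y_pred_toy = np.array([0, 1, 1, 2, 2, 9])

cm_toy = confusion_counts(y_true_toy, y_pred_toy, n_classes=10)
print(cm_toy.shape)
print(np.trace(cm_toy))  # diagonal entries are the correct predictions
```

Off-diagonal entries show which classes are mistaken for which, so the grid needs a row and a column for every class that the model can predict.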
  1. We plot the confusion matrix:
ax= plt.subplot()

# annot=True to annotate cells
sns.heatmap(cm, annot=True, ax = ax, fmt='g', cmap='Blues')

The confusion matrix plot appears as follows:

  1. We add a Models column to the accuracy DataFrame to label the model from each iteration:
accuracy["Models"]=["Model 1",
"Model 2",
"Model 3",
"Model 4",
"Model 5",
"Model 6",
"Model 7"]
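  1. We then combine the accuracy, precision, and recall of the ensemble with those of the individual models in one single table:
# Add the ensemble's metrics as row 7, below the seven individual models
accuracy.loc[7, 'Accuracy'] = accuracy_score(y_test, mode[0].T)
accuracy.loc[7, 'Precision'] = precision_score(y_test, mode[0].T, average='micro')
accuracy.loc[7, 'Recall'] = recall_score(y_test, mode[0].T, average='micro')
accuracy.loc[7, 'Models'] = "Ensemble Model"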
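
Note that Step 1 computes precision and recall with average='weighted', while the ensemble row uses average='micro'. The following plain-Python sketch, which does not call scikit-learn, shows how the two averages can differ on the same predictions. `precision_averages` is a hypothetical helper and the labels are toy values:

```python
from collections import Counter

def precision_averages(y_true, y_pred):
    """Return (micro, weighted) precision for single-label multiclass predictions."""
    n = len(y_true)
    support = Counter(y_true)      # number of true samples per class
    predicted = Counter(y_pred)    # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    # Micro: pool true positives and predictions over all classes, then divide
    micro = sum(correct.values()) / sum(predicted.values())

    # Weighted: per-class precision, averaged with weights proportional to support
    weighted = 0.0
    for cls, n_true in support.items():
        cls_precision = correct[cls] / predicted[cls] if predicted[cls] else 0.0
        weighted += (n_true / n) * cls_precision
    return micro, weighted

y_true_toy = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred_toy = [0, 0, 0, 1, 1, 1, 2, 0]

micro_p, weighted_p = precision_averages(y_true_toy, y_pred_toy)
plain_accuracy = sum(t == p for t, p in zip(y_true_toy, y_pred_toy)) / len(y_true_toy)
print(micro_p, weighted_p, plain_accuracy)
```

For single-label multiclass problems, micro-averaged precision equals plain accuracy, so it is worth keeping the averaging scheme in mind when comparing the ensemble row with the rows for the individual models.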

In the following screenshot, we can see the structure that holds the metrics from each of the models and the ensemble model:

  1. We plot the accuracy returned by each iteration and the accuracy from max-voting:
plt.title("Accuracy across all Iterations and Ensemble")

This gives us the following plot. We notice that the accuracy returned by the max-voting method is higher than that of any individual model:

  1. We also plot the precision and recall for each model and the ensemble:
plt.title("Metrics across all Iterations and models")

This is shown in the following screenshot:

From the preceding screenshot, we notice that the precision and recall also improve with the ensemble model.

