Model training

Let's kick off the training process by creating a session variable that will be responsible for executing the computational graph that we defined earlier:

session = tf.Session()
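
Note that this code uses the TensorFlow 1.x session-based API. If you happen to be running TensorFlow 2.x, a common workaround (an aside about your environment, not part of the original setup) is the compatibility module, although other 1.x-era pieces used in this chapter, such as the tutorial MNIST loader, may not ship with 2.x:

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

session = tf.Session()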

Also, we need to initialize the variables that we have defined so far:

session.run(tf.global_variables_initializer())

We are going to feed the images in batches to avoid an out-of-memory error:

train_batch_size = 64
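
The code below also relies on the mnist_data object and its cls_integer attribute, both prepared earlier in the chapter. In case you are running this section in isolation, here is a minimal sketch of that setup, assuming the classic TensorFlow 1.x tutorial loader:

from tensorflow.examples.tutorials.mnist import input_data
import numpy as np

# Load MNIST with one-hot encoded labels.
mnist_data = input_data.read_data_sets("MNIST_data/", one_hot=True)

# Convenience attribute used by the helper functions below: the integer
# class of each test image, recovered from its one-hot label.
mnist_data.test.cls_integer = np.argmax(mnist_data.test.labels, axis=1)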

Before kicking off the training process, we are going to define a helper function that performs the optimization by iterating over the training batches:

# Number of optimization iterations performed so far.
total_iterations = 0

def optimize(num_iterations):
    # Update the global count of iterations performed so far.
    global total_iterations

    for i in range(total_iterations,
                   total_iterations + num_iterations):

        # Generate a random batch for the training process.
        # input_batch now holds a batch of images from the training set and
        # y_actual_batch holds the actual labels for those images.
        input_batch, y_actual_batch = mnist_data.train.next_batch(train_batch_size)

        # Put the previous values in a dict format for TensorFlow to
        # automatically assign them to the input placeholders defined above.
        feed_dict = {input_values: input_batch,
                     y_actual: y_actual_batch}

        # Run the model optimizer on this batch of images.
        session.run(model_optimizer, feed_dict=feed_dict)

        # Print the training status every 100 iterations.
        if i % 100 == 0:
            # Measure the accuracy over the current training batch.
            acc_training_set = session.run(model_accuracy, feed_dict=feed_dict)

            # Print the accuracy over the training batch.
            print("Iteration: {0:>6}, Accuracy Over the training set: {1:>6.1%}".format(i + 1, acc_training_set))

    # Update the number of iterations performed so far.
    total_iterations += num_iterations

We'll also define a helper function to visualize which images are misclassified by the model:

def plot_errors(cls_predicted, correct):

    # cls_predicted is an array of the predicted class number of each
    # image in the test set, and correct is a boolean array indicating
    # whether each prediction matches the actual class.

    # Boolean mask of the incorrectly classified images.
    incorrect = (correct == False)

    # Get the images from the test set that have been
    # incorrectly classified.
    images = mnist_data.test.images[incorrect]

    # Get the predicted classes for those incorrect images.
    cls_pred = cls_predicted[incorrect]

    # Get the actual classes for those incorrect images.
    cls_true = mnist_data.test.cls_integer[incorrect]

    # Plot the first 9 of these images.
    plot_imgs(imgs=images[0:9],
              cls_actual=cls_true[0:9],
              cls_predicted=cls_pred[0:9])
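
Note that plot_errors delegates the actual drawing to the plot_imgs helper defined earlier in the chapter. If you don't have it at hand, a minimal sketch, assuming 28x28 grayscale MNIST images flattened to 784 values, could look like this:

import matplotlib.pyplot as plt

def plot_imgs(imgs, cls_actual, cls_predicted=None):
    # Plot up to 9 images in a 3x3 grid, with the actual
    # (and, if given, predicted) classes as subplot labels.
    fig, axes = plt.subplots(3, 3)
    fig.subplots_adjust(hspace=0.3, wspace=0.3)

    for idx, ax in enumerate(axes.flat):
        if idx < len(imgs):
            # Reshape the flattened 784-pixel vector back to 28x28.
            ax.imshow(imgs[idx].reshape(28, 28), cmap='binary')
            if cls_predicted is None:
                ax.set_xlabel("Actual: {0}".format(cls_actual[idx]))
            else:
                ax.set_xlabel("Actual: {0}, Predicted: {1}".format(
                    cls_actual[idx], cls_predicted[idx]))
        # Remove the ticks in either case.
        ax.set_xticks([])
        ax.set_yticks([])

    plt.show()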

We can also plot the confusion matrix of the predicted results compared to the actual classes:

from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
import numpy as np

def plot_confusionMatrix(cls_predicted):

    # cls_predicted is an array of the predicted class number of
    # each image in the test set.

    # Get the actual classes for the test set.
    cls_actual = mnist_data.test.cls_integer

    # Generate the confusion matrix using sklearn.
    conf_matrix = confusion_matrix(y_true=cls_actual,
                                   y_pred=cls_predicted)

    # Print the matrix.
    print(conf_matrix)

    # Visualize the confusion matrix.
    plt.matshow(conf_matrix)

    plt.colorbar()
    tick_marks = np.arange(num_classes)
    plt.xticks(tick_marks, range(num_classes))
    plt.yticks(tick_marks, range(num_classes))
    plt.xlabel('Predicted class')
    plt.ylabel('True class')

    # Show the plot.
    plt.show()

Finally, we are going to define a helper function to help us measure the accuracy of the trained model over the test set:

# Measure the accuracy of the trained model over the test set
# by splitting it into small batches.
test_batch_size = 256

def test_accuracy(show_errors=False,
                  show_confusionMatrix=False):

    # Number of test images.
    number_test = len(mnist_data.test.images)

    # Define an array of zeros for the predicted classes of the test set,
    # which will be filled in mini-batches.
    cls_predicted = np.zeros(shape=number_test, dtype=int)

    # Measure the predicted classes for the test batches,
    # starting with the batch at index 0.
    i = 0

    while i < number_test:
        # The ending index for the next batch to be processed is j.
        j = min(i + test_batch_size, number_test)

        # Get all the images from the test set between the start and end indices.
        input_images = mnist_data.test.images[i:j, :]

        # Get the actual labels for those images.
        actual_labels = mnist_data.test.labels[i:j, :]

        # Create a feed-dict with the corresponding values for the input placeholders.
        feed_dict = {input_values: input_images,
                     y_actual: actual_labels}

        cls_predicted[i:j] = session.run(y_predicted_cls_integer, feed_dict=feed_dict)

        # Set the start of the next batch to be the end of the one we just processed.
        i = j

    # Get the actual class numbers of the test images.
    cls_actual = mnist_data.test.cls_integer

    # Check whether the model predictions are correct.
    correct = (cls_actual == cls_predicted)

    # Sum up the correctly classified examples.
    correct_number_images = correct.sum()

    # Measure the accuracy by dividing the number of correctly classified
    # images by the total number of images in the test set.
    testset_accuracy = float(correct_number_images) / number_test

    # Show the accuracy.
    print("Accuracy on Test-Set: {0:.1%} ({1} / {2})".format(testset_accuracy, correct_number_images, number_test))

    # Show some examples of the incorrectly classified images.
    if show_errors:
        print("Example errors:")
        plot_errors(cls_predicted=cls_predicted, correct=correct)

    # Show the confusion matrix of the test set predictions.
    if show_confusionMatrix:
        print("Confusion Matrix:")
        plot_confusionMatrix(cls_predicted=cls_predicted)

Let's print the accuracy of the created model over the test set before doing any optimization:

test_accuracy()
Output:
Accuracy on Test-Set: 4.1% (410 / 10000)

Let's get a sense of how the optimization process enhances the model's ability to classify images into their correct classes by running it for a single iteration:

optimize(num_iterations=1)
Output:
Iteration: 1, Accuracy Over the training set: 4.7%
test_accuracy()
Output:
Accuracy on Test-Set: 4.4% (437 / 10000)

Now, let's get down to business and kick off a long optimization process of 10,000 iterations:

optimize(num_iterations=9999) # We have already performed 1 iteration.

At the end of the output, you should see something very close to the following:

Iteration: 7301, Accuracy Over the training set: 96.9%
Iteration: 7401, Accuracy Over the training set: 100.0%
Iteration: 7501, Accuracy Over the training set: 98.4%
Iteration: 7601, Accuracy Over the training set: 98.4%
Iteration: 7701, Accuracy Over the training set: 96.9%
Iteration: 7801, Accuracy Over the training set: 96.9%
Iteration: 7901, Accuracy Over the training set: 100.0%
Iteration: 8001, Accuracy Over the training set: 98.4%
Iteration: 8101, Accuracy Over the training set: 96.9%
Iteration: 8201, Accuracy Over the training set: 100.0%
Iteration: 8301, Accuracy Over the training set: 98.4%
Iteration: 8401, Accuracy Over the training set: 98.4%
Iteration: 8501, Accuracy Over the training set: 96.9%
Iteration: 8601, Accuracy Over the training set: 100.0%
Iteration: 8701, Accuracy Over the training set: 98.4%
Iteration: 8801, Accuracy Over the training set: 100.0%
Iteration: 8901, Accuracy Over the training set: 98.4%
Iteration: 9001, Accuracy Over the training set: 100.0%
Iteration: 9101, Accuracy Over the training set: 96.9%
Iteration: 9201, Accuracy Over the training set: 98.4%
Iteration: 9301, Accuracy Over the training set: 98.4%
Iteration: 9401, Accuracy Over the training set: 100.0%
Iteration: 9501, Accuracy Over the training set: 100.0%
Iteration: 9601, Accuracy Over the training set: 98.4%
Iteration: 9701, Accuracy Over the training set: 100.0%
Iteration: 9801, Accuracy Over the training set: 100.0%
Iteration: 9901, Accuracy Over the training set: 100.0%
Iteration: 10001, Accuracy Over the training set: 98.4%

Now, let's check how the model generalizes to the test set:

test_accuracy(show_errors=True,
              show_confusionMatrix=True)
Output:
Accuracy on Test-Set: 92.8% (9281 / 10000)
Example errors:

Figure 9.13: Accuracy over the test set
Confusion Matrix:
[[ 971    0    2    2    0    4    0    1    0    0]
 [   0 1110    4    2    1    2    3    0   13    0]
 [  12    2  949   15   16    3    4   17   14    0]
 [   5    3   14  932    0   34    0   13    6    3]
 [   1    2    3    0  931    1    8    2    3   31]
 [  12    1    4   13    3  852    2    1    3    1]
 [  21    4    5    2   18   34  871    1    2    0]
 [   1   10   26    5    5    0    0  943    2   36]
 [  16    5   10   27   16   48    5   13  815   19]
 [  12    5    5   11   38   10    0   18    3  907]]

The following is the output:

Figure 9.14: Confusion matrix of the test set.
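
Reading the matrix row by row, the diagonal holds the correctly classified images of each digit, so dividing each diagonal entry by its row sum gives a quick per-class accuracy. Here is a small sketch, assuming you have the test-set predictions in cls_predicted as computed inside test_accuracy:

import numpy as np
from sklearn.metrics import confusion_matrix

# Recompute the confusion matrix from the test-set predictions.
conf_matrix = confusion_matrix(y_true=mnist_data.test.cls_integer,
                               y_pred=cls_predicted)

# Per-class accuracy: correct predictions on the diagonal divided by
# the total number of actual images in each class (the row sum).
per_class_accuracy = np.diag(conf_matrix) / conf_matrix.sum(axis=1).astype(float)
for cls, acc in enumerate(per_class_accuracy):
    print("Digit {0}: {1:.1%}".format(cls, acc))

For the matrix above, this calculation shows that the digit 8 fares worst at roughly 84% (815 of 974 images), which matches the cluster of off-diagonal entries in its row.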

Interestingly, we got almost 93% accuracy over the test set using a basic convolutional network. This implementation and its results show what even a simple convolutional network can do.
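
Once you are done experimenting with the model, it is good practice to release the resources held by the session:

session.close()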
