Let's kick off the training process by creating a session variable that will be responsible for executing the computational graph that we defined earlier:
session = tf.Session()
Also, we need to initialize the variables that we have defined so far:
session.run(tf.global_variables_initializer())
We are going to feed the images in batches to avoid an out-of-memory error:
train_batch_size = 64
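To make the idea of batching concrete, here is a minimal plain-Python sketch of how a dataset can be split into shuffled mini-batches of indices. The function name iterate_minibatches, the seed argument, and the toy dataset size of 200 are assumptions made for illustration; this is not how mnist_data.train.next_batch is implemented, only a simplified picture of the same idea.

```python
import random


def iterate_minibatches(num_examples, batch_size, seed=0):
    """Yield lists of example indices, one shuffled mini-batch at a time.

    Hypothetical helper for illustration; the last batch may be smaller.
    """
    indices = list(range(num_examples))
    random.Random(seed).shuffle(indices)
    for start in range(0, num_examples, batch_size):
        yield indices[start:start + batch_size]


batches = list(iterate_minibatches(num_examples=200, batch_size=64))
batch_sizes = [len(batch) for batch in batches]
print(batch_sizes)  # [64, 64, 64, 8]
```

Only one batch of 64 images has to be held in memory at a time, which is why batching avoids the out-of-memory error mentioned above.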
Before kicking off the training process, we are going to define a helper function that performs the optimization by iterating through the training batches:
# number of optimization iterations performed so far
total_iterations = 0
def optimize(num_iterations):
    # Update globally the total number of iterations performed so far.
    global total_iterations
    for i in range(total_iterations,
                   total_iterations + num_iterations):
        # Generate a random batch for the training process.
        # input_batch contains a batch of images from the training set and
        # y_actual_batch holds the actual labels for the images in the input batch.
        input_batch, y_actual_batch = mnist_data.train.next_batch(train_batch_size)
        # Put the batch into a dict so that TensorFlow can assign the values
        # to the input placeholders that we defined above.
        feed_dict = {input_values: input_batch,
                     y_actual: y_actual_batch}
        # Next up, run the model optimizer on this batch of images.
        session.run(model_optimizer, feed_dict=feed_dict)
        # Print the training status every 100 iterations.
        if i % 100 == 0:
            # Measure the accuracy on the current training batch.
            acc_training_set = session.run(model_accuracy, feed_dict=feed_dict)
            # Print the accuracy on the current training batch.
            print("Iteration: {0:>6}, Accuracy Over the training set: {1:>6.1%}".format(i + 1, acc_training_set))
    # Update the number of iterations performed so far.
    total_iterations += num_iterations
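Because total_iterations persists between calls, the loop resumes counting where the previous call stopped, and the i % 100 == 0 check decides which iterations are reported. The following small sketch reproduces only that counting logic without TensorFlow; the helper name logged_iterations and its log_every parameter are hypothetical and exist only for this illustration.

```python
def logged_iterations(total_iterations, num_iterations, log_every=100):
    """Return the iteration numbers (i + 1) that the loop would print."""
    return [i + 1
            for i in range(total_iterations, total_iterations + num_iterations)
            if i % log_every == 0]


# A first call with a single iteration runs i = 0, which is reported as iteration 1.
first_call = logged_iterations(total_iterations=0, num_iterations=1)
print(first_call)  # [1]

# A later call resumes at i = 1, so the next reports come from i = 100, 200, 300, ...
second_call = logged_iterations(total_iterations=1, num_iterations=9999)
print(second_call[:3])  # [101, 201, 301]
```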
Next, we'll define a helper function that visualizes the results by plotting some of the images that the model misclassified:
def plot_errors(cls_predicted, correct):
    # cls_predicted is an array of the predicted class number of each image in the test set.
    # correct is a boolean array that is True where the prediction matches the actual class.
    # Build a boolean mask that selects the incorrectly classified images.
    incorrect = (correct == False)
    # Get the images from the test set that have been
    # incorrectly classified.
    imgs = mnist_data.test.images[incorrect]
    # Get the predicted classes for those incorrect images.
    cls_predicted = cls_predicted[incorrect]
    # Get the actual classes for those incorrect images.
    cls_actual = mnist_data.test.cls_integer[incorrect]
    # Plot the first 9 of these images.
    plot_imgs(imgs=imgs[0:9],
              cls_actual=cls_actual[0:9],
              cls_predicted=cls_predicted[0:9])
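The selection step in plot_errors relies on NumPy boolean indexing: a mask picks out the entries where the prediction was wrong. As a rough plain-Python analogue of that step, the sketch below filters a toy list instead of the MNIST arrays. The function name select_errors and the placeholder item names are assumptions used only for illustration.

```python
def select_errors(items, cls_actual, cls_predicted, limit=9):
    """Return up to `limit` (item, actual, predicted) triples that were misclassified."""
    errors = [(item, actual, predicted)
              for item, actual, predicted in zip(items, cls_actual, cls_predicted)
              if actual != predicted]
    return errors[:limit]


# Toy data: four "images" with their actual and predicted classes.
items = ["img0", "img1", "img2", "img3"]
toy_actual = [7, 2, 1, 0]
toy_predicted = [7, 3, 1, 6]

errors = select_errors(items, toy_actual, toy_predicted)
print(errors)  # [('img1', 2, 3), ('img3', 0, 6)]
```

Note that the predicted and actual classes are filtered with the same condition as the images, so the three sequences stay aligned.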
We can also plot the confusion matrix of the predicted results compared to the actual true classes:
def plot_confusionMatrix(cls_predicted):
    # cls_predicted is an array of the predicted class number of each image in the test set.
    # Get the actual classes for the test set.
    cls_actual = mnist_data.test.cls_integer
    # Generate the confusion matrix using sklearn.
    conf_matrix = confusion_matrix(y_true=cls_actual,
                                   y_pred=cls_predicted)
    # Print the matrix.
    print(conf_matrix)
    # Visualize the confusion matrix.
    plt.matshow(conf_matrix)
    plt.colorbar()
    tick_marks = np.arange(num_classes)
    plt.xticks(tick_marks, range(num_classes))
    plt.yticks(tick_marks, range(num_classes))
    plt.xlabel('Predicted class')
    plt.ylabel('True class')
    # Show the plot.
    plt.show()
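To see what the confusion matrix contains, here is a minimal plain-Python sketch that counts the (actual, predicted) pairs by hand for a toy three-class problem. The function name confusion_matrix_counts and the toy labels are assumptions for illustration; the sketch follows the same layout as the plot above, with rows for the true class and columns for the predicted class.

```python
def confusion_matrix_counts(y_true, y_pred, num_classes):
    """Count how often each true class (row) was predicted as each class (column)."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix


# Toy labels for a three-class problem.
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 1, 1]

matrix = confusion_matrix_counts(y_true, y_pred, num_classes=3)
for row in matrix:
    print(row)
# [1, 1, 0]
# [0, 1, 1]
# [0, 0, 2]

# Correct predictions lie on the diagonal.
num_correct = sum(matrix[k][k] for k in range(3))
print(num_correct)  # 4
```

Off-diagonal entries show which classes get confused with each other, for example the single class-1 example here that was predicted as class 2.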
Finally, we are going to define a helper function to help us measure the accuracy of the trained model over the test set:
# Measure the accuracy of the trained model over the test set by splitting it into small batches.
test_batch_size = 256
def test_accuracy(show_errors=False,
                  show_confusionMatrix=False):
    # Number of test images.
    number_test = len(mnist_data.test.images)
    # Define an array of zeros for the predicted classes of the test set, which
    # will be computed in mini-batches and stored in this array.
    # The built-in int is used because the np.int alias was removed in NumPy 1.24.
    cls_predicted = np.zeros(shape=number_test, dtype=int)
    # Compute the predicted classes for the test batches,
    # starting with the batch at index 0.
    i = 0
    while i < number_test:
        # The ending index for the next batch to be processed is j.
        j = min(i + test_batch_size, number_test)
        # Get all the images from the test set between the start and end indices.
        input_images = mnist_data.test.images[i:j, :]
        # Get the actual labels for those images.
        actual_labels = mnist_data.test.labels[i:j, :]
        # Create a feed dict with the corresponding values for the input placeholders.
        feed_dict = {input_values: input_images,
                     y_actual: actual_labels}
        cls_predicted[i:j] = session.run(y_predicted_cls_integer, feed_dict=feed_dict)
        # Set the start of the next batch to the end of the one that we just processed, j.
        i = j
    # Get the actual class numbers of the test images.
    cls_actual = mnist_data.test.cls_integer
    # Check whether the model predictions are correct or not.
    correct = (cls_actual == cls_predicted)
    # Sum up the correct examples.
    correct_number_images = correct.sum()
    # Measure the accuracy by dividing the number of correctly classified images
    # by the total number of images in the test set.
    testset_accuracy = float(correct_number_images) / number_test
    # Show the accuracy.
    print("Accuracy on Test-Set: {0:.1%} ({1} / {2})".format(testset_accuracy, correct_number_images, number_test))
    # Show some examples from the incorrect ones.
    if show_errors:
        print("Example errors:")
        plot_errors(cls_predicted=cls_predicted, correct=correct)
    # Show the confusion matrix of the test set predictions.
    if show_confusionMatrix:
        print("Confusion Matrix:")
        plot_confusionMatrix(cls_predicted=cls_predicted)
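The while loop in test_accuracy walks over the test set in slices [i:j], where the last slice may be shorter than test_batch_size. The sketch below isolates that index arithmetic in plain Python, assuming the 10,000 test images reported in the output of this section. The helper name batch_bounds is hypothetical.

```python
def batch_bounds(num_items, batch_size):
    """Return the (start, end) index pairs visited by the batching loop."""
    bounds = []
    i = 0
    while i < num_items:
        # The end index is clipped so the last batch does not run past the data.
        j = min(i + batch_size, num_items)
        bounds.append((i, j))
        i = j
    return bounds


bounds = batch_bounds(num_items=10000, batch_size=256)
print(len(bounds))  # 40
print(bounds[0])    # (0, 256)
print(bounds[-1])   # (9984, 10000)
```

The first 39 batches hold 256 images each and the final one holds the remaining 16, so every test image is predicted exactly once.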
Let's print the accuracy of the model over the test set before doing any optimization:
test_accuracy()
To see that the optimization process actually improves the model's ability to assign images to their correct classes, let's run it for a single iteration:
optimize(num_iterations=1)
Output:
Iteration: 1, Accuracy Over the training set: 4.7%
test_accuracy()
Output:
Accuracy on Test-Set: 4.4% (437 / 10000)
Now, let's get down to business and kick off a long optimization process of 10,000 iterations:
optimize(num_iterations=9999)  # We have already performed 1 iteration.
At the end of the output, you should see something very close to the following:
Iteration: 7301, Accuracy Over the training set: 96.9%
Iteration: 7401, Accuracy Over the training set: 100.0%
Iteration: 7501, Accuracy Over the training set: 98.4%
Iteration: 7601, Accuracy Over the training set: 98.4%
Iteration: 7701, Accuracy Over the training set: 96.9%
Iteration: 7801, Accuracy Over the training set: 96.9%
Iteration: 7901, Accuracy Over the training set: 100.0%
Iteration: 8001, Accuracy Over the training set: 98.4%
Iteration: 8101, Accuracy Over the training set: 96.9%
Iteration: 8201, Accuracy Over the training set: 100.0%
Iteration: 8301, Accuracy Over the training set: 98.4%
Iteration: 8401, Accuracy Over the training set: 98.4%
Iteration: 8501, Accuracy Over the training set: 96.9%
Iteration: 8601, Accuracy Over the training set: 100.0%
Iteration: 8701, Accuracy Over the training set: 98.4%
Iteration: 8801, Accuracy Over the training set: 100.0%
Iteration: 8901, Accuracy Over the training set: 98.4%
Iteration: 9001, Accuracy Over the training set: 100.0%
Iteration: 9101, Accuracy Over the training set: 96.9%
Iteration: 9201, Accuracy Over the training set: 98.4%
Iteration: 9301, Accuracy Over the training set: 98.4%
Iteration: 9401, Accuracy Over the training set: 100.0%
Iteration: 9501, Accuracy Over the training set: 100.0%
Iteration: 9601, Accuracy Over the training set: 98.4%
Iteration: 9701, Accuracy Over the training set: 100.0%
Iteration: 9801, Accuracy Over the training set: 100.0%
Iteration: 9901, Accuracy Over the training set: 100.0%
Iteration: 10001, Accuracy Over the training set: 98.4%
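The logged values keep repeating a small set of numbers because optimize evaluates the accuracy on the same batch of 64 images that it has just trained on, so every value is a multiple of 1/64. A quick check of that arithmetic, assuming train_batch_size = 64 as set earlier (the helper name batch_accuracy_label is hypothetical):

```python
train_batch_size = 64


def batch_accuracy_label(num_correct, batch_size=train_batch_size):
    """Format the accuracy of one batch the same way the training log does."""
    return "{0:.1%}".format(num_correct / batch_size)


labels = [batch_accuracy_label(n) for n in (64, 63, 62, 3)]
print(labels)  # ['100.0%', '98.4%', '96.9%', '4.7%']
```

In other words, 98.4% means one misclassified image in the batch, 96.9% means two, and the 4.7% reported after the first iteration corresponds to 3 correct images out of 64. These per-batch figures are noisy, which is why the test-set accuracy is the better indicator of how the model performs.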
Now, let's check how well the model generalizes to the test set:
test_accuracy(show_errors=True,
              show_confusionMatrix=True)
Interestingly, we got almost 93% accuracy over the test set while using only a basic convolutional network. This implementation and its results show what a simple convolutional network can do.