How it works...

In step 1, we loaded the MNIST dataset. The x data was a 3D array of grayscale values of the form (images, width, height). In step 2, we flattened these 28x28 images into vectors of length 784. Then, we rescaled the grayscale values from the [0, 255] range to values between 0 and 1. In step 3, we one-hot encoded the target variable using the to_categorical() function from keras, converting it into a binary class matrix with one column per digit.
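
The following is a minimal sketch of these preprocessing steps, assuming the built-in keras MNIST loader; the object names are illustrative:

library(keras)

# load MNIST; the x arrays have shape (images, 28, 28)
mnist <- dataset_mnist()
x_train <- mnist$train$x
y_train <- mnist$train$y

# flatten each 28x28 image into a vector of length 784
x_train <- array_reshape(x_train, c(nrow(x_train), 784))

# rescale grayscale values from [0, 255] to [0, 1]
x_train <- x_train / 255

# one-hot encode the 10 digit classes into a binary class matrix
y_train <- to_categorical(y_train, num_classes = 10)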

In step 4, we built a sequential model by stacking dense and dropout layers. In a dense layer, every neuron receives input from all the neurons of the previous layer, which is why it is known as densely connected. In our model, each layer applied an activation function to the weighted sum of its inputs from the previous layer. We used the relu activation function in the hidden layers and the softmax activation function in the last layer since we had 10 possible outcomes. Dropout layers are used for regularizing deep learning models. Dropout refers to randomly ignoring certain neurons during a particular forward or backward pass in the training phase in order to prevent overfitting. The summary() function provides us with a summary of the model; it gives us information about each layer, such as its output shape and the number of parameters it contains. A sketch of such a stack is shown after this paragraph.
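
The layer widths and dropout rates below are illustrative assumptions, not necessarily the recipe's exact values:

# stack dense and dropout layers; the final softmax layer has
# 10 units, one per digit class
model <- keras_model_sequential() %>%
  layer_dense(units = 256, activation = "relu", input_shape = c(784)) %>%
  layer_dropout(rate = 0.4) %>%
  layer_dense(units = 128, activation = "relu") %>%
  layer_dropout(rate = 0.3) %>%
  layer_dense(units = 10, activation = "softmax")

# print each layer's output shape and parameter count
summary(model)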

In step 5, we compiled the model using the compile() function from keras. We applied the rmsprop optimizer to find the weights and biases that minimize our objective loss function, categorical_crossentropy, which is appropriate for multi-class classification with one-hot encoded targets. The metrics argument specifies the metrics to be evaluated by the model during training.
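
A compile call along these lines would match the description; tracking accuracy as the metric is an assumption:

model %>% compile(
  optimizer = optimizer_rmsprop(),      # rmsprop optimizer
  loss = "categorical_crossentropy",    # objective loss function
  metrics = c("accuracy")               # evaluated during training
)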

In step 6, we trained our model for a fixed number of iterations over the training data, which is defined by the epochs argument. The validation_split argument takes a float value between 0 and 1 and specifies the fraction of the training data to be set aside as validation data. Finally, batch_size defines the number of samples that are propagated through the network before the weights are updated. The history object records the training metrics for each epoch and contains two lists, params and metrics. params contains the training parameters, such as the batch size, the number of steps, and so on, while metrics contains per-epoch model metrics such as loss and accuracy.
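
A sketch of the training call follows; the epochs, batch_size, and validation_split values here are illustrative assumptions:

history <- model %>% fit(
  x_train, y_train,
  epochs = 20,              # number of passes over the training data
  batch_size = 128,         # samples per weight update
  validation_split = 0.2    # fraction of data held out for validation
)

# inspect the training parameters and the per-epoch metrics
history$params
history$metrics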

In step 7, we visualized the model's accuracy and loss metrics across epochs. In step 8, we used our model to generate class predictions for the test data using the predict_classes() function. Lastly, we evaluated the model's loss and accuracy on the test data using the evaluate() function.
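
These last steps might look as follows, assuming x_test and y_test were prepared in the same way as the training data:

# plot per-epoch accuracy and loss for training and validation
plot(history)

# predicted digit labels for the test images
predictions <- model %>% predict_classes(x_test)

# test-set loss and accuracy
model %>% evaluate(x_test, y_test)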
