Multi-class cross-entropy loss is used in multi-class classification, such as the MNIST digit classification problem from Chapter 2, Deep Learning and Convolutional Neural Networks. As above, we start from the cross-entropy function; after a few calculations, we obtain the multi-class cross-entropy loss L for each training example:

L = -Σ_c y_c log(p_c)
Here, y_c is 0 or 1, indicating whether class label c is the correct classification for the example, and p_c is the predicted probability for class c. To use this loss, we first need to add a softmax activation to the output of the final FC layer in our model. Substituting the softmax, the combined cross-entropy with softmax looks like this:

L = -Σ_c y_c log(e^(s_c) / Σ_j e^(s_j))
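For concreteness, here is the loss worked out for a single hypothetical three-class example (made-up numbers) where the true class is the second one:

y = (0, 1, 0), p = (0.1, 0.7, 0.2)
L = -(0·log 0.1 + 1·log 0.7 + 0·log 0.2) = -log 0.7 ≈ 0.357

Only the term for the correct class contributes, so the loss is simply the negative log of the probability the model assigns to the correct class; it shrinks toward zero as that probability approaches 1.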
It is useful to know that the raw outputs of our model are called logits; logits are what is passed to the softmax function. The softmax function is the multi-class generalization of the sigmoid function: it converts a vector of logits into a probability distribution over the classes. Once the logits have been passed through softmax, we can apply our multi-class cross-entropy loss. TensorFlow actually combines all these steps into one operation, as shown:
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=model_logits, labels=labels_in))
Because tf.nn.softmax_cross_entropy_with_logits returns one loss value per image in the batch, we wrap it in tf.reduce_mean to get the average loss over the batch.
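To make the combined operation concrete, here is a small NumPy sketch of what it computes (an illustration with made-up logits and one-hot labels, not the TensorFlow implementation): softmax over each row of logits, then the cross-entropy against the labels, then the mean over the batch, mirroring the tf.reduce_mean call above.

```python
import numpy as np

def softmax(logits):
    # Subtract the per-row max before exponentiating for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy_loss(logits, labels):
    # labels are one-hot rows; per-example loss is -sum(y_c * log(p_c))
    probs = softmax(logits)
    per_example = -np.sum(labels * np.log(probs), axis=1)
    # Average over the batch, as tf.reduce_mean does
    return per_example.mean()

# A hypothetical batch of two examples with three classes
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
labels = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
print(cross_entropy_loss(logits, labels))
```

Note how a confident, correct prediction (a large logit for the true class) drives its per-example loss toward zero, while a uniform prediction yields a loss of log(number of classes).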