Training loss

Next up is the training loss. We take the logits and targets and calculate the softmax cross-entropy loss. First, we need to one-hot encode the targets, since they arrive as encoded characters. Then, we reshape the one-hot targets so that they form a 2D tensor of size (M * N) × C, where C is the number of classes/characters we have. Remember that we reshaped the LSTM outputs and ran them through a fully connected layer with C units, so our logits will also have size (M * N) × C.

Then, we run the logits and targets through tf.nn.softmax_cross_entropy_with_logits and find the mean to get the loss:

def model_loss(logits, targets, lstm_size, num_classes):

    # convert the targets to one-hot encoding and reshape them to match the
    # logits: one row per sequence per step
    output_y_one_hot = tf.one_hot(targets, num_classes)
    output_y_reshaped = tf.reshape(output_y_one_hot, logits.get_shape())

    # use the softmax cross-entropy loss and average it to get a scalar
    model_loss = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=output_y_reshaped)
    model_loss = tf.reduce_mean(model_loss)
    return model_loss
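
To make the shapes concrete, here is a minimal sketch of how model_loss could be wired into the graph. The placeholder names and the batch/step/class sizes below are assumptions chosen for illustration, not part of the original code:

import tensorflow as tf

batch_size, num_steps, num_classes = 100, 100, 83  # assumed example values
lstm_size = 512                                     # assumed example value

# logits come out of the fully connected layer: one row per step of every sequence
logits = tf.placeholder(tf.float32, [batch_size * num_steps, num_classes])
# targets are the encoded characters, one per step in each input sequence
targets = tf.placeholder(tf.int32, [batch_size, num_steps])

# scalar training loss averaged over all (batch_size * num_steps) predictions
loss = model_loss(logits, targets, lstm_size, num_classes)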