Training the model

Now, let's kick off the training process by feeding the inputs and targets to the model we just built and letting the optimizer update the network. Don't forget that the prediction for the current batch depends on the state produced by the previous one, so we need to feed the final state of each batch back into the network as the initial state for the next one.

Let's provide initial values for our hyperparameters (you can tune them afterwards depending on the dataset you are using to train this architecture):


batch_size = 100 # Sequences per batch
num_steps = 100 # Number of sequence steps per batch
lstm_size = 512 # Size of hidden layers in LSTMs
num_layers = 2 # Number of LSTM layers
learning_rate = 0.001 # Learning rate
keep_probability = 0.5 # Dropout keep probability
epochs = 5

# Save a checkpoint every N iterations
save_every_n = 100
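
With these settings, and assuming generate_character_batches from the previous section yields x and y arrays of shape (batch_size, num_steps), every training step consumes batch_size × num_steps = 100 × 100 = 10,000 characters of the encoded text, and a checkpoint is written every 100 steps.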

LSTM_model = CharLSTM(len(language_vocab), batch_size=batch_size, num_steps=num_steps,
                      lstm_size=lstm_size, num_layers=num_layers,
                      learning_rate=learning_rate)

saver = tf.train.Saver(max_to_keep=100)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # Use the line below to load a checkpoint and resume training
    #saver.restore(sess, 'checkpoints/______.ckpt')
    counter = 0
    for e in range(epochs):
        # Train network
        new_state = sess.run(LSTM_model.initial_state)
        loss = 0
        for x, y in generate_character_batches(encoded_vocab, batch_size, num_steps):
            counter += 1
            start = time.time()
            feed = {LSTM_model.inputs: x,
                    LSTM_model.targets: y,
                    LSTM_model.keep_prob: keep_probability,
                    LSTM_model.initial_state: new_state}
            batch_loss, new_state, _ = sess.run([LSTM_model.loss,
                                                 LSTM_model.final_state,
                                                 LSTM_model.optimizer],
                                                feed_dict=feed)

            end = time.time()
            print('Epoch number: {}/{}... '.format(e+1, epochs),
                  'Step: {}... '.format(counter),
                  'loss: {:.4f}... '.format(batch_loss),
                  '{:.3f} sec/batch'.format((end-start)))

            if (counter % save_every_n == 0):
                saver.save(sess, "checkpoints/i{}_l{}.ckpt".format(counter, lstm_size))

    saver.save(sess, "checkpoints/i{}_l{}.ckpt".format(counter, lstm_size))

At the end of the training process, you should see training loss values close to these:

.
.
.
Epoch number: 5/5... Step: 978... loss: 1.7151... 0.050 sec/batch
Epoch number: 5/5... Step: 979... loss: 1.7428... 0.051 sec/batch
Epoch number: 5/5... Step: 980... loss: 1.7151... 0.050 sec/batch
Epoch number: 5/5... Step: 981... loss: 1.7236... 0.050 sec/batch
Epoch number: 5/5... Step: 982... loss: 1.7314... 0.051 sec/batch
Epoch number: 5/5... Step: 983... loss: 1.7369... 0.051 sec/batch
Epoch number: 5/5... Step: 984... loss: 1.7075... 0.065 sec/batch
Epoch number: 5/5... Step: 985... loss: 1.7304... 0.051 sec/batch
Epoch number: 5/5... Step: 986... loss: 1.7128... 0.049 sec/batch
Epoch number: 5/5... Step: 987... loss: 1.7107... 0.051 sec/batch
Epoch number: 5/5... Step: 988... loss: 1.7351... 0.051 sec/batch
Epoch number: 5/5... Step: 989... loss: 1.7260... 0.049 sec/batch
Epoch number: 5/5... Step: 990... loss: 1.7144... 0.051 sec/batch
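
When training finishes, the checkpoints directory should contain one checkpoint for every save_every_n steps plus the final one saved after the loop. As a quick sanity check (this snippet is not part of the training listing above), you can list the saved checkpoints and grab the most recent path, which is also what you would pass to saver.restore() in the commented-out line above if you want to resume training:

import tensorflow as tf

# List every checkpoint recorded in the 'checkpoints' directory
ckpt_state = tf.train.get_checkpoint_state('checkpoints')
if ckpt_state is not None:
    for checkpoint_path in ckpt_state.all_model_checkpoint_paths:
        print(checkpoint_path)

# Path of the most recent checkpoint, e.g. to pass to saver.restore()
print(tf.train.latest_checkpoint('checkpoints'))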