Generating text

We have a trained model based on our input dataset. The next step is to use this trained model to generate text and see how well it has learned the style and structure of the input data. To do this, we start with some initial characters and then feed each newly predicted character back in as the input at the next step. We repeat this process until we have generated text of the desired length.
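Conceptually, the sampling loop looks like the following minimal sketch. Here, predict_next_char is a hypothetical stand-in for a single forward pass through the trained network (it just picks a random character so that the sketch runs on its own); the real implementation, which restores the trained TensorFlow model and carries the LSTM state along, follows:

import random

def predict_next_char(history):
    # Hypothetical stand-in for the trained network: picks a random
    # character. In the real code below, this is a forward pass
    # through the restored LSTM.
    return random.choice('abcdefghijklmnopqrstuvwxyz ')

def generate(prime, length):
    # Start from the priming text and repeatedly feed the newest
    # character back in as the next input
    text = list(prime)
    for _ in range(length):
        text.append(predict_next_char(text))
    return ''.join(text)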

In the full implementation that follows, we also add extra statements to prime the network with some initial text so that generation can continue from there.

The network gives us predictions, or probabilities, for each character in the vocabulary. To reduce noise and use only the characters that the network is most confident about, we will choose the next character only from the top N most probable characters in the output:

def choose_top_n_characters(preds, vocab_size, top_n_chars=4):
    # Flatten the prediction tensor into a 1-D probability array
    p = np.squeeze(preds)
    # Zero out everything except the top_n_chars most probable characters
    p[np.argsort(p)[:-top_n_chars]] = 0
    # Renormalize so the remaining probabilities sum to 1
    p = p / np.sum(p)
    # Sample the next character index from the reduced distribution
    c = np.random.choice(vocab_size, 1, p=p)[0]
    return c
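To see what the top-N filtering does, consider a hypothetical prediction over a five-character vocabulary. With top_n_chars=2, only the two most probable characters survive the cut, and their probabilities are renormalized before sampling:

import numpy as np

# Hypothetical prediction over a vocabulary of 5 characters
toy_preds = np.array([[0.05, 0.4, 0.1, 0.3, 0.15]])

# Only indices 1 and 3 survive; after renormalization they are sampled
# with probabilities 0.4/0.7 ≈ 0.57 and 0.3/0.7 ≈ 0.43
print(choose_top_n_characters(toy_preds, vocab_size=5, top_n_chars=2))  # prints 1 or 3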
def sample_from_LSTM_output(checkpoint, n_samples, lstm_size, vocab_size, prime="The "):
    # Start the generated text with the priming characters
    samples = [char for char in prime]

    # Rebuild the model graph with batch size 1 for sampling
    LSTM_model = CharLSTM(vocab_size, lstm_size=lstm_size, sampling=True)
    saver = tf.train.Saver()

    with tf.Session() as sess:
        # Restore the trained weights from the checkpoint
        saver.restore(sess, checkpoint)
        new_state = sess.run(LSTM_model.initial_state)

        # Feed the priming text through the network to warm up its state
        x = np.zeros((1, 1))
        for char in prime:
            x[0, 0] = vocab_to_integer[char]
            feed = {LSTM_model.inputs: x,
                    LSTM_model.keep_prob: 1.,
                    LSTM_model.initial_state: new_state}
            preds, new_state = sess.run([LSTM_model.prediction, LSTM_model.final_state],
                                        feed_dict=feed)

        c = choose_top_n_characters(preds, vocab_size)
        samples.append(integer_to_vocab[c])

        # Feed each predicted character back in to generate the next one
        for i in range(n_samples):
            x[0, 0] = c
            feed = {LSTM_model.inputs: x,
                    LSTM_model.keep_prob: 1.,
                    LSTM_model.initial_state: new_state}
            preds, new_state = sess.run([LSTM_model.prediction, LSTM_model.final_state],
                                        feed_dict=feed)

            c = choose_top_n_characters(preds, vocab_size)
            samples.append(integer_to_vocab[c])

    return ''.join(samples)

Let's start the sampling process using the latest saved checkpoint:

tf.train.latest_checkpoint('checkpoints')
Output:
'checkpoints/i990_l512.ckpt'

Now, it's time to sample using this latest checkpoint:

checkpoint = tf.train.latest_checkpoint('checkpoints')
sampled_text = sample_from_LSTM_output(checkpoint, 1000, lstm_size, len(language_vocab), prime="Far")
print(sampled_text)
Output:
INFO:tensorflow:Restoring parameters from checkpoints/i990_l512.ckpt
Farcial the
confiring to the mone of the correm and thinds. She
she saw the
streads of herself hand only astended of the carres to her his some of the princess of which he came him of
all that his white the dreasing of
thisking the princess and with she was she had
bettee a still and he was happined, with the pood on the mush to the peaters and seet it.

"The possess a streatich, the may were notine at his mate a misted
and the
man of the mother at the same of the seem her
felt. He had not here.

"I conest only be alw you thinking that the partion
of their said."

"A much then you make all her
somether. Hower their centing
about
this, and I won't give it in
himself.
I had not come at any see it will that there she chile no one that him.

"The distiction with you all.... It was
a mone of the mind were starding to the simple to a mone. It to be to ser in the place," said Vronsky.
"And a plais in
his face, has alled in the consess on at they to gan in the sint
at as that
he would not be and t

You can see that we were able to generate some meaningful words along with some meaningless ones. To get better results, you can train the model for more epochs and experiment with the hyperparameters.
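You can also sample from an earlier checkpoint to see how much the model improved during training; text generated early on should look noticeably less English-like. The checkpoint name below is hypothetical and simply follows the iNNN_l512.ckpt naming pattern shown above; substitute one that actually exists in your checkpoints directory:

# Hypothetical earlier checkpoint following the same naming pattern
early_checkpoint = 'checkpoints/i200_l512.ckpt'
sampled_text = sample_from_LSTM_output(early_checkpoint, 1000, lstm_size, len(language_vocab), prime="Far")
print(sampled_text)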
