Start generating songs

Start the TensorFlow session and initialize all the variables:

sess = tf.Session()

init = tf.global_variables_initializer()

sess.run(init)

Now, we will look at how to generate song lyrics using an RNN. What should the input and output of the RNN be? How does it learn? What is the training data? Let's understand this with an explanation, along with the code, step by step.

We know that in RNNs, the output predicted at a time step will be sent as the input to the next time step; that is, on every time step, we need to feed the predicted character from the previous time step as input. So, we prepare our dataset in the same way.

For instance, look at the following table. Let's suppose that each row is a different time step; at a time step t, the RNN predicted a new character, g, as the output. This will be sent as the input to the next time step, t+1.

However, if you notice the input at the time step t+1, we removed the first character, o, from the input and added the newly predicted character, g, at the end of our sequence. Why are we removing the first character from the input? Because we need to maintain the sequence length.

Let's suppose that our sequence length is eight; adding a newly predicted character to our sequence increases the sequence length to nine. To avoid this, we remove the first character from the input, while adding a newly predicted character from the previous time step.

Similarly, in the output data, we also remove the first character on each time step, because once it predicts the new character, the sequence length increases. To avoid this, we remove the first character from the output on each time step, as shown in the following table:
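The shifting described above can be sketched with plain Python strings. The sequence and the predicted character here are hypothetical stand-ins for one row of the table:

```python
seq = "Look at "               # current input, sequence length eight
predicted = "g"                # hypothetical character predicted by the RNN
next_seq = seq[1:] + predicted # drop the first character, append the prediction
# next_seq is "ook at g", still eight characters long
```

Dropping the head and appending the prediction keeps the window a fixed length, which is exactly why the first character disappears in each row of the table.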

Now, we will look at how we can prepare our input and output sequence similarly to the preceding table.

Define a variable called pointer, which points to the character in our dataset. We will set our pointer to 0, which means it points to the first character:

pointer = 0

Define the input data:

input_sentence = data[pointer: pointer + seq_length]

What does this mean? With the pointer and the sequence length, we slice the data. Consider that the seq_length is 25 and the pointer is 0. It will return the first 25 characters as input. So, data[pointer:pointer + seq_length] returns the following output:

"Look at her face, it's a "

Define the output, as follows:

output_sentence = data[pointer + 1: pointer + seq_length + 1]

We slice the output data one character ahead of the input data. So, data[pointer + 1:pointer + seq_length + 1] returns the following:

"ook at her face, it's a w"

As you can see, we added the next character in the preceding sentence and removed the first character. So, on every iteration, we increment the pointer and traverse the entire dataset. This is how we obtain the input and output sentence for training the RNN.
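Putting the pointer, the input slice, and the output slice together, the traversal can be sketched as follows. The data string here is a toy corpus chosen for illustration, not the book's actual dataset:

```python
data = "Look at her face, it's a wonderful face"  # toy corpus (assumption)
seq_length = 25

pairs = []
pointer = 0
# slide the pointer over the dataset, one window per iteration
while pointer + seq_length + 1 <= len(data):
    input_sentence = data[pointer: pointer + seq_length]
    output_sentence = data[pointer + 1: pointer + seq_length + 1]
    pairs.append((input_sentence, output_sentence))
    pointer += seq_length
```

Each pair is an input window and the same window shifted one character ahead, which is precisely the teacher signal the RNN trains on.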

As we have learned, an RNN accepts only numbers as input. Once we have sliced the input and output sequence, we get the indices of the respective characters, using the char_to_ix dictionary that we defined:

input_indices = [char_to_ix[ch] for ch in input_sentence]
target_indices = [char_to_ix[ch] for ch in output_sentence]

Convert the indices into one-hot encoded vectors by using the one_hot_encoder function we defined previously:

input_vector = one_hot_encoder(input_indices)
target_vector = one_hot_encoder(target_indices)
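If you haven't seen the earlier definitions, char_to_ix and one_hot_encoder can be sketched minimally as follows. The toy corpus and this particular implementation are assumptions; the book's actual definitions may differ in detail:

```python
import numpy as np

data = "hello"                                       # toy corpus (assumption)
chars = sorted(set(data))
vocab_size = len(chars)                              # 4: 'e', 'h', 'l', 'o'
char_to_ix = {ch: i for i, ch in enumerate(chars)}
ix_to_char = {i: ch for ch, i in char_to_ix.items()}

def one_hot_encoder(indices):
    # one row per index, one column per vocabulary character
    vectors = np.zeros((len(indices), vocab_size))
    vectors[np.arange(len(indices)), indices] = 1
    return vectors

input_vector = one_hot_encoder([char_to_ix[ch] for ch in "hell"])
# input_vector has shape (4, vocab_size), one one-hot row per character
```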

The input_vector and target_vector become the input and output for training the RNN. Now, let's start training.

The hprev_val variable stores the last hidden state of our trained RNN model, which we use for making predictions, and we store the loss in a variable called loss_val:

hprev_val, loss_val, _ = sess.run([hprev, loss, updated_gradients],
                                  feed_dict={inputs: input_vector,
                                             targets: target_vector,
                                             init_state: hprev_val})

We train the model for n iterations. After training, we start making predictions. Now, we will look at how to make predictions and generate song lyrics using our trained RNN. Set the sample_length, that is, the length of the sentence (song) we want to generate:

sample_length = 500

Randomly select the starting index of the input sequence:

random_index = random.randint(0, len(data) - seq_length)

Select the input sentence with the randomly selected index:

sample_input_sent = data[random_index:random_index + seq_length]

As we know, we need to feed the input as numbers; convert the selected input sentence to indices:

sample_input_indices = [char_to_ix[ch] for ch in sample_input_sent]

Remember, we stored the last hidden state of the RNN in hprev_val. We use that for making predictions. We create a new variable called sample_prev_state_val by copying the values from hprev_val.

The sample_prev_state_val is used as an initial hidden state for making predictions:

sample_prev_state_val = np.copy(hprev_val)

Initialize the list for storing the predicted output indices:

predicted_indices = []

Now, for t in range(sample_length), we perform the following steps and generate the song of the defined sample_length.

Convert the sample_input_indices to one-hot encoded vectors:

sample_input_vector = one_hot_encoder(sample_input_indices)

Feed the sample_input_vector, and also the hidden state sample_prev_state_val, as the initial hidden state to the RNN, and get the predictions. We store the output probability distribution in probs_dist:

probs_dist, sample_prev_state_val = sess.run([output_softmax, hprev],
                                             feed_dict={inputs: sample_input_vector,
                                                        init_state: sample_prev_state_val})

Randomly select the index of the next character with the probability distribution generated by the RNN:

ix = np.random.choice(range(vocab_size), p=probs_dist.ravel())
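The sampling step can be tried in isolation. The probability distribution below is a hypothetical softmax output for a vocabulary of four characters, not one produced by the trained model:

```python
import numpy as np

vocab_size = 4
# hypothetical softmax output of the RNN for one time step
probs_dist = np.array([[0.1, 0.2, 0.6, 0.1]])

# draw the next character's index according to the distribution
ix = np.random.choice(range(vocab_size), p=probs_dist.ravel())
```

Sampling from the distribution, rather than always taking argmax, is what gives the generated lyrics their variety: likely characters are picked most often, but unlikely ones still appear occasionally.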

Add this newly predicted index, ix, to the sample_input_indices, and also remove the first index from sample_input_indices to maintain the sequence length. This will form the input for the next time step:

sample_input_indices = sample_input_indices[1:] + [ix]
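Here is that update on its own, with hypothetical index values, to show that the window length is preserved:

```python
sample_input_indices = [3, 1, 4, 1, 5]  # hypothetical current input indices
ix = 9                                  # hypothetical newly predicted index
# drop the first index, append the prediction; length stays the same
sample_input_indices = sample_input_indices[1:] + [ix]
```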

Store all the predicted character indices in the predicted_indices list:

predicted_indices.append(ix)

Convert all the predicted_indices to their characters:

predicted_chars = [ix_to_char[ix] for ix in predicted_indices]

Combine all the predicted_chars and save them as text:

text = ''.join(predicted_chars)
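The decode-and-join step works like this on a toy vocabulary; the mapping and indices below are hypothetical:

```python
ix_to_char = {0: 'h', 1: 'e', 2: 'l', 3: 'o'}  # hypothetical mapping
predicted_indices = [0, 1, 2, 2, 3]            # hypothetical predictions
predicted_chars = [ix_to_char[ix] for ix in predicted_indices]
text = ''.join(predicted_chars)                # "hello"
```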

Print the predicted text on every 50,000th iteration:

print('\n')
print('After %d iterations' % (iteration))
print('%s' % (text,))
print('-' * 115)

Increment the pointer and iteration:

pointer += seq_length
iteration += 1

On the initial iteration, you can see that the RNN has generated random characters. But by the 50,000th iteration, it has started to generate meaningful text:

 After 0 iterations

 Y?a6C.-eMSfk0pHD v!74YNeI 3YeP,h- h6AADuANJJv:HA(QXNeKzwCjBnAShbavSrGw7:ZcSv[!?dUno Qt?OmE-PdY wrqhSu?Yvxdek?5Rn'Pj!n5:32a?cjue  ZIj
Xr6qn.scqpa7)[MSUjG-Sw8n3ZexdUrLXDQ:MOXBMX EiuKjGudcznGMkF:Y6)ynj0Hiajj?d?n2Iapmfc?WYd BWVyB-GAxe.Hq0PaEce5H!u5t: AkO?F(oz0Ma!BUMtGtSsAP]Oh,1nHf5tZCwU(F?X5CDzhOgSNH(4Cl-Ldk? HO7 WD9boZyPIDghWUfY B:r5z9Muzdw2'WWtf4srCgyX?hS!,BL GZHqgTY:K3!wn:aZGoxr?zmayANhMKJsZhGjpbgiwSw5Z:oatGAL4Xenk]jE3zJ?ymB6v?j7(mL[3DFsO['Hw-d7htzMn?nm20o'?6gfPZhBa
NlOjnBd2n0 T"d'e1k?OY6Wwnx6d!F 

----------------------------------------------------------------------------------------------

 After 50000 iterations

 Hem-:]  
[Ex" what  
Akn'lise  
[Grout his bring bear.  
Gnow ourd?  
Thelf  
As cloume  
That hands, Havi Musking me Mrse your leallas, Froking the cluse (have: mes.  
I slok and if a serfres me the sky withrioni flle rome.....Ba tut get make ome  
But it lives I dive.  
[Lett it's to the srom of and a live me it's streefies  
And is.  
As it and is me dand a serray]  
[zrtye:"  
Chay at your hanyer  
[Every rigbthing with farclets  
  
[Brround.  
Mad is trie  
[Chare's a day-Mom shacke?

, I  

-------------------------------------------------------------------------------------------------