Generating song lyrics using RNNs

Now that we have learned enough about RNNs, let's look at how to generate song lyrics with them. To do this, we build a character-level RNN, meaning that at every time step we predict a new character.

Let's consider a small sentence, What a beautiful d.

At the first time step, the RNN predicts a new character, a, and the sentence is updated to What a beautiful da.

At the next time step, it predicts a new character, y, and the sentence becomes What a beautiful day.

In this manner, we predict a new character at each time step and generate a song. Instead of predicting a new character at every step, we can also predict a new word at every step, which is called a word-level RNN. For simplicity, let's start with a character-level RNN.
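Before a character-level RNN can predict anything, each character in the text has to be mapped to an integer index in a vocabulary. The following is a minimal sketch of that preprocessing step; the variable names (lyrics, char_to_idx, idx_to_char) are illustrative, not from the book:

```python
# Hypothetical preprocessing sketch: build a character vocabulary from text.
lyrics = "What a beautiful day"

# The unique characters in the text form our vocabulary
vocab = sorted(set(lyrics))

# Map each character to an integer index, and back again
char_to_idx = {ch: i for i, ch in enumerate(vocab)}
idx_to_char = {i: ch for ch, i in char_to_idx.items()}

# Encode a string as a sequence of indices
encoded = [char_to_idx[ch] for ch in "day"]
print(vocab)
print(encoded)
```

A word-level RNN would be built the same way, except the vocabulary would contain whole words instead of single characters, making it much larger.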

But how does the RNN predict a new character at each time step? Suppose that at time step t=0, we feed in an input character, say x. The RNN now predicts the next character based on the given input character x. To do this, it predicts the probability of every character in our vocabulary being the next character. Once we have this probability distribution, we randomly select the next character based on it. Confused? Let's understand this better with an example.

For instance, as shown in the following figure, let's suppose that our vocabulary contains four characters, L, O, V, and E; when we feed the character L as an input, the RNN computes the probability of every character in the vocabulary being the next character:

So, we have the probabilities as [0.0, 0.9, 0.0, 0.1], corresponding to the characters in the vocabulary [L,O,V,E]. With this probability distribution, we select O as the next character 90% of the time, and E as the next character 10% of the time. Predicting the next character by sampling from this probability distribution adds some randomness to the output.
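Sampling from the distribution described above can be sketched in a few lines of NumPy. The probabilities below are taken from the figure; the seeded generator is only there to make the sketch reproducible:

```python
import numpy as np

# The vocabulary and the RNN's predicted probability distribution
# [0.0, 0.9, 0.0, 0.1] for the characters [L, O, V, E]
vocab = ['L', 'O', 'V', 'E']
probs = [0.0, 0.9, 0.0, 0.1]

rng = np.random.default_rng(0)

# Sample the next character: 'O' about 90% of the time,
# 'E' about 10% of the time, never 'L' or 'V'
next_char = rng.choice(vocab, p=probs)
print(next_char)

# Sampling many times shows the frequencies match the distribution
samples = rng.choice(vocab, size=1000, p=probs)
print((samples == 'O').mean())  # roughly 0.9
```

Sampling like this, rather than always taking the most probable character (greedy selection), is what introduces the randomness mentioned above and keeps the generated lyrics from being deterministic.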

On the next time step, we feed the predicted character from the previous time step and the previous hidden state as an input to predict the next character, as shown in the following figure:

So, at each time step, we feed the predicted character from the previous time step along with the previous hidden state as input, and predict the next character, as shown here:


As you can see in the preceding figure, at time step t=2, V is passed as an input, and the RNN predicts the next character as E. But this does not mean that whenever the character V is given as input, the RNN should always return E as output. Since we pass the input along with the previous hidden state, the RNN has a memory of all the characters it has seen so far.

So, the previous hidden state captures the essence of the previous input characters, which are L and O. Now, with this previous hidden state and the input V, the RNN predicts the next character as E.
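The generation loop described above can be sketched with a vanilla RNN cell in NumPy. The weights here are random and untrained, so the generated text is gibberish; the point is the mechanics: one-hot encode the input character, update the hidden state from the input and the previous hidden state, apply softmax to get a distribution, sample, and feed the sample back in. All names and sizes are illustrative:

```python
import numpy as np

vocab = ['L', 'O', 'V', 'E']
vocab_size = len(vocab)
hidden_size = 8
rng = np.random.default_rng(42)

# Randomly initialized parameters (in practice these are learned by training)
Wxh = rng.normal(0, 0.1, (hidden_size, vocab_size))   # input  -> hidden
Whh = rng.normal(0, 0.1, (hidden_size, hidden_size))  # hidden -> hidden
Why = rng.normal(0, 0.1, (vocab_size, hidden_size))   # hidden -> output

def step(x_idx, h):
    """One RNN time step: return next-character probabilities and new hidden state."""
    x = np.zeros(vocab_size)
    x[x_idx] = 1.0                       # one-hot encode the input character
    h = np.tanh(Wxh @ x + Whh @ h)       # new hidden state from input + old state
    logits = Why @ h
    probs = np.exp(logits) / np.exp(logits).sum()  # softmax over the vocabulary
    return probs, h

# Generate a few characters, starting from 'L' with a zero hidden state
h = np.zeros(hidden_size)
idx = vocab.index('L')
generated = ['L']
for _ in range(5):
    probs, h = step(idx, h)              # distribution over the next character
    idx = rng.choice(vocab_size, p=probs)  # sample, then feed back in next step
    generated.append(vocab[idx])

print(''.join(generated))
```

Because the hidden state h is carried from step to step, the prediction for V at t=2 depends on having already seen L and O, which is exactly the memory described above.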
