Mini-batch generation for training

In this section, we will divide our data into mini-batches to be used for training. Each batch will consist of a desired number of sequences, each with a desired number of sequence steps. Let's look at a visual example in Figure 11:

Figure 11: Illustration of what batches and sequences look like (source: http://oscarmore2.github.io/Anna_KaRNNa_files/charseq.jpeg)

So, now we need to define a function that iterates through the encoded text and generates the batches. In this function, we will use a very handy Python mechanism called yield (see https://jeffknupp.com/blog/2013/04/07/improve-your-python-yield-and-generators-explained/).
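If generators are new to you, here's a minimal sketch of how yield behaves (the count_up_to function is just an illustration, not part of our pipeline). Calling a generator function returns a generator object, and each call to next() resumes execution until the next yield:

def count_up_to(n):
    # A generator: execution pauses at each yield and resumes on next()
    for i in range(n):
        yield i

counter = count_up_to(3)
print(next(counter))  # 0
print(next(counter))  # 1

This lazy behavior is exactly what we want for batching: batches are produced one at a time, on demand, instead of all being held in memory at once.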

A typical batch will have N × M characters, where N is the number of sequences and M is the number of sequence steps. To get the number of possible batches in our dataset, we can simply divide the length of the data by the number of characters per batch (N × M), and from that number of batches, we can derive how many characters to keep in total.
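To make the arithmetic concrete, here's a quick sketch with made-up numbers (assuming, say, 100,000 encoded characters):

num_seq, num_steps = 15, 50
num_char_per_batch = num_seq * num_steps              # 750 characters per batch
num_batches = 100000 // num_char_per_batch            # 133 full batches
num_chars_to_keep = num_batches * num_char_per_batch  # 99,750 characters kept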

After that, we need to split our dataset into the desired number of sequences (N). We can use arr.reshape(size). We know we want N sequences (num_seq in the following code), so let's make that the size of the first dimension. For the second dimension, you can use -1 as a placeholder in the size; it will fill the array with the appropriate data for you. After this, you should have an array that is N × (M * K), where K is the number of batches.
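As a minimal illustration of this reshape step (the array contents here are made up):

import numpy as np

arr = np.arange(12)         # 12 encoded characters: 0, 1, ..., 11
arr = arr.reshape((3, -1))  # N = 3 sequences; -1 infers the 4 columns
# arr is now:
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]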

Now that we have this array, we can iterate through it to get the training batches, where each batch has N × M characters. For each subsequent batch, the window moves over by num_steps. Finally, we also want to create both the input and output arrays to be used as the model's inputs and targets. Creating the output values is very easy; remember that the targets are the inputs shifted over by one character. You'll usually see the first input character used as the last target character, so something like this:
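y[:, :-1], y[:, -1] = x[:, 1:], x[:, 0]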

Where x is the input batch and y is the target batch.

The way I like to slide this window is to use range to take steps of size num_steps, starting from 0 up to arr.shape[1], the total number of characters in each sequence. That way, the integers you get from range always point to the start of a batch, and each window is num_steps wide:

import numpy as np

def generate_character_batches(data, num_seq, num_steps):
    '''Generator that yields batches of size
       num_seq x num_steps from data.
    '''
    # Get the number of characters per batch and the number of full batches
    num_char_per_batch = num_seq * num_steps
    num_batches = len(data) // num_char_per_batch

    # Keep only enough characters to make full batches
    data = data[:num_batches * num_char_per_batch]

    # Reshape the array into num_seq rows
    data = data.reshape((num_seq, -1))

    for i in range(0, data.shape[1], num_steps):
        # The input variables
        input_x = data[:, i:i + num_steps]

        # The output variables, which are the inputs shifted by one character
        output_y = np.zeros_like(input_x)
        output_y[:, :-1], output_y[:, -1] = input_x[:, 1:], input_x[:, 0]
        yield input_x, output_y

So, let's demonstrate this function by generating a batch of 15 sequences with 50 sequence steps each:

generated_batches = generate_character_batches(encoded_vocab, 15, 50)
input_x, output_y = next(generated_batches)
print('input ', input_x[:10, :10])
print(' target ', output_y[:10, :10])
Output:

input
[[70 34 54 29 24 19 76 45 2 79]
[45 19 44 15 16 15 82 44 19 45]
[11 45 44 15 16 34 24 38 34 19]
[45 34 54 64 45 82 19 19 56 45]
[45 11 56 19 45 27 56 19 35 79]
[49 19 54 76 12 45 44 54 12 24]
[45 41 19 45 16 11 45 15 56 24]
[11 35 45 24 11 45 39 54 27 19]
[82 19 66 11 76 19 45 81 19 56]
[12 54 16 19 45 44 15 27 19 45]]

target
[[34 54 29 24 19 76 45 2 79 79]
[19 44 15 16 15 82 44 19 45 16]
[45 44 15 16 34 24 38 34 19 54]
[34 54 64 45 82 19 19 56 45 82]
[11 56 19 45 27 56 19 35 79 35]
[19 54 76 12 45 44 54 12 24 45]
[41 19 45 16 11 45 15 56 24 11]
[35 45 24 11 45 39 54 27 19 33]
[19 66 11 76 19 45 81 19 56 24]
[54 16 19 45 44 15 27 19 45 24]]

Next up, we'll build the core of this example: the LSTM model.
