Experimenting with reduced batch size

The code that we'll be using for this experiment is as follows:

# Model architecture
model <- keras_model_sequential() %>%
  layer_embedding(input_dim = 1500,
                  output_dim = 32,
                  input_length = 400) %>%
  layer_conv_1d(filters = 32,
                kernel_size = 5,
                padding = "valid",
                activation = "relu",
                strides = 1) %>%
  layer_max_pooling_1d(pool_size = 4) %>%
  layer_dropout(rate = 0.25) %>%
  layer_lstm(units = 32) %>%
  layer_dense(units = 50, activation = "softmax")

# Compiling the model
model %>% compile(optimizer = "adam",
                  loss = "categorical_crossentropy",
                  metrics = c("acc"))

# Fitting the model
model_two <- model %>% fit(trainx, trainy,
                           epochs = 30,
                           batch_size = 16,
                           validation_data = list(validx, validy))

# Plot of loss and accuracy
plot(model_two)

From the preceding code, we can make the following observations:

  • We have updated the model architecture by specifying input_dim as 1,500 and input_length as 400 (the tokenization and padding these values assume is sketched after this list).
  • We have reduced the batch size used when fitting the model from 32 to 16.
  • To address the overfitting problem, we have added a dropout layer with a rate of 25%.
  • We have kept all other settings the same as those we used for the previous model.
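
Note that input_dim = 1500 and input_length = 400 assume that the article text has already been converted into integer sequences restricted to the 1,500 most frequent words and padded (or truncated) to 400 tokens each, and that the author labels have been one-hot encoded. The following is a minimal sketch of that assumed preprocessing using the keras helper functions; article_text and author_id are hypothetical placeholder names for the raw articles and their numeric author labels, not objects from the original code.

library(keras)

# Hypothetical inputs (placeholder names):
# article_text - a character vector with one element per article
# author_id    - an integer author label (1 to 50) for each article

# Keep only the 1,500 most frequent words, matching input_dim = 1500
tokenizer <- text_tokenizer(num_words = 1500) %>%
  fit_text_tokenizer(article_text)

# Convert each article into a sequence of integer word indices
seqs <- texts_to_sequences(tokenizer, article_text)

# Pad or truncate every sequence to 400 tokens, matching input_length = 400
padded <- pad_sequences(seqs, maxlen = 400)

# One-hot encode the 50 author labels for categorical_crossentropy
labels <- to_categorical(author_id - 1, num_classes = 50)

# padded and labels would then be split into the trainx/validx/testx
# and trainy/validy/testy objects used when fitting the model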

The loss and accuracy values based on the training and validation data for each of the 30 epochs are stored in model_two. The results can be seen in the following plot:

The preceding plot indicates that the loss and accuracy values for the validation data stay flat for the last few epochs. However, they do not deteriorate. Next, we will obtain the loss and accuracy values based on the training and test data using the evaluate function, as follows:

# Loss and accuracy for train data
model %>% evaluate(trainx, trainy)
$loss
[1] 0.3890106
$acc
[1] 0.9133034

# Loss and accuracy for test data
model %>% evaluate(testx, testy)
$loss
[1] 2.710119
$acc
[1] 0.308

From the preceding code and output, we can observe that the loss and accuracy values for the training data are better than those of the previous model. For the test data, however, although the accuracy is better, the loss is slightly worse.

The accuracy obtained when classifying each author's articles in the test data can be seen in the following bar plot (a sketch of how this per-author accuracy can be computed follows the observations below):

From the preceding bar plot, we can make the following observations:

  • The bar plot shows a visual improvement over the previous model.
  • In the previous model, four authors had none of their test articles classified correctly; now, every author has at least one article classified correctly.
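
Such a per-author accuracy can be obtained by comparing the predicted and actual author of each test article and tabulating the proportion classified correctly for every author. The following is a minimal sketch of one way to do this; it assumes testy is the same one-hot label matrix used with evaluate, and it uses base R's barplot(), which is an assumption rather than the exact plotting code behind the figure above.

# Predicted author for each test article (column with the highest probability)
pred_prob <- model %>% predict(testx)
pred_class <- max.col(pred_prob)

# Recover the actual author id from the one-hot test labels
true_class <- max.col(testy)

# Proportion of each author's test articles classified correctly
acc_by_author <- tapply(pred_class == true_class, true_class, mean)

# Bar plot of per-author accuracy on the test data
barplot(acc_by_author,
        xlab = "Author",
        ylab = "Proportion classified correctly",
        las = 2, cex.names = 0.6)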

In the next experiment, we will look at more changes we can make in an effort to improve the author classification performance even further.
