Log transformation on the output variable

To overcome the significant underestimation of the target variable at higher values, let's apply a log transformation to the target variable and see whether this helps us improve the model further. Our next model also makes some minor changes to the architecture. In model_two, we did not notice any major evidence of overfitting, so we can increase the number of units a little and also slightly reduce the dropout percentages. The following is the code for this experiment:

# Log transformation and model architecture
trainingtarget <- log(trainingtarget)
testtarget <- log(testtarget)
model <- keras_model_sequential()
model %>%
  layer_dense(units = 100, activation = 'relu', input_shape = c(13)) %>%
  layer_dropout(rate = 0.4) %>%
  layer_dense(units = 50, activation = 'relu') %>%
  layer_dropout(rate = 0.2) %>%
  layer_dense(units = 25, activation = 'relu') %>%
  layer_dropout(rate = 0.1) %>%
  layer_dense(units = 1)
summary(model)

OUTPUT
## ___________________________________________________________________________
## Layer (type)                     Output Shape                  Param #
## ===========================================================================
## dense_8 (Dense)                  (None, 100)                   1400
## ___________________________________________________________________________
## dropout_4 (Dropout)              (None, 100)                   0
## ___________________________________________________________________________
## dense_9 (Dense)                  (None, 50)                    5050
## ___________________________________________________________________________
## dropout_5 (Dropout)              (None, 50)                    0
## ___________________________________________________________________________
## dense_10 (Dense)                 (None, 25)                    1275
## ___________________________________________________________________________
## dropout_6 (Dropout)              (None, 25)                    0
## ___________________________________________________________________________
## dense_11 (Dense)                 (None, 1)                     26
## ===========================================================================
## Total params: 7,751
## Trainable params: 7,751
## Non-trainable params: 0
## ___________________________________________________________________________

We increase the number of units in the third hidden layer from 20 to 25. The dropout rates for the second and third hidden layers are also reduced to 0.2 and 0.1, respectively. Note that the overall number of parameters has now increased to 7,751.
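As a quick sanity check, each dense layer contributes inputs × units weights plus units bias terms, while the dropout layers add no trainable parameters. The following short calculation in base R reproduces the totals reported by summary(model):

# Parameters per dense layer: inputs * units + units (bias terms)
# Dropout layers contribute no trainable parameters
layer_params <- c(13 * 100 + 100,   # dense_8:  1,400
                  100 * 50 + 50,    # dense_9:  5,050
                  50 * 25 + 25,     # dense_10: 1,275
                  25 * 1 + 1)       # dense_11: 26
sum(layer_params)                   # 7,751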

We next compile and then fit the model. The training results are stored in model_three, which we use for plotting the graph, as shown in the following code:

# Compile model
model %>% compile(loss = 'mse',
                  optimizer = optimizer_rmsprop(lr = 0.005),
                  metrics = 'mae')

# Fit model
model_three <- model %>%
  fit(training,
      trainingtarget,
      epochs = 100,
      batch_size = 32,
      validation_split = 0.2)
plot(model_three)

The following shows the output of the loss and mean absolute error for training and validation data (model_three):

Although the values in the preceding plot are not directly comparable to the earlier figures because of the log transformation, we can see that both the loss and the mean absolute error decrease and become stable after about 50 epochs.
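Because the model is now trained on log-transformed targets, its predictions are also on the log scale. The following is a minimal sketch of how the predictions could be brought back to the original scale with exp(); it assumes the scaled test matrix from earlier in the chapter is still available as test, and that testtarget holds the log-transformed test values created in the code above:

# Evaluate on the log-transformed test data
model %>% evaluate(test, testtarget)

# Predictions are on the log scale; exp() returns them to the original units
pred <- model %>% predict(test)
pred_original <- exp(pred)
mean(abs(pred_original - exp(testtarget)))   # MAE on the original scale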
