How it works...

A neural network becomes more useful when it improves its generalization power. A neural network should not simply memorize a decision-making process in favor of a particular label; if it does, our outcomes will be biased and wrong. So, it is good to have a dataset where the labels are uniformly distributed. If they're not uniformly distributed, then we may have to adjust a few things while calculating the error rate. For this purpose, we introduced a weightsArray in step 1 and added it to OutputLayer in step 2.

For weightsArray = {0.35, 0.65}, the network gives more priority to outcomes of 1 (customer unhappy). As we discussed earlier in this chapter, the Exited column represents the label. If we observe the dataset, it is evident that outcomes labeled 0 (customer happy) have more records in the dataset than those labeled 1. Hence, we need to assign additional weight to 1 to balance the influence of the two classes. Unless we do that, our neural network may overfit and will be biased toward the 0 label.
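For reference, the class weighting can be attached to the output layer via a weighted loss function. The following is a minimal sketch: LossMCXENT accepting a weights array is part of the DL4J API, but the surrounding variable names (such as incomingConnectionCount) are illustrative and may differ from the recipe's code:

    import org.deeplearning4j.nn.conf.layers.OutputLayer;
    import org.nd4j.linalg.activations.Activation;
    import org.nd4j.linalg.api.ndarray.INDArray;
    import org.nd4j.linalg.factory.Nd4j;
    import org.nd4j.linalg.lossfunctions.impl.LossMCXENT;

    // Weight the loss so the minority class (Exited = 1) contributes more to the error
    INDArray weightsArray = Nd4j.create(new double[]{0.35, 0.65});

    OutputLayer outputLayer = new OutputLayer.Builder(new LossMCXENT(weightsArray))
            .nIn(incomingConnectionCount) // illustrative: size of the previous layer
            .nOut(2)                      // two classes: 0 (happy) and 1 (unhappy)
            .activation(Activation.SOFTMAX)
            .build();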

In step 3, we added ScoreIterationListener to log the training process to the console. Note that iterationCount is the frequency at which the network score is logged; that is, the score is printed every iterationCount iterations. Remember, iterationCount is not the epoch. An epoch has happened when the entire dataset has passed through the whole neural network once, forward and backward (backpropagation).
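Attaching the listener is a one-liner. Here is a minimal sketch, where model is the network configured in the earlier steps and the value 100 is illustrative:

    // Log the network score to the console every 100 iterations
    model.setListeners(new ScoreIterationListener(100));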

In step 8, we used dataSetIteratorSplitter to obtain the training dataset iterator and trained our model on top of it. If you configured the loggers properly, you should see the training progressing as shown here:

The score referred to in the screenshot is not the success rate; it is the value of the error (loss) function calculated for each iteration.
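The training call in step 8 might look like the following minimal sketch. It assumes dataSetIteratorSplitter and model from the earlier steps, and numEpochs is illustrative:

    // Obtain the training portion of the split and fit the model on it
    DataSetIterator trainIterator = dataSetIteratorSplitter.getTrainIterator();
    model.fit(trainIterator, numEpochs); // fit(iterator, numEpochs) runs multiple epochs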

We configured the DL4J user interface (UI) in steps 4, 5, and 6. DL4J provides a UI to visualize the current network status and training progress in your browser in real time, which helps with further tuning of the neural network. StatsListener is responsible for feeding monitoring data to the UI once training starts. The UI server runs on port 9000 by default. While the training is in progress, hit the UI server at localhost:9000.
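For reference, the UI wiring from steps 4 to 6 is typically along these lines. This is a minimal sketch using DL4J's UIServer, InMemoryStatsStorage, and StatsListener; the class names match the DL4J API, though package locations vary slightly between versions:

    // Start (or get) the UI server; it listens on port 9000 by default
    UIServer uiServer = UIServer.getInstance();
    // Hold training statistics in memory
    StatsStorage statsStorage = new InMemoryStatsStorage();
    // Make the stored stats visible to the UI
    uiServer.attach(statsStorage);
    // Stream training stats from the network into the storage
    model.setListeners(new StatsListener(statsStorage));

Once this is wired up and training has started, we should be able to see something like the following: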

We can refer to the first graph seen in the Overview section for the Model Score analysis. The Iteration is plotted on the x axis, and the Model Score is on the y axis of the graph.

We can also dig further into how the activations, gradients, and parameter updates behaved during the training process by inspecting the parameter values plotted on the graphs:

The x axis refers to the number of iterations in both graphs. The y axis in the parameter update graph refers to the parameter update ratio, and the y axis in the activation/gradient graphs refers to the standard deviation.

It is also possible to perform layer-wise analysis. We just need to click the Model tab on the left sidebar and choose a layer for further analysis:

To analyze memory consumption and the JVM, we can navigate to the System tab on the left sidebar:

We can also review the hardware/software metrics in detail at the same place:

This is very useful for benchmarking as well. As we can see, the neural network's memory consumption is clearly marked, and the JVM/off-heap memory consumption is shown in the UI, which makes it easy to analyze the benchmark results.

After step 8, the evaluation results will be displayed on the console:

In the above screenshot, the console shows the various evaluation metrics by which the model is evaluated. We cannot rely on a single metric in all cases; hence, it is good to evaluate the model against multiple metrics.

Our model is currently showing an accuracy level of 85.75%. We have four different performance metrics: accuracy, precision, recall, and F1 score. As you can see in the preceding screenshot, the recall metric is not so good, which means our model still produces false negative cases. The F1 score is also significant here, since our dataset has an uneven proportion of output classes. We will not discuss these metrics in detail, since they are outside the scope of this book. Just remember that all of these metrics are important to consider, rather than relying on accuracy alone. Of course, the evaluation trade-offs vary depending on the problem. The current code has already been optimized; hence, you will find the accuracy from the evaluation metrics to be almost stable. For a well-trained network model, these performance metrics will have values close to 1.
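For reference, the evaluation from step 8 can be reproduced with a few lines. This is a minimal sketch, assuming model and dataSetIteratorSplitter from the earlier steps; the method names match DL4J's Evaluation API:

    // Evaluate on the test portion of the split
    Evaluation eval = model.evaluate(dataSetIteratorSplitter.getTestIterator());
    System.out.println(eval.stats()); // prints the full report seen on the console

    // The individual metrics can also be read programmatically
    double accuracy  = eval.accuracy();
    double precision = eval.precision();
    double recall    = eval.recall();
    double f1        = eval.f1();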

It is important to check how stable our evaluation metrics are. If we notice unstable evaluation metrics on unseen data, then we need to reconsider changes to the network configuration.

Activation functions on the output layer influence the stability of the outputs. Hence, a good understanding of the output requirements will definitely save you a lot of time when choosing an appropriate output function and its loss function. We need to ensure stable predictive power from our neural network.