How it works...

In step 1, we configured the neural network's structure as a computation graph. Computation graphs are the best choice for recurrent neural networks. We get an evaluation score of approximately 78% with a regular multi-layer network and a whopping 94% with a computation graph, so ComputationGraph clearly outperforms the regular multi-layer perceptron here. ComputationGraph is meant for complex network architectures and can be customized to accommodate different types of layers in various orders. InvocationType.EPOCH_END is used with the score iteration listener in step 1 so that the score is evaluated against the test set at the end of every epoch.

Note that we're running the score evaluation against the test set, not the training set. The proper listeners need to be set by calling setListeners() before training starts so that the scores are logged for every test evaluation, as shown here:

model.setListeners(new ScoreIterationListener(20), new EvaluativeListener(testIterator, 1, InvocationType.EPOCH_END));

In step 4, the model was evaluated by calling evaluate():

Evaluation evaluation = model.evaluate(testIterator);

We passed the test dataset to the evaluate() method in the form of an iterator that was created earlier in the Loading the training data recipe.

We then use the stats() method to display the results. For a computation graph trained for 100 epochs, we get the following evaluation metrics:

Now, here are some experiments you can perform to optimize the results further.

We used 100 epochs in our example. Try reducing or increasing the epoch count, note which direction gives better results, and stop when the results are optimal. We can evaluate the results once per epoch to understand which direction to move in. Check out the following training instance logs:

In the preceding example, the accuracy declines after the previous epoch. Accordingly, you can decide on the optimal number of epochs. If we train for too many epochs, the neural network will simply memorize the training data, which leads to overfitting.
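The epoch-tuning decision above can be sketched as a simple loop: after each epoch, compare the test accuracy against the best seen so far and stop once it declines. The sketch below is plain Java, independent of DL4J; the per-epoch accuracies are hypothetical values standing in for what model.fit(trainIterator) followed by model.evaluate(testIterator) would produce:

```java
import java.util.List;

public class EpochTuning {
    // Returns the 1-based epoch at which test accuracy peaked,
    // stopping as soon as accuracy declines (a sign of overfitting).
    public static int optimalEpoch(List<Double> accuracyPerEpoch) {
        double bestAccuracy = 0.0;
        int bestEpoch = 0;
        for (int epoch = 0; epoch < accuracyPerEpoch.size(); epoch++) {
            double accuracy = accuracyPerEpoch.get(epoch);
            if (accuracy > bestAccuracy) {
                bestAccuracy = accuracy;
                bestEpoch = epoch + 1;
            } else {
                // Accuracy declined: further epochs risk memorizing the data.
                break;
            }
        }
        return bestEpoch;
    }

    public static void main(String[] args) {
        // Hypothetical accuracies; in a real run each entry would come from
        // evaluating the model on the test iterator after one epoch of training.
        List<Double> accuracies = List.of(0.71, 0.80, 0.86, 0.91, 0.94, 0.93, 0.92);
        System.out.println("Optimal number of epochs: " + optimalEpoch(accuracies));
    }
}
```

With these illustrative values, training would stop after epoch 5, where the 94% peak is reached.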

Instead of randomizing the data first, you can ensure that the six categories are uniformly distributed across the training set. For example, with 420 samples for training and 180 for testing, each category would be represented by 70 training samples. We could then perform randomization followed by iterator creation. Note that we had 450 samples for training in our example, so the distribution of labels/categories isn't uniform and we are relying entirely on the randomization of the data.
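The uniform split described above can be sketched in plain Java, independent of DL4J: group the samples by category, take a fixed number from each category for training, and shuffle only afterwards. The six category names, the sample identifiers, and the 70-per-category quota are illustrative assumptions:

```java
import java.util.*;

public class StratifiedSplit {
    // Takes 'trainPerCategory' samples of each category for training;
    // the remaining samples of each category go to the test set.
    public static Map<String, List<String>> split(Map<String, List<String>> byCategory,
                                                  int trainPerCategory, long seed) {
        List<String> train = new ArrayList<>();
        List<String> test = new ArrayList<>();
        for (List<String> samples : byCategory.values()) {
            train.addAll(samples.subList(0, trainPerCategory));
            test.addAll(samples.subList(trainPerCategory, samples.size()));
        }
        // Randomize only after the uniform split, as described in the recipe.
        Collections.shuffle(train, new Random(seed));
        Collections.shuffle(test, new Random(seed));
        Map<String, List<String>> result = new HashMap<>();
        result.put("train", train);
        result.put("test", test);
        return result;
    }

    public static void main(String[] args) {
        // Six hypothetical categories with 100 samples each (600 samples total).
        Map<String, List<String>> byCategory = new LinkedHashMap<>();
        for (int c = 0; c < 6; c++) {
            List<String> samples = new ArrayList<>();
            for (int i = 0; i < 100; i++) samples.add("cat" + c + "-sample" + i);
            byCategory.put("cat" + c, samples);
        }
        Map<String, List<String>> sets = split(byCategory, 70, 42L);
        // 6 categories x 70 = 420 training samples; 6 x 30 = 180 test samples.
        System.out.println("train=" + sets.get("train").size()
                + " test=" + sets.get("test").size());
    }
}
```

Because every category contributes exactly the same number of training samples, the label distribution no longer depends on the shuffle; the shuffle only decides the order in which the iterator sees them.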
