Evaluation of the RNN Model

Let's look at our results. Once the model is trained, we can feed in the test data that we prepared earlier in this chapter and evaluate the predictions. In this case, we will use a few different metrics to evaluate our model: precision, recall, and the F1 score.

When evaluating your model, it is important to choose the right metrics. The F1 score is generally considered more informative than plain accuracy, especially when the classes are imbalanced.

Here are the key points for understanding these metrics in simple terms (a small worked example follows the list):

  • Accuracy: The number of correct predictions divided by the total number of examples evaluated.
  • Precision: High precision means that when the model predicts a positive, it is usually right; low precision means it often predicts a positive where there is none (many false positives).
  • Recall: High recall means the model correctly identified most of the real positives present in the data; low recall means it frequently missed positives that were there.
  • F1 score: The harmonic mean of precision and recall, F1 = 2 × (precision × recall) / (precision + recall), giving both metrics equal weight. The higher the F1 score, the better.
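To make these numbers concrete, here is a small, self-contained illustration (the labels below are invented for demonstration, not taken from our dataset). A lazy classifier that almost always predicts the majority class scores well on accuracy but is exposed by recall and the F1 score:

from sklearn import metrics

# Toy ground truth: 8 negatives (0) and 2 positives (1).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
# A lazy model that predicts the majority class for all but one example.
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

print(metrics.accuracy_score(y_true, y_pred))   # 0.9   -- looks impressive
print(metrics.precision_score(y_true, y_pred))  # 1.0   -- no false positives
print(metrics.recall_score(y_true, y_pred))     # 0.5   -- half the positives missed
print(metrics.f1_score(y_true, y_pred))         # ~0.67 -- exposes the weakness

This is why the F1 score, rather than accuracy, is the headline number in the report we are about to print.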
Back to our model: we run the test set through the trained network and print the classification report. Note that np.argmax converts the per-class logits into predicted label indices before they are compared against the true labels:

import numpy as np
from sklearn import metrics

logits = sess.run(model.logits, feed_dict={model.X: str_idx(test_X, dictionary, maxlen)})
print(metrics.classification_report(test_Y, np.argmax(logits, 1), target_names=trainset.target_names))
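If you are revisiting this section on its own, recall that str_idx is the helper we defined earlier in the chapter to turn raw strings into padded index sequences. A minimal sketch of such a helper is shown below; the exact implementation earlier in the chapter may differ in details such as the padding direction and the index reserved for unknown words:

import numpy as np

def str_idx(corpus, dic, maxlen, unk_idx=3):
    # Map each whitespace-tokenized sentence to a fixed-length row of
    # vocabulary indices, truncating at maxlen. Out-of-vocabulary words
    # get unk_idx (an assumed convention), and shorter sentences remain
    # zero-padded at the end.
    X = np.zeros((len(corpus), maxlen), dtype=np.int32)
    for i, sentence in enumerate(corpus):
        for pos, word in enumerate(sentence.split()[:maxlen]):
            X[i, pos] = dic.get(word, unk_idx)
    return X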

Here we can see that our average F1 score is 66% when using basic RNN cells. Let's see whether this can be improved by using other variants of the RNN architecture.
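As a preview of what that change looks like in TensorFlow 1.x, swapping the recurrent cell is typically a one-line change where the graph is built. The factory function below is a hypothetical sketch, not code from our model class, and assumes the rest of the graph (embedding lookup, dynamic_rnn, output layer) stays the same:

import tensorflow as tf

def make_cell(cell_type, size_layer):
    # Hypothetical cell factory: only the cell construction differs
    # between the variants; everything else in the model is reused.
    if cell_type == 'rnn':
        return tf.nn.rnn_cell.BasicRNNCell(size_layer)
    if cell_type == 'lstm':
        return tf.nn.rnn_cell.LSTMCell(size_layer)
    if cell_type == 'gru':
        return tf.nn.rnn_cell.GRUCell(size_layer)
    raise ValueError('unknown cell type: %s' % cell_type)

Keeping the cell construction in one place like this makes it easy to compare architectures under otherwise identical training conditions.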
