Model training and results analysis

Now, it's time to kick off the training process, which is very easy here. The validation_split=0.05 argument holds out 5% of the 25,000 training samples (1,250 reviews) for validation:

rnn_type_model.fit(input_train_pad, target_train,
                   validation_split=0.05, epochs=3, batch_size=64)

Output:
Train on 23750 samples, validate on 1250 samples
Epoch 1/3
23750/23750 [==============================] - 176s 7ms/step - loss: 0.6698 - acc: 0.5758 - val_loss: 0.5039 - val_acc: 0.7784
Epoch 2/3
23750/23750 [==============================] - 175s 7ms/step - loss: 0.4631 - acc: 0.7834 - val_loss: 0.2571 - val_acc: 0.8960
Epoch 3/3
23750/23750 [==============================] - 174s 7ms/step - loss: 0.3256 - acc: 0.8673 - val_loss: 0.3266 - val_acc: 0.8600
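
Notice that the validation loss rose from 0.2571 to 0.3266 in the third epoch, a hint that the model is starting to overfit. As a minimal sketch, one way to handle this is Keras's EarlyStopping callback (assuming a Keras version that supports restore_best_weights, 2.2.3 or later):

from keras.callbacks import EarlyStopping

# Stop training once the validation loss stops improving, and roll
# back to the weights from the best epoch:
early_stop = EarlyStopping(monitor='val_loss', patience=1,
                           restore_best_weights=True)
rnn_type_model.fit(input_train_pad, target_train,
                   validation_split=0.05, epochs=10, batch_size=64,
                   callbacks=[early_stop])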

Let's test the trained model against the test set:

model_result = rnn_type_model.evaluate(input_test_pad, target_test)

Output:
25000/25000 [==============================] - 60s 2ms/step


print("Accuracy: {0:.2%}".format(model_result[1]))
Output:
Accuracy: 85.26%
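
Note that evaluate() returns the loss followed by the metrics the model was compiled with, so model_result[0] holds the test loss:

print("Loss: {0:.4f}".format(model_result[0]))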

Now, let's look at some examples of misclassified texts.

So first, we calculate the predicted probabilities for the first 1,000 sequences in the test set, and then we take the actual class values. We compare them and get a list of the indices where they mismatch:

target_predicted = rnn_type_model.predict(x=input_test_pad[0:1000])
# predict() returns an array of shape (1000, 1); flatten it to shape (1000,)
target_predicted = target_predicted.T[0]

Next, we apply a cut-off threshold of 0.5: all values above it will be considered positive and the others will be considered negative:

class_predicted = np.array([1.0 if prob>0.5 else 0.0 for prob in target_predicted])
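
Equivalently, this thresholding step can be vectorized with NumPy, which avoids the Python-level loop:

class_predicted = (target_predicted > 0.5).astype(np.float64)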

Now, let's get the actual classes for these 1,000 sequences:

class_actual = np.array(target_test[0:1000])

Let's get the indices of the incorrectly classified samples:

incorrect_samples = np.where(class_predicted != class_actual)
# np.where returns a tuple of arrays; take the first (and only) element
incorrect_samples = incorrect_samples[0]
len(incorrect_samples)

Output:
122
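
We can also compute the error rate directly:

len(incorrect_samples) / float(len(class_actual))

Output:
0.122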

So, 122 of the 1,000 texts we evaluated here were incorrectly classified, an error rate of 12.2%. Let's look at the first misclassified text:

index = incorrect_samples[0]
index
Output:
9

incorrectly_predicted_text = input_text_test[index]
incorrectly_predicted_text
Output:

'I am not a big music video fan. I think music videos take away personal feelings about a particular song.. Any song. In other words, creative thinking goes out the window. Likewise, Personal feelings aside about MJ, toss aside. This was the best music video of alltime. Simply wonderful. It was a movie. Yes folks it was. Brilliant! You had awesome acting, awesome choreography, and awesome singing. This was spectacular. Simply a plot line of a beautiful young lady dating a man, but was he a man or something sinister. Vincent Price did his thing adding to the song and video. MJ was MJ, enough said about that. This song was to video, what Jaguars are for cars. Top of the line, PERFECTO. What was even better about this was, that we got the real MJ without the thousand facelifts. Though ironically enough, there was more than enough makeup and costumes to go around. Folks go to Youtube. Take 14 mins. out of your life and see for yourself what a wonderful work of art this particular video really is.'

Let's have a look at the model output for this sample as well as the actual class:

target_predicted[index]
Output:
0.1529513

class_actual[index]
Output:
1.0

So the model assigned this clearly positive review a score of only about 0.15 and misclassified it as negative.

Now, let's test our trained model against a set of new data samples and see its results:

test_sample_1 = "This movie is fantastic! I really like it because it is so good!"
test_sample_2 = "Good movie!"
test_sample_3 = "Maybe I like this movie."
test_sample_4 = "Meh ..."
test_sample_5 = "If I were a drunk teenager then this movie might be good."
test_sample_6 = "Bad movie!"
test_sample_7 = "Not a good movie!"
test_sample_8 = "This movie really sucks! Can I get my money back please?"
test_samples = [test_sample_1, test_sample_2, test_sample_3, test_sample_4,
                test_sample_5, test_sample_6, test_sample_7, test_sample_8]

Now, let's convert them to integer tokens:

test_samples_tokens = tokenizer_obj.texts_to_sequences(test_samples)
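
As a quick sanity check, we can peek at the first tokenized sample; the actual integer IDs depend on the vocabulary learned by tokenizer_obj:

print(test_samples_tokens[0])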

And then pad them:

test_samples_tokens_pad = pad_sequences(test_samples_tokens, maxlen=max_num_tokens,
                                        padding=seq_pad, truncating=seq_pad)
test_samples_tokens_pad.shape

Output:
(8, 544)

Each of the 8 samples is now padded to a length of 544 tokens, which is the max_num_tokens value computed earlier.

Finally, let's run the model against them:

rnn_type_model.predict(test_samples_tokens_pad)

Output:
array([[0.9496784 ],
       [0.9552593 ],
       [0.9115685 ],
       [0.9464672 ],
       [0.87672734],
       [0.81883633],
       [0.33248223],
       [0.15345531]], dtype=float32)

So, a value close to 0 means a negative sentiment and a value close to 1 means a positive sentiment. The last two samples are correctly scored as negative, but notice that short or ambiguous samples such as "Bad movie!" and "Meh ..." still receive high positive scores. Finally, these numbers will vary every time you train the model.
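
To make the output easier to read, we can pair each sample with its score and a thresholded label, reusing the 0.5 cut-off from before:

predictions = rnn_type_model.predict(test_samples_tokens_pad)
for text, prob in zip(test_samples, predictions.T[0]):
    sentiment = "positive" if prob > 0.5 else "negative"
    print("{0:.4f} ({1}): {2}".format(prob, sentiment, text))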
