Train model

We have our data split into the following:

  • train: The user rating matrix we will use to build our recommendation model.
  • test.known: The known ratings for each test user. We feed this to our predict method along with our model, and the output can then be compared against our test dataset.
  • test: The test dataset, used for evaluating our model (a sketch of how these splits are produced follows this list).
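
These three objects typically come from recommenderlab's evaluationScheme and getData functions. The following is a minimal sketch of how such a split can be produced; the variable data stands in for the ratings matrix from the earlier sections, and the parameter values mirror the ones used later in this section:

> plan <- evaluationScheme(data, method = "split", train = 0.9, given = 10, goodRating = 5)
> train <- getData(plan, "train")            # 90% of users, used to build the model
> test.known <- getData(plan, "known")       # the 10 given ratings per test user
> test.unknown <- getData(plan, "unknown")   # the held-out ratings used for evaluation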

Using these, let us go ahead and build our recommender system. For the first model, we are going to build a random model.

Building the random model looks as follows:

> random.model <- Recommender(train, "RANDOM")
> random.model
Recommender of type 'RANDOM' for 'realRatingMatrix'
learned using 1350 users.
> getModel(random.model)
$range
[1] -5.998287 5.224277

$labels
[1] "j1" "j2" "j3" "j4" "j5" "j6" "j7" "j8" "j9" "j10" "j11" "j12" "j13" "j14"
[15] "j15" "j16" "j17" "j18" "j19" "j20" "j21" "j22" "j23" "j24" "j25" "j26" "j27" "j28"
[29] "j29" "j30" "j31" "j32" "j33" "j34" "j35" "j36" "j37" "j38" "j39" "j40" "j41" "j42"
[43] "j43" "j44" "j45" "j46" "j47" "j48" "j49" "j50" "j51" "j52" "j53" "j54" "j55" "j56"
[57] "j57" "j58" "j59" "j60" "j61" "j62" "j63" "j64" "j65" "j66" "j67" "j68" "j69" "j70"
[71] "j71" "j72" "j73" "j74" "j75" "j76" "j77" "j78" "j79" "j80" "j81" "j82" "j83" "j84"
[85] "j85" "j86" "j87" "j88" "j89" "j90" "j91" "j92" "j93" "j94" "j95" "j96" "j97" "j98"
[99] "j99" "j100"

>

> random.predict <- predict(random.model, test.known, n = 5, type = "topNList")
> random.predict@items[1]
[[1]]
[1] 81 96 47 28 65

> random.predict@ratings[1]
[[1]]
[1] 2.516658 2.139329 1.921027 1.900945 1.845772

Using the Recommender function, we have built a random model. Inspecting the model, we see that it will produce a random rating between -5.99 and 5.22, as defined by the range. The model learns nothing except the range of the ratings in the training set, and it produces the requested ratings at random within that range.

The predict function is used to predict the ratings. We request the top N predictions, that is, the top five jokes for each user, as specified by the n and type parameters. These recommendations are drawn from the jokes outside the 10 known (holdout) jokes supplied for each user in test.known. Finally, we can look into our predictions, random.predict, to find the recommendations and ratings for an individual user. In the preceding code snippet, we look at user 1 and see that jokes 81, 96, 47, 28, and 65 are recommended for this user.
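
Rather than indexing the slots one user at a time, the whole top-N result can be coerced to a plain R list for inspection; this is standard recommenderlab coercion, and the indexing shown is just one way to peek at the first few users:

> random.list <- as(random.predict, "list")   # one vector of recommended item labels per user
> random.list[1:3]                            # recommendations for the first three test users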

It's a best practice to build a reference model before we start building our actual models. When we have our first model, we can compare it with the reference model to measure the performance gain; any good model should beat a random model. Once we have a good model, it can replace the random model as the reference while we search for better models. Random models are typically used as the first reference model because they are so easy to create.

Let us now build a more sensible model: the popular model, which we saw in the previous sections.

Building the popular model looks as follows:

> popular.model <- Recommender(train, "POPULAR")
> popular.model
Recommender of type 'POPULAR' for 'realRatingMatrix'
learned using 1350 users.
>

We build a model of the POPULAR type, trained on the data of 1,350 users.

Let us predict using this model:

> popular.predict <- predict(popular.model, test.known, n = 5, type = "topNList")
> popular.predict@items[1]
$u24654
[1] 50 36 27 32 35

> popular.predict@ratings[1]
$u24654
[1] 3.472511 3.218649 3.157963 3.011908 2.976325

>

Using the predict function with the test.known dataset, we predict the top five recommendations for the users in the test dataset. Remember that, through the given parameter, we had reserved 10 known ratings for each user, stored in test.known. Using those, we now predict other jokes for each user.
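
We can also score these predictions numerically. Assuming the held-out ratings are available as test.unknown (as in the evaluationScheme sketch earlier), a minimal check looks like this:

> popular.ratings <- predict(popular.model, test.known, type = "ratings")
> calcPredictionAccuracy(popular.ratings, test.unknown)   # reports RMSE, MSE, and MAE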

Before we proceed to build a better model, let us introduce the evaluate function.  

Until now, we have explicitly invoked Recommender to create a recommender object and then passed that model to the predict function to get our predictions. This is the correct way of implementing a model. However, when we are evaluating different models, we need a convenient way to run and test multiple models.

With the evaluate function, we don't have to perform those explicit steps.

The evaluate function takes an evaluation scheme, applies it to a given model, and reports the performance of the recommendation model using multiple metrics.

Let us see how the evaluate function is used:

> results <- evaluate(plan, method = "POPULAR", type = "topNList", n = 5 )
POPULAR run fold/sample [model time/prediction time]
1 [0.012sec/0.215sec]
> getConfusionMatrix(results)
[[1]]
TP FP FN TN precision recall TPR FPR
5 1.993333 3.006667 12.29333 72.70667 0.3986667 0.1626843 0.1626843 0.03823596

>

As you can see, we have passed plan, which is our evaluation scheme, and the method. We request five recommendations for each user using the n and type parameters.

The getConfusionMatrix function provides different metrics for evaluating our recommendation engine. Using the test known dataset and the constructed model, predictions are made, and these predictions are compared against the test unknown dataset. Say we have a user A present in both test known and test unknown. The 10 items defined by the given parameter, along with their ratings in test known, are used to find recommendations for this user. The recommended items are then compared against the test unknown dataset for that user.

While comparing the ratings, the goodRating parameter is used to decide which actual ratings count as relevant. Say we have set goodRating to 5: if a recommended item has an actual rating of 7.1 in the test unknown set, the rating clears the threshold and the recommendation is considered a match with the test set. This is how we calculate the metrics detailed below.
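
A quick illustration of that thresholding, in plain R with hypothetical ratings:

> goodRating <- 5
> actual <- c(7.1, 2.3, 5.1, -1.2)   # actual ratings of four recommended items
> actual >= goodRating               # TRUE marks the recommendations that count as matches
[1]  TRUE FALSE  TRUE FALSE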

  • TP stands for True Positive: the number of recommended items that the user actually liked.
  • FP stands for False Positive: the number of recommended items that the user did not actually like.
  • FN stands for False Negative: the number of items we did not recommend, although the user actually liked them.
  • TN stands for True Negative: the number of items we did not recommend and the user did not like either.
  • Precision is the ratio of correct recommendations to the total recommendations made: TP / (TP + FP).
  • Recall is the ratio of correct recommendations to all the items the user actually liked: TP / (TP + FN).
  • TPR stands for True Positive Rate: the ratio of true positives to all actual positives, TP / (TP + FN); this is the same as recall.
  • FPR stands for False Positive Rate: the ratio of false positives to the sum of false positives and true negatives: FP / (FP + TN).
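
To make these formulas concrete, here is a sketch computing the ratio metrics for a single hypothetical user, with counts close to the averages above. Note that recommenderlab averages the per-user metrics, so these numbers will not exactly reproduce the preceding table:

> TP <- 2; FP <- 3; FN <- 12; TN <- 73
> precision <- TP / (TP + FP)
> recall <- TP / (TP + FN)    # identical to TPR
> FPR <- FP / (FP + TN)
> c(precision = precision, recall = recall, FPR = FPR)
 precision     recall        FPR
0.40000000 0.14285714 0.03947368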

Refer to https://en.wikipedia.org/wiki/Confusion_matrix for more information about the confusion matrix.

Until now, we have used the split scheme; now let us use the n-fold cross-validation scheme. Further, we will use only the evaluate function to measure the performance of our recommendation model.

The n-fold cross-validation is as follows:

> plan <- evaluationScheme(data, method="cross", train=0.9, given = 10, goodRating=5)
> results <- evaluate(plan, method = "POPULAR", type = "topNList", n = c(5,10,15) )
POPULAR run fold/sample [model time/prediction time]
1 [0.013sec/0.216sec]
2 [0.011sec/0.219sec]
3 [0.011sec/0.232sec]
4 [0.013sec/0.224sec]
5 [0.013sec/0.217sec]
6 [0.011sec/0.205sec]
7 [0.012sec/0.223sec]
8 [0.016sec/0.226sec]
9 [0.011sec/0.22sec]
10 [0.012sec/0.233sec]
> avg(results)
TP FP FN TN precision recall TPR FPR
5 2.056000 2.944000 14.35467 70.64533 0.4112000 0.1672573 0.1672573 0.03842095
10 3.962667 6.037333 12.44800 67.55200 0.3962667 0.3088684 0.3088684 0.07883053
15 5.644000 9.356000 10.76667 64.23333 0.3762667 0.4214511 0.4214511 0.12230877

Instead of the train/test split scheme, here we perform cross-validation; by default, evaluationScheme creates 10 folds (controlled by its k parameter). Our final result is the average over the 10 folds for three different top-N sizes. As you can see, we passed n a vector with three values, 5, 10, and 15, so the evaluation generates 5, 10, and 15 recommendations per user.
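
recommenderlab can also plot an evaluation results object directly, which is handy for comparing the three list sizes; these are the standard plot methods for the results returned by evaluate:

> plot(results, annotate = TRUE)              # ROC curve: TPR vs. FPR for n = 5, 10, 15
> plot(results, "prec/rec", annotate = TRUE)  # precision-recall curve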
