Production-ready recommender engines

In this chapter so far, we have learned about recommender engines in detail and even developed one from scratch (using matrix factorization). Through all this, it is evident how widespread the applications of such systems are.

E-commerce websites (or, for that matter, any popular technology platform) today have tons of content to offer. Not only that, but the number of users is also huge. In such a scenario, where thousands of users are browsing or buying simultaneously across the globe, providing recommendations to them is a task in itself. To complicate things even further, a good user experience (response times, for example) can create a big difference between two competitors. These are live examples of production systems handling millions of customers day in and day out.

Note

Fun Fact

Amazon.com is one of the biggest names in the e-commerce space with 244 million active customers. Imagine the amount of data being processed to provide recommendations to such a huge customer base browsing through millions of products!

Source: http://www.amazon.com/b?ie=UTF8&node=8445211011

In order to provide such capabilities seamlessly on these platforms, we need highly optimized libraries and hardware. For building a recommender engine that can handle thousands of users simultaneously, R has a robust and reliable framework called recommenderlab.

Recommenderlab is a widely used R extension designed to provide a robust foundation for recommender engines. The focus of this library is efficient handling of data, availability of standard algorithms, and evaluation capabilities. In this section, we will use recommenderlab to handle a considerably larger dataset for recommending items to users. We will also use the evaluation functions from recommenderlab to see how good or bad our recommendation system is. These capabilities will help us build a production-ready recommender system similar (or at least closer) to what many online applications such as Amazon or Netflix use.

The dataset used in this section contains ratings for 100 items as rated by 5,000 users. The data has been anonymized and the product names have been replaced by product IDs. Ratings range from 1 (worst) to 5 (best), with 0 representing unrated items or missing ratings.

To build a production-ready recommender engine using recommenderlab, the following steps are performed:

  1. Extract, transform, and analyze the data.
  2. Prepare a recommendation model and generate recommendations.
  3. Evaluate the recommendation model.

We will look at all these steps in the following subsections.

Extract, transform, and analyze

As in the case of any data-intensive (particularly machine learning) application, the first and foremost step is to get the data, understand/explore it, and then transform it into the format required by the algorithm deemed fit for the application at hand. For our recommender engine using the recommenderlab package, we will first load the data from a CSV file described in the previous section and then explore it using various R functions.

# Load recommenderlab library
library("recommenderlab")

# Read dataset from CSV file
raw_data <- read.csv("product_ratings_data.csv")

# Create rating matrix from data
ratings_matrix <- as(raw_data, "realRatingMatrix")

# View a portion of the transformed data as an image
image(ratings_matrix[1:6, 1:10])

The preceding code loads the recommenderlab package and then uses the standard read.csv utility to read the product_ratings_data.csv file. For the exploratory as well as subsequent steps, we need the data transformed into the user-item ratings matrix format (as described in the Core concepts and definitions section).

The as(<data>, <type>) utility converts the data frame read from the CSV file into the required ratings matrix format.

The CSV file contains data in the format shown in the following figure. Each row contains a user's rating for a specific product. The column headers are self-explanatory.

Figure: Product ratings data
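Since the screenshot may not reproduce well here, a quick inspection of the loaded data frame conveys the same information. This is a minimal sketch, assuming the CSV follows the user/item/rating layout that recommenderlab's realRatingMatrix coercion expects:

# Inspect the structure and first few rows of the raw data
# (assumes user, item, and rating columns, as required for
# coercion to realRatingMatrix)
str(raw_data)
head(raw_data)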

The realRatingMatrix conversion transforms the data into a matrix as shown in the following figure. Users are depicted as rows while the columns represent products. Ratings are shown on a gradient scale where white represents a missing/unrated entry and black denotes the best rating of 5.

Figure: Ratings matrix representation of our data
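We can also verify the dimensions of the converted matrix directly; the following check is a small sketch of what you might run:

# Confirm dimensions of the ratings matrix (5,000 users x 100 items)
dim(ratings_matrix)

# Printing the object reports the total number of stored ratings
ratings_matrix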

Now that we have the data in our environment, let us explore some of its characteristics and see if we can decipher some key patterns.

First of all, we extract a representative sample from our main dataset (refer to the figure Product ratings data) and analyze it for:

  • Average rating score for our user population
  • Spread/distribution of item ratings across the user population
  • Number of items rated per user

The following lines of code help us explore our data set sample and analyze the points mentioned previously:

# Extract a sample of 1,000 users from the ratings matrix
sample_ratings <- sample(ratings_matrix, 1000)

# Get the mean product rating as given by the first user
rowMeans(sample_ratings[1,])

# Get distribution of item ratings
hist(getRatings(sample_ratings), breaks=100,
     xlab = "Product Ratings", main = "Histogram of Product Ratings")

# Get distribution of normalized item ratings
hist(getRatings(normalize(sample_ratings)), breaks=100,
     xlab = "Normalized Product Ratings",
     main = "Histogram of Normalized Product Ratings")

# Number of items rated per user
hist(rowCounts(sample_ratings), breaks=50,
     xlab = "Number of Products",
     main = "Histogram of Product Count Distribution")

We extract a sample of 1,000 users from our dataset for exploration purposes. The mean of the product ratings given by the first user in our sample is 2.055. This tells us that this user either hasn't seen/rated many products or usually rates products pretty low. To get a better idea of how users rate products, we generate a histogram of the item rating distribution. This distribution peaks around the middle, that is, at 3. The histogram is shown next:

Figure: Histogram for ratings distribution

The histogram shows that the ratings are roughly normally distributed around the mean, with low counts for very high or very low ratings.
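To back up the visual impression with numbers, we can summarize the raw ratings vector (a quick sanity check, not part of the original listing):

# Numeric summary of the sampled ratings
summary(getRatings(sample_ratings))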

Finally, we check how many products each user has rated. The following histogram shows this spread:

Figure: Histogram of number of rated products

The preceding histogram shows that many users have rated 70 or more products, and quite a few users have rated all 100 products.
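A numeric summary confirms this spread as well; this is an optional check on top of the histogram:

# Summary of the number of products rated per user in the sample
summary(rowCounts(sample_ratings))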

The exploration step helps us get a feel for the shape of our data. We also get an idea of how users generally rate the products and how many products are being rated.

Model preparation and prediction

We have the data in our R environment, transformed into the ratings matrix format. In this section, we are interested in preparing a recommender engine based on user-based collaborative filtering. We will use the same terminology as described in the previous sections. Recommenderlab provides straightforward utilities to learn and prepare a model for building recommender engines.

We prepare our model based on a sample of just 1,000 users. This way, we can use the model to predict the missing ratings for the rest of the users in our ratings matrix. The following lines of code utilize the first thousand rows for learning the model:

# Create a 'User-Based Collaborative Filtering' model
ubcf_recommender <- Recommender(ratings_matrix[1:1000], "UBCF")

"UBCF" in the preceding code signifies user-based collaborative filtering. Recommenderlab also provides other algorithms, such as IBCF (item-based collaborative filtering), PCA (principal component analysis), and others.

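To see exactly which algorithms your installed version of recommenderlab registers for real-valued rating data, you can query the recommender registry:

# List all recommendation algorithms registered for realRatingMatrix data
recommenderRegistry$get_entries(dataType = "realRatingMatrix")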
After preparing the model, we use it to predict ratings for the 1,010th and 1,011th users in the system. Recommenderlab also requires us to specify the number of items to be recommended to the users (in order of preference, of course). For the current case, we specify 5 as the number of items to be recommended.

# Predict the list of products which can be recommended to the given users
recommendations <- predict(ubcf_recommender,
                           ratings_matrix[1010:1011], n=5)

# Show the recommendations in the form of a list
as(recommendations, "list")

The preceding lines of code generate two lists, one for each of the users. Each element in these lists is a product recommendation. The model predicted that, for user 1,010, prod_93 should be recommended as the top product, followed by prod_79, and so on.

# output generated by the model
[[1]]
[1] "prod_93" "prod_79" "prod_80" "prod_83" "prod_89"

[[2]]
[1] "prod_80" "prod_85" "prod_87" "prod_75" "prod_79"
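Instead of a top-N list, predict() can also return the predicted ratings themselves by passing type="ratings"; the following sketch shows this variant for the same two users:

# Predict ratings (rather than a top-N list) for the same users
predicted_ratings <- predict(ubcf_recommender,
                             ratings_matrix[1010:1011], type="ratings")

# Inspect the predicted ratings as a regular matrix
as(predicted_ratings, "matrix")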

Recommenderlab is a robust platform optimized to handle large datasets. With a few lines of code, we were able to load the data, learn a model, and even recommend products to users in virtually no time. Compare this with the basic recommender engine we developed using matrix factorization, which involved many more lines of code, apart from the obvious difference in performance.

Model evaluation

We have successfully prepared a model and used it for predicting and recommending products to the users in our system. But what do we know about the accuracy of our model? To evaluate the prepared model, recommenderlab has handy and easy-to-use utilities. Since we need to evaluate our model, we need to split the data into training and test datasets. Also, recommenderlab requires us to specify the number of items given to the recommender per test user (it uses the rest for computing the error).

For the current case, we will use 500 users to prepare an evaluation scheme. The scheme is based on a 90-10 train-test split, with 15 items given per test user.

# Evaluation scheme
eval_scheme <- evaluationScheme(ratings_matrix[1:500],
                                method="split", train=0.9, given=15)

# View the evaluation scheme
eval_scheme

# Train the model
training_recommender <- Recommender(getData(eval_scheme, "train"), "UBCF")

# Predictions on the test dataset
test_rating <- predict(training_recommender,
                       getData(eval_scheme, "known"), type="ratings")

# Error
error <- calcPredictionAccuracy(test_rating,
                                getData(eval_scheme, "unknown"))

error

We use the evaluation scheme to train our model based on the UBCF algorithm. The model prepared from the training dataset is used to predict ratings from the known portion of each test user's ratings. We finally use the calcPredictionAccuracy method to calculate the error between the predicted ratings and the held-back (unknown) ratings of the test set. For our case, we get an output as follows:

Output: RMSE, MSE, and MAE values for the UBCF model

The generated output shows the values for RMSE (root mean squared error), MSE (mean squared error), and MAE (mean absolute error). For RMSE in particular, the predicted ratings deviate from the actual ratings by 1.162 on average (note that the values might differ slightly across runs due to various factors such as sampling, iterations, and so on). This evaluation makes more sense when the outcomes are compared across different CF algorithms.
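For intuition, these metrics are simple to compute by hand. The following toy vectors (made-up values, purely illustrative) show the arithmetic that calcPredictionAccuracy performs:

# Toy example (made-up numbers) illustrating RMSE, MSE, and MAE
predicted <- c(3.2, 4.1, 2.5, 4.8)
actual    <- c(3,   5,   2,   4)

mse  <- mean((predicted - actual)^2)   # mean squared error
rmse <- sqrt(mse)                      # root mean squared error
mae  <- mean(abs(predicted - actual))  # mean absolute error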

To evaluate UBCF, we use IBCF for comparison. The following few lines of code prepare an IBCF-based model and predict the test ratings, which can then be compared using the calcPredictionAccuracy utility:

# Train a model using IBCF
training_recommender_2 <- Recommender(getData(eval_scheme, "train"), "IBCF")

# Predictions on the test dataset
test_rating_2 <- predict(training_recommender_2,
                         getData(eval_scheme, "known"),
                         type="ratings")

# Compare the error metrics of both models
error_compare <- rbind(calcPredictionAccuracy(test_rating,
                                              getData(eval_scheme, "unknown")),
                       calcPredictionAccuracy(test_rating_2,
                                              getData(eval_scheme, "unknown")))

rownames(error_compare) <- c("User Based CF", "Item Based CF")

# View the comparison
error_compare

The comparative output shows that UBCF outperforms IBCF, with lower values for RMSE, MSE, and MAE.

Output: error metrics for UBCF and IBCF compared

Similarly, we can use the other algorithms available in recommenderlab to test and evaluate our models. We encourage the reader to try out a few more and see which algorithm yields the lowest error in predicted ratings.
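As a starting point, the following sketch evaluates two more of the bundled algorithms against the same scheme: POPULAR (recommends globally popular items) and RANDOM (a useful baseline). The algorithm names assume they are registered in your version of recommenderlab:

# Evaluate two more bundled algorithms against the same scheme
for (algo in c("POPULAR", "RANDOM")) {
  rec <- Recommender(getData(eval_scheme, "train"), algo)
  pred <- predict(rec, getData(eval_scheme, "known"), type="ratings")
  print(algo)
  print(calcPredictionAccuracy(pred, getData(eval_scheme, "unknown")))
}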
