Spark MLlib

MLlib ALS algorithm takes the training data of the RDD type, that is, Distributed Datasets [Rating] and trains a model, which is a MatrixFactorizationModel object.

RDD is a special data type supported by Spark. The RDD format is immutable, and they run on clusters and can operate in Parallel. One can perform on an RDD class.

The technique we are using here is known as . Let's assume User A likes Product A, Product B, and Product C and rated them with a score. Then, let's assume User B likes Product B, Product C, and Product D and gave a similar rating to the score User A gave for Product B and Product C. Now, using Collaborative Filtering, one can find out what User A would rate for Product D or what User B would rate for Product A as we have some commonality between User A and User B--they both rated Product B and Product C similarly.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset