Step 1 - Import packages, load, parse, and explore the movie and rating dataset

In this step, we load and parse the movie and rating datasets and then do some exploratory analysis. Before that, let's import the necessary packages and libraries:

package com.packt.ScalaML.MovieRecommendation 
import org.apache.spark.sql.SparkSession 
import org.apache.spark.mllib.recommendation.ALS 
import org.apache.spark.mllib.recommendation.MatrixFactorizationModel 
import org.apache.spark.mllib.recommendation.Rating 
import scala.Tuple2 
import org.apache.spark.rdd.RDD
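The snippets in this step read data through a value named spark, which this excerpt does not define. A minimal sketch of creating it is shown here, assuming local mode and a hypothetical application name; neither setting comes from the original text, and both would change on a real cluster:

```scala
import org.apache.spark.sql.SparkSession

// Assumption: run locally on all available cores.
// The application name is a hypothetical placeholder.
val spark: SparkSession = SparkSession
  .builder()
  .appName("MovieRecommendation")
  .master("local[*]")
  .getOrCreate()
```

In spark-shell a SparkSession named spark already exists, so this is only needed in a standalone application, where it would sit inside an object's main method rather than at the top level of the file.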

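The following code segment loads the ratings file into a DataFrame and displays it. It assumes an existing SparkSession named spark:

// Path to the ratings file, relative to the working directory
val ratingsFile = "data/ratings.csv"
// Read the CSV, treating the first line as column names
val df1 = spark.read.format("com.databricks.spark.csv").option("header", "true").load(ratingsFile)
val ratingsDF = df1.select(df1.col("userId"), df1.col("movieId"), df1.col("rating"), df1.col("timestamp"))
// Print the first rows without truncating column values
ratingsDF.show(false)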
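Because the reader above is given neither a schema nor the inferSchema option, Spark's CSV source returns every column as a string. The sketch below shows one way to check the schema and cast the columns to numeric types. The two rows are made-up stand-ins for the CSV contents, and the names rawRatings and typedRatings are hypothetical; the target types are an assumption about what later steps need:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Assumption: local mode, hypothetical application name.
val spark = SparkSession.builder()
  .appName("SchemaCheck")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Made-up rows standing in for the loaded CSV: all columns are strings,
// as they would be after a CSV read without inferSchema.
val rawRatings = Seq(
  ("1", "10", "2.5", "1000000000"),
  ("2", "20", "4.0", "1000000500")
).toDF("userId", "movieId", "rating", "timestamp")

// Cast each column to the numeric type assumed for later processing.
val typedRatings = rawRatings.select(
  col("userId").cast("int"),
  col("movieId").cast("int"),
  col("rating").cast("double"),
  col("timestamp").cast("long")
)

// Print the column names and types to confirm the casts.
typedRatings.printSchema()
```

An alternative is to pass .option("inferSchema", "true") to the reader, which lets Spark guess the types at the cost of an extra pass over the file.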

The following code segment loads the movies file into a DataFrame in the same way:

val moviesFile = "data/movies.csv"
val df2 = spark.read.format("com.databricks.spark.csv").option("header", "true").load(moviesFile)
val moviesDF = df2.select(df2.col("movieId"), df2.col("title"), df2.col("genres"))
