We are going to use the moviedata folder in S3:
s3a://sparkcookbook/moviedata
We are going to add some personalized ratings to this database so that we can test the accuracy of the recommendations.
Look through movie.ratings to pick some movies and give each one your own rating. The following are the movies I chose, alongside my ratings:
| Movie ID | Movie name | Rating (1-5) |
| 1721 | Titanic | 5 |
| 10 | GoldenEye | 3 |
| 1 | Toy Story | 1 |
| 225 | Disclosure | 4 |
| 344 | Ace Ventura: Pet Detective | 4 |
| 480 | Jurassic Park | 5 |
| 589 | Terminator 2 | 5 |
| 780 | Independence Day | 4 |
| 1049 | The Ghost and the Darkness | 4 |
The highest existing user ID in the dataset is 138493, so we will give the new user the ID 138494 and associate these ratings with it further down in the recipe.
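As a minimal sketch of how these ratings might be tied to the new user ID, the following plain Python snippet builds one (user ID, movie ID, rating) tuple per rated movie. The names NEW_USER_ID, my_ratings, and rating_rows are illustrative and not part of the recipe, and the column order is an assumption; how the recipe later loads these rows into Spark is not shown here.

```python
# Illustrative names only; none of these identifiers come from the recipe.
NEW_USER_ID = 138494  # one above the highest existing user ID, 138493

# (movie ID, rating) pairs copied from the table above
my_ratings = [
    (1721, 5),  # Titanic
    (10, 3),    # GoldenEye
    (1, 1),     # Toy Story
    (225, 4),   # Disclosure
    (344, 4),   # Ace Ventura: Pet Detective
    (480, 5),   # Jurassic Park
    (589, 5),   # Terminator 2
    (780, 4),   # Independence Day
    (1049, 4),  # The Ghost and the Darkness
]

# Attach the new user ID to every rating.
# Assumed column order: (user ID, movie ID, rating)
rating_rows = [
    (NEW_USER_ID, movie_id, float(rating))
    for movie_id, rating in my_ratings
]

for row in rating_rows:
    print(row)
```

Keeping the user ID in a single constant means that if you rate a different set of movies, only the my_ratings list needs to change.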