Chapter 8. Classifications, Recommendations, and Finding Relationships

In this chapter, we will cover:

  • Content-based recommendations
  • Hierarchical clustering
  • Clustering an Amazon sales dataset
  • Collaborative filtering-based recommendations
  • Classification using Naive Bayes Classifier
  • Assigning advertisements to keywords using the Adwords balance algorithm

Introduction

This chapter discusses how we can use Hadoop for more complex use cases such as classifying a dataset, making recommendations, or finding relationships between items.

A few instances of such scenarios are as follows:

  • Recommending products to users based either on the similarities between the products (for example, a user who liked a book about history might like another book on the same subject) or on user behaviour patterns (for example, if two users are similar, each might like books that the other has read).
  • Clustering a dataset to identify similar entities, for example, identifying users with similar interests.
  • Classifying data into several groups by learning from historical data.
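
To make the "similar users" idea concrete, here is a minimal in-memory sketch in plain Java. It is not a MapReduce job and not the implementation used in the recipes. It assumes each user is represented by the set of items they have read, measures similarity with the Jaccard coefficient (one common choice among many), and suggests the items a sufficiently similar user has read that the target user has not. The class name, method names, item identifiers, and the 0.4 threshold are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: a toy, single-machine version of behaviour-based recommendation.
class UserSimilaritySketch {

    // Jaccard similarity: size of the intersection divided by size of the union.
    // Returns 0.0 when both sets are empty to avoid dividing by zero.
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }

    // Items the other user has read that the target user has not,
    // returned only when the two users are at least `threshold` similar.
    static Set<String> recommend(Set<String> target, Set<String> other, double threshold) {
        Set<String> result = new TreeSet<>();
        if (jaccard(target, other) >= threshold) {
            result.addAll(other);
            result.removeAll(target);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical reading histories.
        Set<String> alice = new HashSet<>(Arrays.asList("history-1", "history-2", "physics-1"));
        Set<String> bob = new HashSet<>(Arrays.asList("history-1", "history-2", "history-3"));

        System.out.println("similarity = " + jaccard(alice, bob));
        System.out.println("suggest to alice = " + recommend(alice, bob, 0.4));
    }
}
```

In this toy data the two users share two of four distinct books, so the similarity is 0.5 and the only suggestion for the first user is `history-3`. Comparing every pair of users this way does not scale to a large dataset, which is the kind of workload a distributed approach is meant to handle.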

In this chapter, we will apply these and other techniques using MapReduce. For the recipes in this chapter, we will use the Amazon product co-purchasing network metadata dataset available from http://snap.stanford.edu/data/amazon-meta.html.
