Collaborative filtering recommendation system

Recommendation systems are used everywhere in the Internet today. From e-commerce sites to restaurants, hotels, tickets, events, and so on. are recommended to us everywhere. Have we ever asked ourselves how do they know what will be the best for us? How do they come up with this calculation of showing the items we might like? The answer is most sites use collaborative filtering (CF) to recommend something. Collaborative filtering is a process of making automatic prediction (filtering) about the interests of a user by analyzing other user's choices or preferences (collaborative). We will build a simple recommendation system using the Pearson correlation method where a similarity score between two people is calculated on the range of -1 to +1. If the similarity score is +1, then it means two people are a perfect match. If the similarity score is 0, then it means no there is no similarity between them, and if the score is -1, then they are negatively similar. Usually, the scores are mostly fractional.

The Pearson correlation is calculated using the following formula:

Here, x denotes preferences from person one, y represents preferences from person two, and N represents the number of items in the preferences, which are common between x and y. Let's now implement a sample review system for restaurants in Dhaka. There are reviewers who have reviewed some restaurants. Some of them are common, some are not. Our job will be to find a recommendation for person X based on the reviews of others. Our reviews look like this:

$reviews = []; 
$reviews['Adiyan'] = ["McDonalds" => 5, "KFC" => 5, "Pizza Hut" => 4.5, "Burger King" => 4.7, "American Burger" => 3.5, "Pizza Roma" => 2.5];
$reviews['Mikhael'] = ["McDonalds" => 3, "KFC" => 4, "Pizza Hut" => 3.5, "Burger King" => 4, "American Burger" => 4, "Jafran" => 4];
$reviews['Zayeed'] = ["McDonalds" => 5, "KFC" => 4, "Pizza Hut" => 2.5, "Burger King" => 4.5, "American Burger" => 3.5, "Sbarro" => 2];
$reviews['Arush'] = ["KFC" => 4.5, "Pizza Hut" => 3, "Burger King" => 4, "American Burger" => 3, "Jafran" => 2.5, "FFC" => 3.5];
$reviews['Tajwar'] = ["Burger King" => 3, "American Burger" => 2, "KFC" => 2.5, "Pizza Hut" => 3, "Pizza Roma" => 2.5, "FFC" => 3];
$reviews['Aayan'] = [ "KFC" => 5, "Pizza Hut" => 4, "Pizza Roma" => 4.5, "FFC" => 4];

Now, based on the structure, we can write our Pearson correlation calculation between two reviewers. Here is the implementation:

function pearsonScore(array $reviews, string $person1, string $person2): float { 

$commonItems = array();

foreach ($reviews[$person1] as $restaurant1 => $rating) {
foreach ($reviews[$person2] as $restaurant2 => $rating) {
if ($restaurant1 == $restaurant2) {
$commonItems[$restaurant1] = 1;
}
}
}

$n = count($commonItems);

if ($n == 0)
return 0.0;

$sum1 = 0;
$sum2 = 0;
$sqrSum1 = 0;
$sqrSum2 = 0;
$pSum = 0;
foreach ($commonItems as $restaurant => $common) {
$sum1 += $reviews[$person1][$restaurant];
$sum2 += $reviews[$person2][$restaurant];
$sqrSum1 += $reviews[$person1][$restaurant] ** 2;
$sqrSum2 += $reviews[$person2][$restaurant] ** 2;
$pSum += $reviews[$person1][$restaurant] *
$reviews[$person2][$restaurant];
}

$num = $pSum - (($sum1 * $sum2) / $n);
$den = sqrt(($sqrSum1 - (($sum1 ** 2) / $n))
* ($sqrSum2 - (($sum2 ** 2) / $n)));

if ($den == 0) {
$pearsonCorrelation = 0;
} else {
$pearsonCorrelation = $num / $den;
}

return (float) $pearsonCorrelation;
}

Here, we have just implemented the equation we have shown for the Pearson correlation calculator. Now, we will write the recommendation function based on Pearson scoring:

function getRecommendations(array $reviews, string $person): array { 
$calculation = [];
foreach ($reviews as $reviewer => $restaurants) {
$similarityScore = pearsonScore($reviews, $person, $reviewer);
if ($person == $reviewer || $similarityScore <= 0) {
continue;
}

foreach ($restaurants as $restaurant => $rating) {
if (!array_key_exists($restaurant, $reviews[$person])) {
if (!array_key_exists($restaurant, $calculation)) {
$calculation[$restaurant] = [];
$calculation[$restaurant]['Total'] = 0;
$calculation[$restaurant]['SimilarityTotal'] = 0;
}

$calculation[$restaurant]['Total'] += $similarityScore *
$rating;
$calculation[$restaurant]['SimilarityTotal'] +=
$similarityScore;
}
}
}

$recommendations = [];
foreach ($calculation as $restaurant => $values) {
$recommendations[$restaurant] = $calculation[$restaurant]['Total']
/ $calculation[$restaurant]['SimilarityTotal'];
}

arsort($recommendations);
return $recommendations;
}

In the preceding function, we calculated the similarity score between each reviewer and weighted their reviews with each other. Based on the top score, we showed the recommendation for the reviewer. Let's run the following code to get some recommendations:

$person = 'Arush'; 
echo 'Restaurant recommendations for ' . $person . " ";
$recommendations = getRecommendations($reviews, $person);
foreach ($recommendations as $restaturant => $score) {
echo $restaturant . " ";
}

This will produce the following output:

Restaurant recommendations for Arush
McDonalds
Pizza Roma
Sbarro

We can use the Pearson correlation scoring system to recommend items or show users who to follow to get better reviews. There are many other ways to get the collaborative filtering to work, but that is beyond the scope of this book.

 

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset