Chapter 8. Use Case – Similarity-based Recommendation System

In the previous chapters, we studied how to work with different aspects of the graph database Neo4j, from installation, querying, and traversals to performance optimization at the production level. We also had a peek under the hood of Neo4j in order to understand its functionality. Neo4j has a wide range of practical applications. Typically, in any scenario that involves connected data that can be represented as a graph, Neo4j is well suited to the storage and processing needs. With the rapid increase in connected devices and sensor-driven technology, graph-based analytics solutions are becoming more popular in the business world, especially because they are simpler to interpret and visualize. Graph databases such as Neo4j find extensive use in route generation, fraud analysis, and impact analysis in networks. However, one of the most popular recent use cases of graph technologies is in the realm of recommendations. Booming sectors such as social networks, job portfolio websites, and e-commerce solutions all operate with a sound recommendation engine at the backend. In this chapter, let us understand how we can use Neo4j to address the issues in recommendation engines. The following topics will be discussed:

  • Recommendation engine basics
  • Building a recommendation engine
  • Addressing map recommendation issues and visualization

The why and how of recommendations

In today's consumer-specific markets, businesses need to stay a step ahead of the customer in order to flourish. Recommendation systems analyze the data that customers generate to find patterns and behavior, which can then be used to suggest products, people, or points of sale. Over the years, several techniques have been developed to generate recommendations. Let us take a look at the major approaches:

Collaborative filtering

It is one of the most common techniques on which today's recommendation engines are based. Collaborative filtering refers to filtering patterns or information through collaboration between various data sources, viewpoints, agents, and so on. It uses the historical data of a user, or of other users, to build a profile of preferences, and then uses that profile to predict what other content the user might like.

Let us take an example to understand this. On e-commerce websites, you are presented with suggestions for products that you might buy, based on your search history and that of other users. Essentially, based on the data that you have in common with other users, the website can suggest products that those users have browsed but you have not yet seen. A similar scenario applies to social networks and dating websites. As an end user, when you have followed, befriended, or expressed an interest in some people you would like to date, a system using collaborative filtering can suggest people who match your taste. The priority of the suggested results will depend upon what those people have in common with you, and what their tastes are.

In this way, the activities of other users are analyzed by the recommender system, saving you the time you would otherwise spend browsing through profiles of irrelevant people. This is an extremely powerful system, as it lets you benefit from the activities of people that you do not know or have never met. Thus, collaborative filtering gives you a way to obtain concrete insights from applications that generate large sets of data.
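To make the idea concrete, here is a minimal sketch of user-based collaborative filtering in plain Python. The browsing data, the function names, and the choice of Jaccard similarity are all assumptions made for illustration; a real engine could use a different similarity measure and would read the data from a store such as Neo4j.

```python
# Hypothetical browsing history: user -> set of products viewed.
history = {
    "alice": {"laptop", "mouse", "keyboard"},
    "bob": {"laptop", "mouse", "monitor"},
    "carol": {"novel", "bookmark"},
}


def jaccard(a, b):
    """Fraction of items two users share (0.0 means nothing in common)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, history, top_n=3):
    """Suggest items that similar users browsed but `user` has not."""
    seen = history[user]
    scores = {}
    for other, items in history.items():
        if other == user:
            continue
        similarity = jaccard(seen, items)
        if similarity == 0.0:
            continue  # no overlap, so this user tells us nothing
        for item in items - seen:
            # Items browsed by more similar users accumulate higher scores.
            scores[item] = scores.get(item, 0.0) + similarity
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:top_n]


print(recommend("alice", history))  # ['monitor']
```

Here alice and bob share two of four distinct products, so bob's remaining product is suggested to alice, while carol, who has nothing in common with either, contributes nothing.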

Content-based filtering

Content-based filtering uses item descriptions along with a profile of the user's preferences. In such a system, each item is described by and associated with certain keywords, and a profile is built for each user recording what that user likes. The recommender system then suggests the items that best match the user's own historical activity.

Content-based filtering also uses weight values to signify how important a feature is for a particular user. These weights are calculated from the content rated by the user. User feedback, usually in the form of likes, votes, or ratings, decides how important an attribute is for that user. One issue that content-based recommendation faces is recommending content of one type based on patterns obtained from another type. For example, it is an easy task to recommend news based on the news browsing patterns of the user. However, it is a challenge to recommend products, forums, videos, or music based on those same news browsing patterns. Pandora Radio is an excellent example of a content-based recommender, which uses the initial song chosen by the user to find songs with similar characteristics.
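The keyword-and-weight idea can be sketched in a few lines of Python. The catalogue, the ratings, and the scoring rule (summing the ratings of every item that carries a keyword) are assumptions chosen for illustration, not a prescribed formula.

```python
# Hypothetical catalogue: item -> keywords that describe it.
catalogue = {
    "song_a": {"rock", "guitar", "upbeat"},
    "song_b": {"rock", "piano"},
    "song_c": {"jazz", "piano", "mellow"},
}


def build_profile(ratings, catalogue):
    """Derive a weight per keyword from the items the user has rated."""
    profile = {}
    for item, rating in ratings.items():
        for keyword in catalogue[item]:
            profile[keyword] = profile.get(keyword, 0) + rating
    return profile


def score(keywords, profile):
    """Sum the user's weights for the keywords attached to one item."""
    return sum(profile.get(keyword, 0) for keyword in keywords)


def recommend_by_content(ratings, catalogue):
    """Rank the unrated items by how well they match the user's profile."""
    profile = build_profile(ratings, catalogue)
    candidates = [item for item in catalogue if item not in ratings]
    return sorted(candidates, key=lambda i: (-score(catalogue[i], profile), i))


# The user rated one rock song highly, so the other rock song ranks first.
print(recommend_by_content({"song_a": 5}, catalogue))  # ['song_b', 'song_c']
```

Note that only the user's own ratings are consulted; no other user's activity is involved, which is what distinguishes this approach from collaborative filtering.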

The hybrid approach

We have seen the collaborative and content-based filtering methods, but research suggests that a hybrid of the two can be more effective. One way of generating hybrid results is to first obtain the results from the two methods separately and then combine them. You can also create a single model that incorporates both techniques. Studies suggest that hybrid recommender systems empirically perform better than the pure systems and give more accurate recommendations.

Netflix, the media content delivery website, is an excellent example of a system that uses the hybrid approach. It compares the searching and viewing patterns of similar users, and also suggests movies that share characteristics with content the user has rated highly.

Let us take a look at some of the different techniques in which a hybrid system can be generated and used:

  • Switching: This technique selects one or more of the recommendation components and applies them
  • Weighted: In this technique, the numerical scores of the recommendation components are combined
  • Mixed: Here, multiple recommendation systems operate together and their results are presented together
  • Augmentation of features: One method is used first to generate a feature set, which is then passed on to the next method for providing recommendations
  • Combination of features: A single recommendation system uses features from multiple different sources to generate results
  • Cascading: It is a priority-based technique in which different recommendation systems have different priorities, and the lower-priority systems are used to settle ties in the results of the higher-priority ones
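Three of the techniques listed above (weighted, switching, and cascading) can be sketched as small Python functions. The score dictionaries, the equal weights, and the `min_ratings` threshold are hypothetical values chosen for illustration; how the component scores are produced is left out.

```python
def weighted_hybrid(score_sets, weights):
    """Weighted: combine the numerical scores of several recommenders."""
    combined = {}
    for scores, weight in zip(score_sets, weights):
        for item, value in scores.items():
            combined[item] = combined.get(item, 0.0) + weight * value
    return sorted(combined, key=lambda i: (-combined[i], i))


def switching_hybrid(collaborative, content, rated_count, min_ratings=5):
    """Switching: pick one component based on a criterion.

    Assumption: collaborative results are trusted only once the user
    has rated at least `min_ratings` items.
    """
    return collaborative if rated_count >= min_ratings else content


def cascading_hybrid(primary, secondary):
    """Cascading: rank by primary scores; secondary only settles ties."""
    return sorted(
        primary,
        key=lambda i: (-primary[i], -secondary.get(i, 0.0), i),
    )


# Hypothetical scores from two recommenders for the same user.
collaborative = {"a": 0.9, "b": 0.2}
content = {"b": 0.8, "c": 0.5}

print(weighted_hybrid([collaborative, content], [0.5, 0.5]))  # ['b', 'a', 'c']
print(cascading_hybrid({"a": 1.0, "b": 1.0, "c": 0.5}, {"a": 0.1, "b": 0.9}))
# ['b', 'a', 'c']
```

In the weighted example, item b wins because both recommenders give it some support, even though neither ranks it first with the other's top item absent; in the cascading example, a and b tie on the primary score, so the secondary score decides their order.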