Chapter 2. Introducing recommenders
Listing 2.1. Recommender input file, intro.csv
Listing 2.2. A simple user-based recommender program with Mahout
Listing 2.3. Configuring and running an evaluation of a recommender
Listing 2.4. Configuring and running a precision and recall evaluation
Listing 2.5. User 5’s preferences in the test data set
Listing 2.6. Changing the evaluation program to run a SlopeOneRecommender
Chapter 3. Representing recommender data
Listing 3.1. Setting preference values in a PreferenceArray
Listing 3.2. Defining input data programmatically with GenericDataModel
Listing 3.3. Triggering a refresh of a recommender system
Listing 3.4. Configuring a JNDI DataSource in Tomcat
Listing 3.5. Configuring a DataSource programmatically
Listing 3.6. Creating and evaluating with Boolean data
Listing 3.7. Evaluating precision and recall with Boolean data
Chapter 4. Making recommendations
Listing 4.1. Revisiting a simple user-based recommender system
Listing 4.2. Updating listing 4.1 to use a custom DataModel for GroupLens
Listing 4.3. Running an evaluation on the simple recommender
Listing 4.4. A simple recommender input file
Listing 4.5. Employing caching with a UserSimilarity implementation
Listing 4.6. The core of a basic item-based recommender
Listing 4.7. Selecting no weighting with a SlopeOneRecommender
Listing 4.8. Creating a JDBC-backed DiffStorage
Chapter 5. Taking recommenders to production
Listing 5.1. Limiting memory consumed by MemoryDiffStorage
Listing 5.2. A gender-based item similarity metric
Listing 5.3. Example IDRescorer that omits out-of-stock books and boosts a genre
Listing 5.4. Gender-based rescoring implementation
Listing 5.5. Complete recommender implementation for Líbímseti
Chapter 6. Distributing recommendation computations
Listing 6.1. A mapper that parses Wikipedia link files
Listing 6.2. Reducer which produces Vectors from a user’s item preferences
Listing 6.3. Mapper component of co-occurrence computation
Listing 6.4. Reducer component of co-occurrence computation
Listing 6.5. Wrapping co-occurrence columns
Listing 6.6. Splitting user vectors
Listing 6.7. Computing partial recommendation vectors
Chapter 7. Introduction to clustering
Chapter 8. Representing data
Chapter 9. Clustering algorithms in Mahout
Listing 9.1. In-memory clustering example using the k-means algorithm
Listing 9.2. The k-means clustering job entry point
Listing 9.3. In-memory example of the canopy generation algorithm
Listing 9.4. News clustering using canopy generation and k-means clustering
Listing 9.5. A custom Lucene analyzer that filters non-alphabetic tokens
Listing 9.6. In-memory example of fuzzy k-means clustering
Chapter 10. Evaluating and improving clustering quality
Listing 10.1. Top 10 terms using k-means clustering with a Euclidean distance measure
Listing 10.2. Top 10 terms using k-means clustering and a cosine distance measure
Listing 10.3. Calculating inter-cluster distances
Listing 10.4. A custom Lucene Analyzer that wraps around StandardTokenizer
Listing 10.5. MyAnalyzer with lowercase filter
Listing 10.6. A custom Lucene Analyzer using multiple filters
Listing 10.7. Modifying NewsKMeansClustering.java to use MyAnalyzer
Chapter 12. Real-world applications of clustering
Listing 12.1. Group-by-field mapper
Listing 12.2. Group-by-field reducer
Listing 12.3. Lucene Analyzer class optimized for Tweets
Listing 12.4. Output artists from the data set
Listing 12.5. Grouping by artist and outputting a unique key-value pair for an artist
Listing 12.6. Using the unique artist names to create a unique integer ID for each artist
Listing 12.7. Using integer ID mapping of the artist to convert tags into a vector
Listing 12.8. Grouping partial vectors of tags and summing them up into full vectors
Chapter 14. Training a classifier
Chapter 15. Evaluating and tuning a classifier
Listing 15.1. Passing data for AUC metric classes
Listing 15.2. Building a confusion matrix
Listing 15.3. A complete program to train a model with progress diagnostics
Chapter 16. Deploying a classifier
Listing 16.1. Code to parse and encode CSV data
Listing 16.2. Code for byte-level CSV parsing
Listing 16.3. Direct value encoding
Listing 16.4. Main program for the classification server
Listing 16.5. The watcher object loads the model and sets model status
Chapter 17. Case study: Shop It To Me
Listing 17.1. Sample encoder showing one implementation of an interaction encoder
Listing 17.2. Using caching and partial model evaluation to speed up item selection
Appendix B. Mahout math
Listing B.1. Computing Vector operations efficiently