08-recsys

D be the set of k users most similar to c who have


… d in D sim(c,d))
Other options? Many tricks possible…

1/30/2011 Jure Leskovec, Stanford C246: Mining Massive Datasets 22

The expensive step is finding the k most similar customers: O(|U|).
Too expensive to do at runtime; could pre-compute.
Naïve precomputation takes time O(N|U|). Stay tuned for how to do it faster!
Can use clustering or partitioning as alternatives, but quality degrades.

So far: user-user collaborative filtering.
Another view: item-item.
For item s, find other similar items.
Estimate the rating for item s based on ratings for similar items.
Can use the same similarity metrics and prediction functions as in the user-user model.
In practice, it has been observed that item-item often works better than user-user.

Example: utility matrix with users Alice, Bob, Carol, David and items Avatar, LOTR, Matrix, Pirates.
Ratings that survive in this preview, cell positions lost: 1, 0.5, 0.2, 0.3, 1, 0.2, 0.4.
What do we recommend for Avatar?
cos(Avatar, Matrix) = 0.38
cos(Avatar, LOTR) = 0.0

Pros and cons of collaborative filtering:
Works for any kind of item: no feature selection needed.
Cold Start: need enough users in the system to find a match.
Sparsity: the user/ratings matrix is sparse. Hard to find users that have rated the same items.
First Rater: cannot recommend an item that has not been previously rated (new items, esoteric items).
Popularity Bias: cannot recommend items to someone with unique tastes. Tends to recommend popular items.
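The item-item view described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the lecture's implementation: the ratings dictionary is hypothetical (it is not the Avatar/LOTR/Matrix table from the slide, whose cell layout is not recoverable here), the helper names item_vector, cosine and predict are made up for this sketch, and the prediction is assumed to be a similarity-weighted average of the user's ratings on the k most similar items.

```python
import math

# Hypothetical ratings (user -> {item: rating}); NOT the matrix from the slide.
ratings = {
    "alice": {"A": 5, "B": 4, "C": 1},
    "bob":   {"A": 4, "B": 5},
    "carol": {"B": 2, "C": 5},
    "dave":  {"A": 1, "C": 4},
}

def item_vector(ratings, item):
    """Return {user: rating} for one item (its column in the utility matrix)."""
    return {user: r[item] for user, r in ratings.items() if item in r}

def cosine(x, y):
    """Cosine similarity of two sparse vectors; missing entries count as 0."""
    dot = sum(x[u] * y[u] for u in x.keys() & y.keys())
    norm_x = math.sqrt(sum(v * v for v in x.values()))
    norm_y = math.sqrt(sum(v * v for v in y.values()))
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0

def predict(ratings, user, item, k=2):
    """Estimate user's rating of item from the k most similar items the user rated.

    Assumed prediction function: similarity-weighted average of the user's ratings.
    Returns None when no rated item has positive similarity to the target item.
    """
    target = item_vector(ratings, item)
    sims = [
        (cosine(target, item_vector(ratings, other)), rating)
        for other, rating in ratings[user].items()
        if other != item
    ]
    top = sorted(sims, reverse=True)[:k]
    total = sum(sim for sim, _ in top)
    if total == 0:
        return None
    return sum(sim * rating for sim, rating in top) / total

estimate = predict(ratings, "bob", "C")
```

Because the similarities here are computed between item columns, they could be precomputed once per item pair rather than per user at request time, which is one way to sidestep the runtime cost mentioned above.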