08-recsys



Jure Leskovec, Stanford C246: Mining Massive Datasets (1/30/2011)

- Gathering “known” ratings for the matrix
- Extrapolating unknown ratings from known ratings
  (mainly interested in high unknown ratings)
- Evaluating extrapolation methods

Explicit
- Ask people to rate items
- Doesn’t work well in practice: people can’t be bothered
Implicit
- Learn ratings from user actions, e.g., a purchase implies a high rating
- What about low ratings?

Key problem: matrix U is sparse
- Most people have not rated most items
- Cold start: new items have no ratings
Three approaches
- Content-based
- Collaborative
- Hybrid

Main idea: recommend to customer C items similar to previous items rated highly by C
- Movie recommendations: recommend movies with the same actor(s), director, genre, …
- Websites, blogs, news: recommend other sites with “similar” content

[Figure: diagram with the labels item profiles, likes, build, user profile, match, recommend; example items are red circles and triangles]

For each item, create an item profile
- A profile is a set of features
  - movies: author, title, actor, director, …
  - text: set of “important” words in the document
- How to pick important words? The usual heuristic is TF.IDF (Term Frequency times Inverse Doc Frequency)
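The TF.IDF heuristic for picking “important” words can be sketched as follows. This is a minimal illustration, not the course's reference implementation: the function name `tf_idf_profiles`, the whitespace tokenizer, the max-normalized term frequency, the base-2 logarithm in the IDF term, and the toy documents are all assumptions chosen for the example. Other TF.IDF variants weight the terms differently.

```python
import math
from collections import Counter


def tf_idf_profiles(docs, top_k=3):
    """Return, for each document, its top_k words ranked by TF.IDF.

    Assumed variant (not taken from the slides):
      TF  = count of word in doc / count of the most frequent word in doc
      IDF = log2(number of docs / number of docs containing the word)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))

    profiles = []
    for tokens in tokenized:
        if not tokens:
            profiles.append([])
            continue
        counts = Counter(tokens)
        max_count = max(counts.values())
        scores = {
            word: (count / max_count) * math.log2(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
        # Highest score first; ties broken alphabetically for determinism.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        profiles.append([word for word, _ in ranked[:top_k]])
    return profiles


# Hypothetical toy corpus, used only to exercise the function.
docs = [
    "pirates sail the sea and pirates fight",
    "the matrix is a film about the matrix",
    "the sea is calm",
]
for doc, profile in zip(docs, tf_idf_profiles(docs)):
    print(profile, "<-", doc)
```

A word such as "the" appears in every toy document, so its IDF is log2(1) = 0 and it never ranks as important, while a word repeated within one document and absent from the others ranks first. With a corpus this small, ties among one-off words are common, which is why the sketch breaks ties alphabetically.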

This document was uploaded on 02/26/2014 for the course CS 246 at Stanford.
