09-recsys



2/2/2011 Jure Leskovec, Stanford C246: Mining Massive Datasets

BellKor: 0.8693
Grand Prize: 0.8563
Inherent noise: ????
[Figure residue: labels "k-NN" and "accurate"]

[Bellkor Team]
1. Define a similarity measure between items: s_ij
2. Select neighbors N(i;u): the items most similar to i that were rated by u
3. Estimate the unknown rating r_ui as the weighted average:

$$ r_{ui} = b_{ui} + \frac{\sum_{j \in N(i;u)} s_{ij} \, (r_{uj} - b_{uj})}{\sum_{j \in N(i;u)} s_{ij}} $$

where b_ui is the baseline estimate for r_ui.

[Bellkor Team]
Use a weighted sum rather than a weighted average:

$$ r_{ui} = b_{ui} + \sum_{j \in N(i;u)} w_{ij} \, (r_{uj} - b_{uj}) $$

(Allow $\sum_{j \in N(i;u)} w_{ij} \neq 1$)

• Model relationships between item i and its neighbors
• Can be learnt through a least squares problem from all other users that rated i:

$$ \min_{w} \sum_{v \neq u} \Big( (r_{vi} - b_{vi}) - \sum_{j \in N(i;u)} w_{ij} \, (r_{vj} - b_{vj}) \Big)^2 $$

[Bellkor Team]
For the least squares problem above:
• Interpolation weights derived based on their role; no use of an arbitrary similarity measure
• Explicitly account for interrelationships among the neighbors

Slide annotations: "Mostly unknown"; "Estimate inner-products among movie ratings"

Challenges:
• Deal with missing values
• Avoid overfitting
• Efficient implementation

[Bellkor Team]
[Figure: labels placed along two axes, "serious" to "escapist" and "Geared towards females" to "Geared towards males". Labels: Braveheart, The Color Purple, Amadeus, Sense and Sensibility, Ocean's 11, Lethal Weapon, Dave, The Lion King, The Princess Diaries, Independence Day, Dumb and Dumber, Gus.]

Koren, Bell, Volinsky, IEEE Computer, 2009

[Bellkor Team]
Recap: SVD

$$ A \approx U \Sigma V^{T}, \qquad A \text{ is } m \times n $$

SVD on Netflix data: A = PQ

[Figure: a ratings matrix with entries 1 to 5 shown as approximately equal (~) to a product of factor matrices; the numeric entries are not recoverable from this extraction.]
A rank-3 SVD approximation

[Bellkor Team]
users 1...
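The two neighborhood estimates from the slides (weighted average with similarities s_ij, weighted sum with interpolation weights w_ij) and the least squares problem for the weights can be sketched in a few lines of Python. This is a minimal illustration, not the BellKor implementation: the function names, the tuple layout, and the use of plain gradient descent are assumptions made here, and the sketch assumes every other user v has rated all neighbors, which sidesteps the "missing values" challenge the slides list. It also uses no regularization, so it does nothing about overfitting.

```python
from typing import List, Sequence, Tuple

# Hypothetical layout: (s_ij or w_ij, r_uj, b_uj) for each neighbor j in N(i;u).
Neighbor = Tuple[float, float, float]


def weighted_average_estimate(b_ui: float, neighbors: Sequence[Neighbor]) -> float:
    """r_ui = b_ui + sum_j s_ij (r_uj - b_uj) / sum_j s_ij."""
    numerator = sum(s * (r - b) for s, r, b in neighbors)
    denominator = sum(s for s, _, _ in neighbors)
    if denominator == 0:
        # Assumption: fall back to the baseline when no neighbor carries weight.
        return b_ui
    return b_ui + numerator / denominator


def weighted_sum_estimate(b_ui: float, neighbors: Sequence[Neighbor]) -> float:
    """r_ui = b_ui + sum_j w_ij (r_uj - b_uj); the weights need not sum to 1."""
    return b_ui + sum(w * (r - b) for w, r, b in neighbors)


def fit_interpolation_weights(
    targets: Sequence[float],
    residuals: Sequence[Sequence[float]],
    steps: int = 2000,
    lr: float = 0.05,
) -> List[float]:
    """Minimise sum_v (targets[v] - sum_j w_j * residuals[v][j])^2.

    targets[v]      = r_vi - b_vi for every other user v who rated item i
    residuals[v][j] = r_vj - b_vj for neighbor j (assumed fully observed)
    Solved with plain gradient descent on the mean squared error.
    """
    k = len(residuals[0])
    n = len(targets)
    w = [0.0] * k
    for _ in range(steps):
        grad = [0.0] * k
        for t, x in zip(targets, residuals):
            err = sum(wj * xj for wj, xj in zip(w, x)) - t
            for j in range(k):
                grad[j] += 2.0 * err * x[j]
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


if __name__ == "__main__":
    # Toy numbers, invented purely for illustration.
    print(weighted_average_estimate(3.5, [(0.8, 4.0, 3.0), (0.2, 2.0, 3.0)]))
    learned = fit_interpolation_weights(
        targets=[0.5, 0.25, 0.75, 0.25],
        residuals=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]],
    )
    print(weighted_sum_estimate(3.5, [(learned[0], 4.0, 3.0), (learned[1], 2.0, 3.0)]))
```

Gradient descent is used only to keep the sketch dependency-free; any least squares solver would serve the same role. The step size and iteration count are tuned to the toy data and would need adjusting on real rating residuals.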