09-recsys



2/2/2011 Jure Leskovec, Stanford CS246: Mining Massive Datasets [Bellkor Team]

[Figure, repeated across several slides: a ratings matrix (items by users) with one unknown entry marked "?" is approximated (~) by the product of an items factor matrix and a users factor matrix. Caption: A rank-3 SVD approximation. On the last repeat the unknown entry is filled in with the estimate 2.4.]

Properties:
SVD isn't defined when entries are unknown, so use specialized methods
Very powerful model, so it can easily overfit
Probably the most popular model among contestants

Want to minimize SSE for Test data
One idea: Minimize SSE for Training data
Want large d to capture all the signals
But Test RMSE begins to rise for d > 2
Regularization is needed
Allow a rich model where there are sufficient data
Shrink aggressively where data are scarce

min Σ_training (r_ui − q...
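The idea of fitting factors only on the known ratings, with a penalty that shrinks them, can be sketched as follows. This is a minimal illustration and not the slides' own method: the objective is cut off in the preview, so the code assumes the common L2-penalized squared error, (r_ui − q_i · p_u)² + lam · (|q_i|² + |p_u|²), minimized by stochastic gradient descent. The names `train_latent_factors`, `lam`, `lr`, and the toy ratings are all hypothetical.

```python
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def train_latent_factors(ratings, n_users, n_items, d=3, lam=0.05,
                         lr=0.02, epochs=500, seed=0):
    """Fit user factors P and item factors Q on known ratings only.

    ratings: list of (user, item, rating) triples; unknown entries are
    simply absent, which is why a plain SVD cannot be used here.
    Assumed objective (not taken from the slides): squared error plus an
    L2 penalty with weight lam on both factor vectors.
    """
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(d)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(d)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - dot(Q[i], P[u])
            for k in range(d):
                p, q = P[u][k], Q[i][k]  # keep old values for both updates
                P[u][k] += lr * (err * q - lam * p)
                Q[i][k] += lr * (err * p - lam * q)
    return P, Q


def sse(ratings, P, Q):
    """Sum of squared errors over the given (user, item, rating) triples."""
    return sum((r - dot(Q[i], P[u])) ** 2 for u, i, r in ratings)


# Hypothetical toy data: 4 users, 3 items, 9 known ratings out of 12.
train_ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2),
                 (2, 2, 5), (3, 0, 1), (3, 1, 4), (3, 2, 4)]

P0, Q0 = train_latent_factors(train_ratings, 4, 3, epochs=0)  # untrained
P, Q = train_latent_factors(train_ratings, 4, 3)

print("training SSE before:", sse(train_ratings, P0, Q0))
print("training SSE after: ", sse(train_ratings, P, Q))
# An unknown entry is estimated as the dot product of its two factor vectors.
print("estimate for user 0, item 2:", dot(Q[2], P[0]))
```

Raising `lam` pulls every factor vector toward zero, which is one way to shrink aggressively where data are scarce. In practice `d` and `lam` would be chosen by watching the error on held-out ratings rather than the training SSE.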

This document was uploaded on 02/26/2014 for the course CS 246 at Stanford.
