09-recsys

Bob Bell, Chris Volinsky, AT&T, BigChaos, Michael Jahrer


2/2/2011 Jure Leskovec, Stanford CS246: Mining Massive Datasets

Training Data: 100 million ratings (labels known publicly)
Held-Out Data: 3 million ratings (labels only known to Netflix)
    Quiz Set: 1.5m ratings, scores posted on leaderboard
    Test Set: 1.5m ratings, scores known only to Netflix
        Scores used in determining final winner

Submissions limited to 1 a day
    So only 1 final submission could be made by either team in the last 24 hours

24 hours before deadline…
    BellKor team member in Austria notices (by chance) that Ensemble posts a score that is slightly better than BellKor's
    Leaderboard score disappears after a few minutes (rule loophole)

Frantic last 24 hours for both teams
    Much computer time on final optimization
    Run times carefully calibrated to end about an hour before deadline

Final submissions
    BellKor submits a little early (on purpose), 40 mins before deadline
    Ensemble submits their final entry 20 mins later
    ….and everyone waits….

Million Dollars Awarded Sept 21st 2009

Most slides and plots borrowed from Yehuda Koren, Robert Bell and Padhraic Smyth

This document was uploaded on 02/26/2014 for the course CS 246 at Stanford.
