06-projects - CS345a: Data Mining, Jure Leskovec and Anand...

CS345a: Data Mining
Jure Leskovec and Anand Rajaraman
Stanford University

Friday, 5:30-7:30pm, at Gates B12
You will learn and get hands-on experience with how to:
  Log in to Amazon EC2 and request a cluster
  Run Hadoop MapReduce jobs
  Use Aster nCluster software
Amazon gave us $12k of computing time
  Each student has about $200 worth of computing time
1/21/2010 Jure Leskovec & Anand Rajaraman, Stanford CS345a: Data Mining

Ideally teams of 2 students (1 or 3 is also OK)
Project:
  Discovers interesting relationships within a significant amount of data
  Has some original idea that extends/builds on what we learned in class
    Extend/improve/speed up some existing algorithm
    Define a new problem and solve it

Answer the following questions:
  What is the problem you are solving?
  What data will you use (where will you get it)?
  How will you do it?
    Which algorithms/techniques do you plan to use?
    Be as specific as you can!
  How will you evaluate and measure success?
  What do you expect to submit at the end of the quarter?

Due at midnight on Feb 1, 2010
Email the PDF to cs345a win0910 [email protected]
TAs will assign group numbers
Name your file: <group#>_proposal.pdf

Wikipedia
IM buddy graph
Yahoo Altavista web graph
Stanford WebBase
Twitter Data
Blogs and news data
Netflix
Restaurant reviews
Yahoo Music Ratings

Complete edit history of Wikipedia until January 2008
For every single edit the complete snapshot of the article is saved
Each page has a talk page
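Because each edit is stored as a full snapshot of the article, what an individual edit changed has to be recovered by comparing consecutive snapshots. The following is a minimal sketch of that idea, not the course's method: it assumes snapshots are available as plain strings, and the function name edit_diff and the sample text are hypothetical. It uses difflib.ndiff from the Python standard library.

```python
import difflib


def edit_diff(previous: str, current: str) -> list[str]:
    """Return the lines removed ("- ") or added ("+ ") between two snapshots."""
    diff = difflib.ndiff(previous.splitlines(), current.splitlines())
    # ndiff also emits unchanged lines ("  ") and hint lines ("? "); drop those.
    return [line for line in diff if line.startswith(("+ ", "- "))]


# Hypothetical snapshots of one article before and after a single edit.
snapshots = [
    "Data mining is fun.",
    "Data mining is fun.\nIt finds patterns in data.",
]
changes = edit_diff(snapshots[0], snapshots[1])
print(changes)  # ['+ It finds patterns in data.']
```

On the full edit history this comparison would run once per pair of consecutive snapshots of each article, which is the kind of independent per-record work that fits a MapReduce job.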

Talk page:
  Editors discuss things like:

Every registered user has a page:

Every user's page has a talk page:

This note was uploaded on 06/07/2011 for the course CS 512 taught by Professor Cube during the Spring '11 term at Central Texas College.