1introduction - 1 CS345A: Data Mining on the Web Course...

1 CS345A: Data Mining on the Web
  Course Introduction
  Issues in Data Mining
  Bonferroni's Principle

2 Course Staff
  Instructors: Anand Rajaraman, Jeff Ullman
  Reach us at cs345a-win0809-staff@lists.stanford.edu.
  More info at www.stanford.edu/class/cs345a.

3 Requirements
  Homework (Gradiance and other): 20%
    Go to www.gradiance.com/pearson
    Enter class code 83769DC9.
    If you took CS145 or CS245 in the past year, you should have free access; otherwise you will have to purchase access from Pearson Ed.
  Project: 40%
  Final Exam: 40%

4 Project
  Software implementation related to course subject matter.
  Should involve an original component or experiment.
  More later about available data and computing resources.

5 Possible Projects
  Many past projects have dealt with collaborative filtering (advice based on what similar people do).
    E.g., Netflix Challenge.
  Others have dealt with engineering solutions to machine-learning problems.

6 ML-Replacement Projects
  ML generally requires a large training set of correctly classified data.
    Example: classifying Web pages by topic.
    Hard to find well-classified data.
    Exception: Open Directory works for page topics, because work is collaborative and shared by many.
    Other good exceptions?

7 ML-Replacement (2)
  Many problems require thought rather than ML:
    1. Tell important pages from unimportant (PageRank).
    2. Tell real news from publicity (how?)....

This note was uploaded on 09/17/2009 for the course IT it771 taught by Professor Jenisha during the Fall '09 term at University of Advancing Technology.
