introduction-3

CS345A: Data Mining on the Web

- Course Introduction
- Issues in Data Mining
- Bonferroni's Principle

Course Staff

- Instructors: Anand Rajaraman, Jeff Ullman
- TA: Babak Pahlavan

Requirements

- Homework (Gradiance and other): 20%
  - Gradiance class code: B0E9AA66
  - Note URL for class: www.gradiance.com/services (not /pearson).
- Project: 40%
- Final Exam: 40%

Project

- Software implementation related to course subject matter.
- Should involve an original component or experiment.
- More later about available data and computing resources.

Team Projects

Working in pairs OK, but:

1. We will expect more from a pair than from an individual.
2. The effort should be roughly evenly distributed.

What is Data Mining?

- Discovery of useful, possibly unexpected, patterns in data.
- Subsidiary issues:
  - Data cleansing: detection of bogus data. E.g., age = 150. Entity resolution.
  - Visualization: something better than megabyte files of output.
  - Warehousing of data (for retrieval).

Cultures

- Databases: concentrate on large-scale (non-main-memory) data.
- AI (machine-learning): concentrate on complex methods, small data.
- Statistics: concentrate on models.

Models vs. Analytic Processing

- To a database person, data-mining is an extreme form of analytic processing -- queries that examine large amounts of data.
- Result is the data that answers the query....
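
The data-cleansing example on the slides (age = 150) can be illustrated with a minimal sketch. The bounds `MIN_AGE` and `MAX_AGE`, the function name `find_bogus_ages`, and the record layout are assumptions made for illustration; the slides give only the single example value.

```python
# Hypothetical plausibility bounds; the slides mention only "age = 150".
MIN_AGE = 0
MAX_AGE = 120


def find_bogus_ages(records, min_age=MIN_AGE, max_age=MAX_AGE):
    """Return the records whose 'age' is missing, non-numeric, or out of range."""
    bogus = []
    for record in records:
        age = record.get("age")
        # bool is a subclass of int in Python, so reject it explicitly.
        is_number = isinstance(age, (int, float)) and not isinstance(age, bool)
        # A NaN age fails the range comparison and is therefore flagged too.
        if not is_number or not (min_age <= age <= max_age):
            bogus.append(record)
    return bogus


# Illustrative records (made up for this sketch).
records = [
    {"name": "a", "age": 34},
    {"name": "b", "age": 150},
    {"name": "c", "age": -2},
    {"name": "d", "age": None},
]
flagged = find_bogus_ages(records)
```

A fixed range check like this only catches values that are impossible on their face. Entity resolution, listed alongside data cleansing on the slides, is a separate problem that this sketch does not address.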