introduction-2

introduction-2 - 1 CS345 --- Data Mining Introductions What...


1. CS345 --- Data Mining
   Introductions
   What Is It?
   Cultures of Data Mining

2. Course Staff
   - Instructors:
     - Anand Rajaraman
     - Jeff Ullman
   - TA:
     - Jeff Klingner

3. Requirements
   - Homework (Gradiance and other): 20%
     - Gradiance class code: DD984360
   - Project: 40%
   - Final Exam: 40%

4. Project
   - Software implementation related to course subject matter.
   - Should involve an original component or experiment.
   - More later about available data and computing resources.

5. Team Projects
   - Working in pairs OK, but
     1. We will expect more from a pair than from an individual.
     2. The effort should be roughly evenly distributed.

6. What is Data Mining?
   - Discovery of useful, possibly unexpected, patterns in data.
   - Subsidiary issues:
     - Data cleansing: detection of bogus data. E.g., age = 150. Entity resolution.
     - Visualization: something better than megabyte files of output.
     - Warehousing of data (for retrieval).

7. Typical Kinds of Patterns
   1. Decision trees: succinct ways to classify by testing properties.
   2. Clusters: another succinct classification by similarity of properties.
   3. Bayes models, hidden-Markov models, frequent-itemsets: expose important associations within data.

8. Example: Clusters
   [Figure: points plotted as x marks]

9. Example: Frequent Itemsets
   - A common marketing problem: examine what people buy together to...
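
The frequent-itemsets idea of examining what people buy together can be illustrated with a small counting sketch. This is only a minimal, brute-force illustration, not the method taught in the course: the function name `frequent_pairs`, the `min_support` threshold, and the toy baskets are all hypothetical, invented here for the example.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Return item pairs that occur together in at least min_support baskets.

    Hypothetical helper for illustration: it counts every pair in every
    basket, which is fine for toy data but not for large datasets.
    """
    counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so each pair has one canonical ordering.
        items = sorted(set(basket))
        counts.update(combinations(items, 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Toy market baskets, invented for this example (not real data).
baskets = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "beer"],
]

result = frequent_pairs(baskets, min_support=3)
print(result)  # {('beer', 'diapers'): 3}
```

With a support threshold of 3, only the pair that appears in three of the four baskets survives; lowering the threshold admits more pairs, which is the basic trade-off when looking for associations in purchase data.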