lecture4

Data Mining CS57300 Purdue University September 7, 2010
Populations and samples (cont)
Types of probability sampling
• Simple random sampling
  • There is an equal probability of selecting any particular item
• Sampling without replacement
  • As each item is selected, it is removed from the population
• Sampling with replacement
  • Items are not removed from the population as they are selected for the sample; the same item can be picked more than once
• Stratified sampling
  • Split the data into several partitions; then draw random samples from each partition
Tan, Steinbach, Kumar. Introduction to Data Mining, 2004.
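The sampling schemes listed above can be sketched in a few lines of Python. This is a minimal illustration rather than the lecture's own code: the function names, the `key` function used to define strata, and the fixed per-stratum sample size are all assumptions made for the example.

```python
import random
from collections import defaultdict


def sample_without_replacement(items, n, rng):
    """Simple random sample; each item can appear at most once."""
    return rng.sample(items, n)


def sample_with_replacement(items, n, rng):
    """Simple random sample; the same item may be drawn repeatedly."""
    return rng.choices(items, k=n)


def stratified_sample(items, key, n_per_stratum, rng):
    """Partition items by key(item), then sample within each partition.

    Assumption: the same number of items is drawn from every stratum
    (capped at the stratum size); proportional allocation is another option.
    """
    strata = defaultdict(list)
    for item in items:
        strata[key(item)].append(item)
    sample = []
    for label in sorted(strata):
        members = strata[label]
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample


rng = random.Random(0)          # seeded so the run is repeatable
population = list(range(100))   # hypothetical population of 100 items
print(sample_without_replacement(population, 5, rng))
print(sample_with_replacement(population, 5, rng))
print(stratified_sample(population, lambda x: x % 4, 2, rng))
```

Here the strata are defined by `x % 4` purely for illustration; in practice the partitioning variable would be a class label or another attribute of interest.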
Sample size
[Figure: three panels labeled 8000 Points, 2000 Points, and 500 Points]
Tan, Steinbach, Kumar. Introduction to Data Mining, 2004.
Sample size
• What sample size is necessary to get at least one object from each of 10 groups?
Tan, Steinbach, Kumar. Introduction to Data Mining, 2004.
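One way to explore this question is to compute the probability that a sample of size n covers all groups. The sketch below assumes the 10 groups are equally likely and that objects are drawn independently (sampling with replacement, or from a very large population); neither assumption is stated on the slide. Under those assumptions, inclusion-exclusion gives the coverage probability directly. The function names are hypothetical.

```python
from math import comb


def prob_all_groups_covered(n, k=10):
    """Probability that n independent draws from k equally likely groups
    include at least one object from every group (inclusion-exclusion)."""
    return sum(
        (-1) ** j * comb(k, j) * ((k - j) / k) ** n
        for j in range(k + 1)
    )


def min_sample_size(target, k=10):
    """Smallest n whose coverage probability reaches the target."""
    n = k  # fewer than k draws can never cover k groups
    while prob_all_groups_covered(n, k) < target:
        n += 1
    return n


for n in (10, 20, 40, 60):
    print(n, round(prob_all_groups_covered(n), 4))
print("n for 90% coverage:", min_sample_size(0.90))
```

If the groups differ in size, the smallest group dominates the required sample size, and the equal-probability formula above will be optimistic.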
Statistical inference
Populations and samples
• In data mining we often work with a sample of data from the population of interest
• Estimation techniques allow inferences about population properties from sample data
• If we had the population we could calculate the properties of interest
[Diagram: Population (Parameter: Beta = 0.546) and Sample (Statistic: b = 0.692), linked by Sampling and Inference]
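The distinction between a population parameter and a sample statistic can be illustrated with synthetic data. The sketch below is an assumption-laden stand-in for the slide's diagram: it uses the mean as the property of interest (the slide's Beta is not defined in this excerpt), a made-up normal population, and hypothetical function names.

```python
import random
import statistics


def make_population(size, rng):
    """Hypothetical population: values drawn from a normal distribution."""
    return [rng.gauss(0.5, 1.0) for _ in range(size)]


def parameter_and_statistic(population, sample_size, rng):
    """Return (population mean, mean of one random sample)."""
    parameter = statistics.fmean(population)     # computable only with the full population
    sample = rng.sample(population, sample_size)  # what we usually have in practice
    statistic = statistics.fmean(sample)          # our estimate of the parameter
    return parameter, statistic


rng = random.Random(42)
population = make_population(100_000, rng)
parameter, statistic = parameter_and_statistic(population, 50, rng)
print(f"parameter (population mean): {parameter:.3f}")
print(f"statistic (sample mean):     {statistic:.3f}")
```

Repeating the sampling step with different seeds gives a different statistic each time, while the parameter stays fixed; that variability is what inference has to account for.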
Statistical inference
• Infer properties of an unknown distribution with sample data generated from that distribution
• Parameter estimation
  • Infer the value of a population parameter based on a sample statistic (e.g., estimate the mean)
• Hypothesis testing
  • Infer the answer to a question about a population parameter based on a sample statistic (e.g., is the mean non-zero?)
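Both tasks can be sketched for the mean, the example the slide mentions. The code below is an illustration under assumptions, not the procedure from the lecture: it estimates the mean with its standard error, then tests "is the mean non-zero?" with a two-sided z-test, which relies on a large-sample normal approximation (a t-test would be the usual choice for small samples). Function names are hypothetical.

```python
import math
import statistics
from statistics import NormalDist


def estimate_mean(sample):
    """Parameter estimation: sample mean and its standard error."""
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean, std_err


def z_test_mean_nonzero(sample):
    """Hypothesis testing: two-sided z-test of H0 'population mean = 0'.

    Assumes the sample is large enough for the normal approximation.
    Returns (z statistic, p-value).
    """
    mean, std_err = estimate_mean(sample)
    z = mean / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical sample: 100 values scattered around 1.0
sample = [1.0 + 0.1 * ((i % 5) - 2) for i in range(100)]
mean, std_err = estimate_mean(sample)
z, p_value = z_test_mean_nonzero(sample)
print(f"estimated mean = {mean:.3f} (standard error {std_err:.3f})")
print(f"z = {z:.1f}, p-value = {p_value:.3g}")
```

A small p-value suggests the data are unlikely under a zero mean; it does not by itself say how large the mean is, which is why the estimate and its standard error are reported alongside the test.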
Example inference procedure

This note was uploaded on 03/13/2012 for the course CS 573 taught by Professor Staff during the Fall '08 term at Purdue.
