# lecture4 - Data Mining CS57300 Purdue University September...

Data Mining CS57300 Purdue University September 7, 2010

Populations and samples (cont)

Types of probability sampling

- Simple random sampling
  - There is an equal probability of selecting any particular item
- Sampling without replacement
  - As each item is selected, it is removed from the population
- Sampling with replacement
  - Items are not removed from the population as they are selected for the sample; the same item can be picked more than once
- Stratified sampling
  - Split the data into several partitions; then draw random samples from each partition

Tan, Steinbach, Kumar. Introduction to Data Mining, 2004.
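
The sampling schemes listed on this slide can be sketched with Python's standard `random` module. This is a minimal illustration, not code from the lecture: the function names, the 100-item population, and the choice of strata (the tens digit) are all assumptions made for the example.

```python
import random
from collections import defaultdict


def sample_without_replacement(items, n, rng):
    """Simple random sample; each selected item is removed from the pool."""
    return rng.sample(items, n)


def sample_with_replacement(items, n, rng):
    """Simple random sample; the same item can be selected more than once."""
    return rng.choices(items, k=n)


def stratified_sample(items, stratum_of, n_per_stratum, rng):
    """Partition items by stratum_of(item), then sample within each partition."""
    strata = defaultdict(list)
    for item in items:
        strata[stratum_of(item)].append(item)
    sample = []
    for members in strata.values():
        # Take at most n_per_stratum items, without replacement, per partition.
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample


rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = list(range(100))  # hypothetical population

without = sample_without_replacement(population, 10, rng)
with_repl = sample_with_replacement(population, 10, rng)
# Hypothetical strata: ten partitions keyed by the tens digit.
stratified = stratified_sample(population, lambda x: x // 10, 2, rng)
```

Drawing a fixed number of items per partition is only one possible allocation; drawing in proportion to partition size is another common choice.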

Sample size

[Figure: three panels titled 500 Points, 2000 Points, and 8000 Points.] Tan, Steinbach, Kumar. Introduction to Data Mining, 2004.

Sample size

- What sample size is necessary to get at least one object from each of 10 groups?

Tan, Steinbach, Kumar. Introduction to Data Mining, 2004.
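
One way to explore the question on this slide is to compute the coverage probability directly. The sketch below assumes the 10 groups are equally likely on every draw (equal-size groups, sampling with replacement), which the slide does not state, and uses inclusion-exclusion over the groups that could be missed. The function names and the 95% target are illustrative choices.

```python
from math import comb


def prob_all_groups(n, k=10):
    """P(a sample of n objects contains at least one object from each of k groups).

    Assumes equally likely groups and sampling with replacement.
    """
    # Inclusion-exclusion over the number j of groups that are missed entirely.
    return sum((-1) ** j * comb(k, j) * (1 - j / k) ** n for j in range(k + 1))


def min_sample_size(target, k=10):
    """Smallest n whose coverage probability reaches the target."""
    n = k  # fewer than k objects cannot cover k groups
    while prob_all_groups(n, k) < target:
        n += 1
    return n


for n in (10, 20, 40, 60):
    print(n, round(prob_all_groups(n), 4))
print("n for 95% coverage:", min_sample_size(0.95))
```

If the groups differ in size, the smallest group dominates the required sample size, so the equal-size case gives an optimistic lower bound.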

Statistical inference

Populations and samples

- In data mining we often work with a sample of data from the population of interest
- Estimation techniques allow inferences about population properties from sample data
- If we had the population we could calculate the properties of interest

[Figure: sampling leads from the population (parameter: β = 0.546) to the sample (statistic: b = 0.692); inference leads from the sample back to the population.]
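
The distinction between a population parameter and a sample statistic can be made concrete with synthetic data. In this sketch the population, its size, and the sample size of 50 are assumptions for illustration; it does not reproduce the values shown on the slide.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 synthetic values (not the data behind the slide).
population = [rng.gauss(0.5, 1.0) for _ in range(10_000)]
# The parameter is computable here only because we hold the whole population.
parameter = statistics.mean(population)

# Each sample of 50 yields a different statistic that estimates the same parameter.
sample_means = [statistics.mean(rng.sample(population, 50)) for _ in range(5)]

print("population mean (parameter):", round(parameter, 3))
print("sample means (statistics):  ", [round(m, 3) for m in sample_means])
```

The spread among the sample means is the sampling variability that inference procedures have to account for.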


Statistical inference

- Infer properties of an unknown distribution with sample data generated from that distribution
- Parameter estimation
  - Infer the value of a population parameter based on a sample statistic (e.g., estimate the mean)
- Hypothesis testing
  - Infer the answer to a question about a population parameter based on a sample statistic (e.g., is the mean non-zero?)

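
Both tasks on this slide can be sketched for the mean. The example below estimates the mean with its standard error, then tests whether the mean is non-zero using a normal approximation (a z-test), which is a simplification that is reasonable only for fairly large samples. The function names and the synthetic sample are assumptions for illustration, not the procedure used in the lecture.

```python
import math
import random
import statistics


def estimate_mean(sample):
    """Point estimate of the population mean and its standard error."""
    n = len(sample)
    mean = statistics.mean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(n)
    return mean, std_err


def z_test_mean_nonzero(sample):
    """Two-sided p-value for H0: population mean = 0 (normal approximation)."""
    mean, std_err = estimate_mean(sample)
    z = mean / std_err
    p_value = 2 * (1 - statistics.NormalDist().cdf(abs(z)))
    return z, p_value


rng = random.Random(1)  # fixed seed so the sketch is repeatable
# Hypothetical sample: 200 draws from a distribution whose true mean is 0.3.
sample = [rng.gauss(0.3, 1.0) for _ in range(200)]

mean, std_err = estimate_mean(sample)
z, p_value = z_test_mean_nonzero(sample)
print(f"estimate = {mean:.3f} +/- {std_err:.3f}, z = {z:.2f}, p = {p_value:.4f}")
```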
Example inference procedure


## This note was uploaded on 03/13/2012 for the course CS 573 taught by Professor Staff during the Fall '08 term at Purdue.
