lecture11 - Data Mining CS57300 Purdue University October...

Data Mining CS57300 Purdue University October 5, 2010

Predictive modeling: evaluation
Empirical evaluation
• Given observed accuracy of a model on limited data, how well does this estimate generalize for additional examples?
• Given that one model outperforms another on some sample of data, how likely is it that this model is more accurate in general?
• When data are limited, what is the best way to use the data to both learn and evaluate a model?

How to evaluate a classifier?
• Use a score function to assess the quality of predictions for a set of instances
• The score function measures the difference between the model's prediction for an instance i and the true class label of i
• Common functions:
    • Zero-one loss
    • Squared loss
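The two score functions named on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the function names `zero_one_loss` and `squared_loss` are hypothetical, and averaging over the instances (rather than summing) is an assumption the slide does not state.

```python
def zero_one_loss(y_true, y_pred):
    """Fraction of instances whose predicted label differs from the true label.

    Assumption: the loss is averaged over instances, so the result is an error rate.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Each mismatch costs 1, each match costs 0.
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def squared_loss(y_true, y_pred):
    """Mean squared difference between true values and numeric predictions.

    Assumption: predictions are numbers (e.g. predicted class probabilities).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```

For example, `zero_one_loss([1, 0, 1, 1], [1, 1, 1, 0])` counts two mistakes out of four instances, and `squared_loss([1, 0], [0.5, 0.5])` averages two squared differences of 0.25 each.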
Evaluating classifiers
• Goal: Estimate true future error rate
• When data are limited, what is the best way to use the data to both learn and evaluate a model?
• Approach 1
    • Reclassify training data to estimate error rate

Approach 1
[Diagram: Data Set (Y, X1, X2) → Model F(X) → Data Set (Y, X1, X2) → Score: 83%]
Typically produces a biased estimate of future error rate -- why?
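One way to see the bias is a small experiment in which the class labels are pure noise, so no model can be expected to do better than about 50% error on new data. The sketch below is an illustration under stated assumptions, not the lecture's own example: the 1-nearest-neighbor classifier, the helper names `nearest_neighbor_predict` and `error_rate`, and the synthetic data are all hypothetical choices made for the demonstration.

```python
import random


def nearest_neighbor_predict(train, x):
    """Return the label of the training point whose feature is closest to x (1-NN)."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def error_rate(train, data):
    """Zero-one error of the 1-NN model built from `train`, measured on `data`."""
    wrong = sum(nearest_neighbor_predict(train, x) != y for x, y in data)
    return wrong / len(data)


random.seed(0)
# Synthetic data: one numeric feature, labels are coin flips independent of it.
data = [(random.random(), random.randint(0, 1)) for _ in range(400)]
train, held_out = data[:200], data[200:]

# Approach 1: reclassify the same instances the model was built from.
resub_error = error_rate(train, train)
# Comparison: instances the model has never seen.
held_out_error = error_rate(train, held_out)

print("error on training data:", resub_error)
print("error on held-out data:", held_out_error)
```

Each training instance is its own nearest neighbor, so the error on the training data is 0 even though the labels carry no signal, while the held-out error should land near chance. The model has already adapted to the instances it is being scored on, which is why reclassifying training data tends to understate the future error rate.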

This note was uploaded on 03/13/2012 for the course CS 573 taught by Professor Staff during the Fall '08 term at Purdue University-West Lafayette.
