lecture12

Data Mining CS57300 Purdue University October 7, 2010

Comparing algorithms
Score functions
• Zero-one loss
• Accuracy
• Sensitivity/specificity
• Precision/Recall/F1
• Absolute loss
• Squared loss
• Root mean-squared error
• Likelihood/conditional likelihood
• Area under the ROC curve

Confusion matrix:

               Actual +   Actual -
Predicted +    TP         FP
Predicted -    FN         TN
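Several of the score functions listed above can be sketched in a few lines of Python. This is a minimal illustration rather than the course's own code: the function names and the toy labels are hypothetical, zero-one loss is reported as an average over examples, and every denominator is assumed to be nonzero.

```python
import math

def confusion_counts(actual, predicted, positive="+"):
    """Count TP, FP, TN, FN for a binary problem."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def classification_scores(tp, fp, tn, fn):
    """Scores derived from the confusion matrix (assumes nonzero denominators)."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # recall is the same quantity as sensitivity
    return {
        "accuracy": accuracy,
        "zero_one_loss": 1 - accuracy,  # average zero-one loss
        "sensitivity": recall,
        "specificity": tn / (tn + fp),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }

def absolute_loss(y_true, y_pred):
    """Mean absolute error for numeric predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def squared_loss(y_true, y_pred):
    """Mean squared error for numeric predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean-squared error."""
    return math.sqrt(squared_loss(y_true, y_pred))

# Hypothetical toy data, not taken from the lecture.
actual = ["+", "+", "+", "-", "-", "-"]
predicted = ["+", "+", "-", "+", "-", "-"]
counts = confusion_counts(actual, predicted)
scores = classification_scores(*counts)
```

Sensitivity and recall are the same quantity under two names, which is why the sketch computes it once and reports it under both keys.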

ROC curves
• Receiver Operating Characteristic curve
• Plots the true positive rate against the false positive rate for different classification thresholds
[Figure: ROC plot; x-axis FP Rate, y-axis TP Rate, both from 0.0 to 1.0; legend entries: Base, RPT]

How to compute ROC curve

Predicted probabilities and true classes:

P(Y)   True class
0.51   +
0.07   -
0.84   +
0.94   +
0.67   +
0.58   -
0.10   -
0.42   +
0.16   -

Sort by P(Y) in decreasing order, then lower the threshold one example at a time. The first three thresholds predict + for the top 1, top 2, and top 3 examples:

P(Y)   True class   Predict (top 1)   Predict (top 2)   Predict (top 3)
0.94   +            +                 +                 +
0.84   +            -                 +                 +
0.67   +            -                 -                 +
0.58   -            -                 -                 -
0.51   +            -                 -                 -
0.42   +            -                 -                 -
0.16   -            -                 -                 -
0.10   -            -                 -                 -
0.07   -            -                 -                 -

TPR                 1/5               2/5               3/5
FPR                 0/4               0/4               0/4
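The threshold sweep on this slide can be sketched in Python as follows. The function name `roc_points` is hypothetical, and the sketch assumes there are no tied scores, which holds for the nine examples on the slide.

```python
def roc_points(scores, labels, positive="+"):
    """Return the (FPR, TPR) points traced by lowering the threshold
    one example at a time down the ranked list (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(1 for _, y in ranked if y == positive)
    n_neg = len(ranked) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted +
    for _, y in ranked:
        if y == positive:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points

# The nine examples from the slide, in their original (unsorted) order.
scores = [0.51, 0.07, 0.84, 0.94, 0.67, 0.58, 0.10, 0.42, 0.16]
labels = ["+", "-", "+", "+", "+", "-", "-", "+", "-"]
points = roc_points(scores, labels)

for fpr, tpr in points:
    print(f"FPR = {fpr:.2f}  TPR = {tpr:.2f}")
```

The first three points after the origin have TPR 1/5, 2/5, and 3/5 with FPR 0/4. The fourth example (0.58, true class -) is the first false positive, so FPR rises to 1/4 while TPR stays at 3/5.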

ROC curves
• Evaluates performance over varying costs and class distributions
• Can summarize with area under the curve (AUC)
• AUC of 0.5 is random
• AUC of 1.0 is perfect
[Figure: ROC plot; x-axis FP Rate, y-axis TP Rate, both from 0.0 to 1.0; legend entries: Base, RPT]

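As a rough illustration of the AUC summary, the sketch below uses the pairwise-ranking view of AUC: the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as one half. The slides do not say how AUC is computed, so this choice and the function name `auc_pairwise` are assumptions. The data are the nine examples from the "How to compute ROC curve" slide.

```python
def auc_pairwise(scores, labels, positive="+"):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.
    Ties count as half a correct pair. Quadratic time, fine for small data."""
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

scores = [0.51, 0.07, 0.84, 0.94, 0.67, 0.58, 0.10, 0.42, 0.16]
labels = ["+", "-", "+", "+", "+", "-", "-", "+", "-"]
auc = auc_pairwise(scores, labels)
print(f"AUC = {auc:.2f}")  # 18 of the 20 pairs are ranked correctly
```

A ranking that places every positive above every negative gives 1.0. A classifier that assigns the same score to every example gives 0.5 under the tie rule, matching the "random" baseline on the slide.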
Statistical questions in machine learning (Dietterich ’98)
• Single domain / Multiple domains
• Analyze classifiers / Analyze algorithms
• Predict algorithm accuracy / Choose between algorithms