ESL Chapter 4: Linear Methods for Classification



...ake them all; up to five times that number of controls is sufficient. See next slide.

ESL Chapter 4: Linear Methods for Classification
Trevor Hastie and Rob Tibshirani

Prospective vs retrospective sampling

              Exposure=0   Exposure=1
control (0)   p00          p01
case (1)      p10          p11

• β in logistic regression estimates the prospective odds ratio

  $\mathrm{OR}_p = \dfrac{p_{11}/p_{01}}{p_{10}/p_{00}}$

  But this equals the retrospective odds ratio:

  $\mathrm{OR}_r = \dfrac{p_{11}/p_{10}}{p_{01}/p_{00}}$

  So logistic regression can be applied to either sampling scheme.

[Figure: coefficient variance (y-axis) against control/case ratio (x-axis), with simulation and theoretical curves.]

Sampling more controls than cases reduces the variance of the parameter estimates. But after a ratio of about 5 to 1 the variance reduction flattens out.

Risk estimates and classification

• We can estimate the risk for a new observation $x_0$ via $\hat\eta(x_0) = x_0^T \hat\beta$ and $\widehat{\Pr}(Y = 1 \mid X = x_0) = e^{\hat\eta(x_0)} / (1 + e^{\hat\eta(x_0)})$.
• To obtain a 95% confidence interval for $\Pr(Y = 1 \mid X = x_0)$, we first obtain one for $\eta(x_0)$ (using the estimated covariance of $\hat\beta$). We then apply the sigmoid transformation to the lower and upper values.
• To classify a new observation, we threshold $\widehat{\Pr}(Y = 1 \mid X = x_0)$ at 0.5. Other thresholds change the sensitivity and specificity, and are used to construct ROC curves.

“Receiver Operating Characteristics” or ROC curve

[Figure: ROC curve, true positive rate (y-axis) against false positive rate (x-axis).]

Sensitivity = $P(\hat G = 1 \mid Y = 1)$ = True positive rate.
Specificity = $P(\hat G = 0 \mid Y = 0)$ = 1 - False positive rate.

                          Predicted Class
True Class     − or Null         + or Non-null     Total
− or Null      True Neg. (TN)    False Pos. (FP)   N
+ or N...
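The risk estimate, its confidence interval, and the thresholding step from "Risk estimates and classification" can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names, the coefficient vector `beta_hat`, its covariance `cov_beta`, and the toy labels are all made up here, the interval uses the normal quantile 1.96, and a probability exactly equal to the threshold is assigned to class 1.

```python
import numpy as np

def sigmoid(eta):
    """Logistic transformation e^eta / (1 + e^eta)."""
    return 1.0 / (1.0 + np.exp(-eta))

def risk_with_ci(x0, beta_hat, cov_beta, z=1.96):
    """Estimated risk at x0 with an approximate 95% interval.

    The interval is built on the linear-predictor scale, using
    se(eta_hat) = sqrt(x0^T Cov(beta_hat) x0), and both endpoints
    are then passed through the sigmoid.
    """
    x0 = np.asarray(x0, dtype=float)
    eta = float(x0 @ beta_hat)
    se = float(np.sqrt(x0 @ cov_beta @ x0))
    return sigmoid(eta), sigmoid(eta - z * se), sigmoid(eta + z * se)

def sensitivity_specificity(y_true, probs, threshold=0.5):
    """Sensitivity and specificity of the rule 'predict 1 if prob >= threshold'."""
    y_true = np.asarray(y_true)
    pred = (np.asarray(probs) >= threshold).astype(int)
    sens = float(np.mean(pred[y_true == 1] == 1))  # true positive rate
    spec = float(np.mean(pred[y_true == 0] == 0))  # 1 - false positive rate
    return sens, spec

# Hypothetical fitted model: intercept plus one predictor.
beta_hat = np.array([-1.0, 2.0])
cov_beta = np.diag([0.04, 0.09])
x0 = [1.0, 0.5]  # leading 1 is the intercept term

risk, lower, upper = risk_with_ci(x0, beta_hat, cov_beta)
print(f"risk={risk:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")

# Toy labels and predicted probabilities, for illustration only.
y_true = [0, 0, 1, 1]
probs = [0.1, 0.6, 0.4, 0.9]
for t in (0.3, 0.5):
    print(t, sensitivity_specificity(y_true, probs, threshold=t))
```

Sweeping `threshold` over a grid and plotting sensitivity against 1 - specificity traces out the ROC curve described in the slides.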

This document was uploaded on 03/10/2014 for the course STATS 315A at Stanford.
