Logistic Regression
Data Mining
Prof. Dawn Woodard
School of ORIE, Cornell University

Outline
1. Announcements
2. Logistic Regression

Announcements
Questions?
Remember: the second (other) prelim exam is 12/02 (Wednesday, the second-to-last day of class)

Veterans Data
Several columns of the veteran’s data:

Last time we fit a logistic regression model for TARGET.B, using training data.
To predict for a test data set, we calculated $\Pr(Y = 1 \mid X_1, \ldots, X_P)$.
If this probability was > 0.5, we predicted Y = 1, and otherwise Y = 0.
We obtained a matrix of predicted vs. actual outcomes for a test data set:

|                | Predict Y = 0 | Predict Y = 1 |
|----------------|---------------|---------------|
| Actually Y = 0 | 430           | 315           |
| Actually Y = 1 | 327           | 416           |

Just like for naive Bayes, we can alter the classification threshold.

With threshold = 0.4:

|                | Predict Y = 0 | Predict Y = 1 |
|----------------|---------------|---------------|
| Actually Y = 0 | 176           | 569           |
| Actually Y = 1 | 118           | 625           |

With threshold = 0.6:

|                | Predict Y = 0 | Predict Y = 1 |
|----------------|---------------|---------------|
| Actually Y = 0 | 629           | 116           |
| Actually Y = 1 | 558           | 185           |

ROC curve:
[Figure: ROC curve. x-axis: False Positive Rate (0.0 to 1.0); y-axis: True Positive Rate (0.0 to 1.0)]

Is our classifier better than the coin-flipper?...
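The three predicted-vs-actual matrices above each give one point on the ROC curve. As a rough sketch, the Python below recomputes the false positive rate, true positive rate, and accuracy from the counts on the slides. It assumes the row and column whose label was lost in extraction correspond to Y = 0; the names `counts`, `rates`, and `classify` are illustrative and not from the course materials. A classifier that guesses at random has, in expectation, a true positive rate equal to its false positive rate, so comparing the two at each threshold is one simple way to check against the coin-flipper.

```python
# Confusion-matrix counts transcribed from the slides, keyed by threshold.
# Each tuple is (TN, FP, FN, TP).
# Assumption: the unlabeled row/column in the extracted tables is Y = 0.
counts = {
    0.4: (176, 569, 118, 625),
    0.5: (430, 315, 327, 416),
    0.6: (629, 116, 558, 185),
}


def classify(prob, threshold=0.5):
    """Predict Y = 1 when the fitted probability exceeds the threshold."""
    return 1 if prob > threshold else 0


def rates(tn, fp, fn, tp):
    """Return (false positive rate, true positive rate, accuracy)."""
    fpr = fp / (fp + tn)   # share of actual Y = 0 cases predicted as 1
    tpr = tp / (tp + fn)   # share of actual Y = 1 cases predicted as 1
    acc = (tp + tn) / (tn + fp + fn + tp)
    return fpr, tpr, acc


# One (FPR, TPR, accuracy) triple per threshold: three points on the ROC curve.
roc_points = {t: rates(*c) for t, c in counts.items()}

for t, (fpr, tpr, acc) in sorted(roc_points.items()):
    side = "above" if tpr > fpr else "on or below"
    print(f"threshold={t:.1f}  FPR={fpr:.3f}  TPR={tpr:.3f}  "
          f"accuracy={acc:.3f}  ({side} the diagonal)")
```

Lowering the threshold moves the point toward the upper right of the ROC plot (more positives predicted, so both rates rise), and raising it moves the point toward the lower left. This matches the direction of the counts in the 0.4 and 0.6 tables.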
## This note was uploaded on 12/23/2009 for the course ORIE 4740 at Cornell.

