Logistic Regression, Prediction and ROC

# Logistic Regression, Prediction and ROC


2/17/2014 Logistic Regression, Prediction and ROC https://blackboard.uc.edu/bbcswebdav/pid-9566224-dt-content-rid-55868231_2/courses/14SS_BANA7046002/notes%284%29.html

The objective of this case is to help you understand logistic regression (binary classification) and some important ideas such as cross validation, the ROC curve, and the cut-off probability. Code in this case is built upon lecture slides and Shaonan Tian's sample code.

## Input and sample data

First load the credit scoring data. It is easy to load comma-separated values (CSV).

    credit.data <- read.csv("http://homepages.uc.edu/~maifg/7040/credit0.csv", header = TRUE)

Now split the data 90/10 into training and testing datasets. The split is random, so the selected rows (and the numbers reported later) change from run to run unless you call `set.seed()` first.

    # Draw 90% of the row indices at random for training; the remaining rows form the test set
    subset <- sample(nrow(credit.data), nrow(credit.data) * 0.9)
    credit.train <- credit.data[subset, ]
    credit.test <- credit.data[-subset, ]

The training dataset has 63 variables and 4500 observations.

    colnames(credit.train)
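The 90/10 split can be made reproducible by fixing the random seed first. The sketch below is only an illustration under stated assumptions: `toy.data` is a simulated stand-in for `credit.data` (so it runs without downloading the CSV), and the seed value and the names `train.idx`, `toy.train` and `toy.test` are hypothetical.

```r
# Simulated stand-in for credit.data (assumption: NOT the real credit data)
set.seed(2014)                       # arbitrary seed, chosen only for illustration
toy.data <- data.frame(id = 1:1000,
                       Y  = rbinom(1000, 1, 0.1),
                       X2 = rnorm(1000))

# Draw 90% of the row numbers without replacement
n.train   <- floor(0.9 * nrow(toy.data))
train.idx <- sample(nrow(toy.data), n.train)

toy.train <- toy.data[train.idx, ]   # rows selected for training
toy.test  <- toy.data[-train.idx, ]  # all remaining rows

dim(toy.train)
dim(toy.test)
```

Because the test set is defined by negative indexing on the same index vector, no row can appear in both datasets.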

The call to `colnames(credit.train)` returns:

    ## [1] "id" "Y" "X2" "X3" "X4" "X5" "X6"
    ## [8] "X7" "X8" "X9" "X10_2" "X11_2" "X12_2" "X13_2"
    ## [15] "X14_2" "X15_2" "X15_3" "X15_4" "X15_5" "X15_6" "X16_2"
    ## [22] "X16_3" "X16_4" "X16_5" "X16_6" "X17_2" "X17_3" "X17_4"
    ## [29] "X17_5" "X17_6" "X18_2" "X18_3" "X18_4" "X18_5" "X18_6"
    ## [36] "X18_7" "X19_2" "X19_3" "X19_4" "X19_5" "X19_6" "X19_7"
    ## [43] "X19_8" "X19_9" "X19_10" "X20_2" "X20_3" "X20_4" "X21_2"
    ## [50] "X21_3" "X22_2" "X22_3" "X22_4" "X22_5" "X22_6" "X22_7"
    ## [57] "X22_8" "X22_9" "X22_10" "X22_11" "X23_2" "X23_3" "X24_2"

## Logistic Regression

Let's build a logistic regression model based on all X variables. Note that `id` is excluded from the model.

    credit.glm0 <- glm(Y ~ . - id, family = binomial, data = credit.train)

You can view the result of the estimation:

    summary(credit.glm0)

Note that there might be a warning message: "Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred". This happens because of a problem called quasi-complete separation. The offending variable is `X9`. We will ignore the warning for now. You can try fitting your model without `X9`.

The usual stepwise variable selection still works for logistic regression. Caution: this will take a very long time.

    credit.glm.step <- step(credit.glm0)

Or you can try model selection with BIC, which replaces the default penalty of 2 per parameter with log(n):

    credit.glm.step <- step(credit.glm0, k = log(nrow(credit.train)))
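To see what the `k` argument of `step()` changes, here is a minimal sketch on simulated data rather than the credit data. The data frame `toy`, its variables and the seed are assumptions made up for illustration; only `x1` actually influences the response.

```r
# Simulated data (assumption: NOT the credit data); only x1 drives y
set.seed(1)
n  <- 500
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)
y  <- rbinom(n, 1, plogis(-1 + 1.5 * x1))
toy <- data.frame(y, x1, x2, x3)

fit.full <- glm(y ~ ., family = binomial, data = toy)

# AIC() with penalty k = log(n) reproduces BIC()
all.equal(AIC(fit.full, k = log(n)), BIC(fit.full))

# Backward stepwise selection under each penalty (trace = 0 silences the log)
step.aic <- step(fit.full, trace = 0)               # default k = 2 (AIC)
step.bic <- step(fit.full, k = log(n), trace = 0)   # k = log(n) (BIC)

formula(step.aic)
formula(step.bic)
```

Since log(n) exceeds 2 once n is larger than about 7, the BIC penalty is the stricter one for any realistic sample size and tends to keep fewer variables.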
## Prediction and Cross Validation Using Logistic Regression

Now suppose there are two models we want to compare: one with all X variables (`credit.glm0`) and one with only X3, X8 and X11_2 (`credit.glm1`).

    credit.glm1 <- glm(Y ~ X3 + X8 + X11_2, family = binomial, data = credit.train)
    AIC(credit.glm0)
    ## [1] 1713
    AIC(credit.glm1)
    ## [1] 1891
    BIC(credit.glm0)
    ## [1] 2110
    BIC(credit.glm1)
    ## [1] 1916

The two criteria disagree here: AIC is lower for the full model, while BIC, which penalizes each extra parameter more heavily, is lower for the three-variable model.

### Understanding classification decision making using logistic regression

To get predictions from a logistic regression model, there are several steps you need to understand. Refer to the textbook/slides for the detailed math.
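The steps from a fitted model to a class label can be sketched as follows. This is an illustration on simulated data, not the credit data: the data frame `toy`, the seed, and the cut-off of 0.5 are all assumptions chosen for the example, and the cut-off in particular is a choice you would normally tune.

```r
# Simulated data (assumption: NOT the credit data)
set.seed(7)
n   <- 400
x   <- rnorm(n)
y   <- rbinom(n, 1, plogis(-0.5 + 2 * x))
toy <- data.frame(y, x)
fit <- glm(y ~ x, family = binomial, data = toy)

# Step 1: linear predictor, i.e. the estimated log-odds
eta <- predict(fit, type = "link")

# Step 2: the inverse logit turns log-odds into probabilities;
# predict(type = "response") does this in one call
prob <- predict(fit, type = "response")
all.equal(unname(prob), unname(plogis(eta)))

# Step 3: compare each probability with a cut-off to get a 0/1 prediction
cutoff     <- 0.5                    # assumed value, for illustration only
pred.class <- as.numeric(prob > cutoff)

# Step 4: cross-tabulate truth against prediction and compute the error rate
table(truth = toy$y, predicted = pred.class)
mean(pred.class != toy$y)
```

These predictions are in-sample. To judge out-of-sample performance, pass held-out data through the `newdata` argument of `predict()`.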
