class_10_17

# class_10_17 - Statistical Data Mining ORIE 474 Fall 2007...

Statistical Data Mining ORIE 474 Fall 2007 Tatiyana Apanasovich 10/17/07 Logistic Regression

Why use logistic regression?

There are many important research topics for which the dependent variable is "limited." Binary logistic regression is a type of regression analysis where the dependent variable is a dummy variable, coded 0 (did not vote) or 1 (did vote).
The Linear Probability Model

In the OLS regression: Y = a + b*X + e, where Y takes only the values 0 and 1

- The error terms are heteroskedastic
- e is not normally distributed because Y takes on only two values
- The predicted probabilities can be greater than 1 or less than 0
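The last problem is easy to reproduce. The following is a minimal sketch, assuming a made-up data set of ten observations (not the lecture's data) and a hypothetical helper `ols_fit` that applies the usual closed-form least-squares formulas for one regressor.

```python
# Minimal sketch of the linear probability model on made-up data.
# The data and the helper name ols_fit are illustrative assumptions,
# not part of the lecture.

def ols_fit(xs, ys):
    """Return (a, b) minimizing the squared error of y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b

xs = list(range(10))                 # X = 0, 1, ..., 9
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # binary outcome Y

a, b = ols_fit(xs, ys)
fitted = [a + b * x for x in xs]

# The fitted "probabilities" at the extremes leave the [0, 1] interval.
print("smallest fitted value:", min(fitted))
print("largest fitted value: ", max(fitted))
```

On this toy data the fitted line dips below 0 at the smallest X and rises above 1 at the largest X, which is the behavior the slide warns about.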

An Example: Hurricane Evacuations

Q: EVAC. Did you evacuate your home to go someplace safer before Hurricane Dennis (Floyd) hit? 1 = YES, 2 = NO
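The Data

| EVAC | PETS | MOBLHOME | TENURE | EDUC |
|------|------|----------|--------|------|
| 0 | 1 | 0 | 16 | 16 |
| 0 | 1 | 0 | 26 | 12 |
| 0 | 1 | 1 | 11 | 13 |
| 1 | 1 | 1 | 1 | 10 |
| 1 | 0 | 0 | 5 | 12 |
| 0 | 0 | 0 | 34 | 12 |
| 0 | 0 | 0 | 3 | 14 |
| 0 | 1 | 0 | 3 | 16 |
| 0 | 1 | 0 | 10 | 12 |
| 0 | 0 | 0 | 2 | 18 |
| 0 | 0 | 0 | 2 | 12 |
| 0 | 1 | 0 | 25 | 16 |
| 1 | 1 | 1 | 20 | 12 |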

LS Results

Dependent Variable: EVAC

| Variable | B | t-value |
|----------|---|---------|
| (Constant) | 0.190 | 2.121 |
| PETS | -0.137 | -5.296 |
| MOBLHOME | 0.337 | 8.963 |
| TENURE | -0.003 | -2.973 |
| EDUC | 0.003 | 0.424 |
| FLOYD | 0.198 | 8.147 |

R² = 0.145, F-stat = 36.010
Problems: Predicted Values outside the 0,1 range

Descriptive Statistics

| | N | Minimum | Maximum | Mean | Std. Deviation |
|---|---|---------|---------|------|----------------|
| Unstandardized Predicted Value | 1070 | -.08498 | .76027 | .2429907 | .1632 |
| Valid N (listwise) | 1070 | | | | |
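To see how a fitted value can fall below 0, the sketch below plugs one respondent into the least-squares coefficients reported in the LS results. The respondent is hypothetical, the helper name `lpm_predict` is an illustration, and treating FLOYD as a 0/1 indicator is an assumption, since the slides do not define it.

```python
# B coefficients copied from the LS results for dependent variable EVAC.
COEF = {
    "PETS": -0.137,
    "MOBLHOME": 0.337,
    "TENURE": -0.003,
    "EDUC": 0.003,
    "FLOYD": 0.198,   # assumed here to be a 0/1 indicator
}
CONSTANT = 0.190

def lpm_predict(respondent):
    """Linear probability model prediction: constant + sum of B * value."""
    return CONSTANT + sum(COEF[name] * value for name, value in respondent.items())

# Hypothetical respondent (not a row from the data): pet owner, not in a
# mobile home, TENURE of 60, EDUC of 12, FLOYD indicator set to 0.
respondent = {"PETS": 1, "MOBLHOME": 0, "TENURE": 60, "EDUC": 12, "FLOYD": 0}

p_hat = lpm_predict(respondent)
print("predicted 'probability' of evacuating:", round(p_hat, 3))
```

The arithmetic is 0.190 - 0.137 - 0.003*60 + 0.003*12 = -0.091, a negative "probability", which is the kind of value the minimum in the descriptive statistics reflects.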

Heteroskedasticity

[Figure: Unstandardized Residual (vertical axis) plotted against TENURE (horizontal axis)]
The Logistic Regression Model

The "logit" model solves these problems:

ln[p/(1-p)] = α + βX

- p is the probability that the event Y occurs, p(Y=1)
- p/(1-p) is the odds of the event (loosely called the "odds ratio")
- ln[p/(1-p)] is the log odds, or "logit"
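The quantities p, p/(1-p), and ln[p/(1-p)] can be tabulated for a few probabilities. This is a small sketch using only Python's standard library; the function names `odds` and `logit` are chosen here for illustration.

```python
import math

def odds(p):
    """Odds of the event: p / (1 - p), for 0 < p < 1."""
    return p / (1.0 - p)

def logit(p):
    """Log odds: ln[p / (1 - p)]."""
    return math.log(odds(p))

# Probabilities below .5 give negative logits, .5 gives 0,
# and probabilities above .5 give positive logits.
for p in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(f"p={p:.2f}  odds={odds(p):.3f}  logit={logit(p):+.3f}")
```

While p is confined to the interval (0, 1), the logit ranges over the whole real line, which is why it can be set equal to the unbounded linear expression α + βX.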

More:

The logistic distribution constrains the estimated probabilities to lie between 0 and 1. The estimated probability is:

p = 1/[1 + exp(-α - βX)]

- If α + βX = 0, then p = .50
- As α + βX gets very large, p approaches 1
- As α + βX gets very negative, p approaches 0
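These three properties can be checked numerically. In the sketch below, `z` stands for the linear predictor α + βX, and the function name `logistic` is an illustrative choice rather than anything from the lecture.

```python
import math

def logistic(z):
    """Estimated probability p = 1 / (1 + exp(-z)), where z = alpha + beta*X."""
    return 1.0 / (1.0 + math.exp(-z))

# z = 0 gives p = .5; large positive z pushes p toward 1,
# large negative z pushes p toward 0, and p never leaves (0, 1).
for z in (-10, -2, 0, 2, 10):
    print(f"z={z:+d}  p={logistic(z):.5f}")

# Round trip: taking the log odds of p recovers z.
p = logistic(2)
print("log odds of logistic(2):", math.log(p / (1 - p)))
```

The round trip at the end shows that this formula is the inverse of the logit, so the two ways of writing the model describe the same relationship.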

Maximum Likelihood Estimation (MLE)

MLE is a statistical method for estimating the coefficients of a model. The likelihood function (L) measures the probability of ...

## This note was uploaded on 12/23/2009 for the course ORIE 474 at Cornell.
