l81 - Logistic Regression, Part I: Problems with the Linear...

Logistic Regression I: Problems with the LPM, Page 1

Logistic Regression, Part I: Problems with the Linear Probability Model (LPM)

[This handout steals heavily from Linear probability, logit, and probit models, by John Aldrich and Forrest Nelson, paper # 45 in the Sage series on Quantitative Applications in the Social Sciences.]

INTRODUCTION. We are often interested in qualitative dependent variables:

- Voting (does or does not vote)
- Marital status (married or not)
- Fertility (have children or not)
- Immigration attitudes (opposes immigration or supports it)

In the next few handouts, we will examine different techniques for analyzing qualitative dependent variables; in particular, dichotomous dependent variables. We will first examine the problems with using OLS, and then present logistic regression as a more desirable alternative.

OLS AND DICHOTOMOUS DEPENDENT VARIABLES. While estimates derived from regression analysis may be robust against violations of some assumptions, other assumptions are crucial, and violations of them can lead to unreasonable estimates. Such is often the case when the dependent variable is a qualitative measure rather than a continuous, interval measure. If OLS regression is done with a qualitative dependent variable:

- it may seriously misestimate the magnitude of the effects of IVs
- all of the standard statistical inferences (e.g. hypothesis tests, construction of confidence intervals) are unjustified
- regression estimates will be highly sensitive to the range of particular values observed (thus making extrapolations or forecasts beyond the range of the data especially unjustified)

OLS REGRESSION AND THE LINEAR PROBABILITY MODEL (LPM). The regression model places no restrictions on the values that the independent variables take on. They may be continuous, interval level (net worth of a company), they may be only positive or zero (percent of vote a party received), or they may be dichotomous (dummy) variables (1 = male, 0 = female).
The dependent variable, however, is assumed to be continuous. Because there are no restrictions on the IVs, the DVs must be free to range in value from negative infinity to positive infinity. In practice, only a small range of Y values will be observed. Since it is also the case that only a small range of X values will be observed, the assumption of continuous, interval measurement is usually not problematic. That is, even though regression assumes that Y can range from negative infinity to positive infinity, it usually won't be too much of a disaster if, say, it really only ranges from 1 to 17.

However, it does become a problem when Y can only take on 2 values, say, 0 and 1. If Y can only equal 0 or 1, then

E(Yi) = 1 * P(Yi = 1) + 0 * P(Yi = 0) = P(Yi = 1).
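The identity E(Yi) = P(Yi = 1) can be checked numerically. The following is a minimal sketch rather than anything taken from the handout: the data are a made-up toy example, and the OLS fit uses NumPy's `np.polyfit` as a convenient stand-in for a regression routine. It shows that the mean of a 0/1 variable is the share of 1s, and that a straight line fitted to such a variable is not confined to the 0 to 1 range.

```python
import numpy as np

# Hypothetical toy data (an assumption for illustration, not from the handout):
# x is an interval-level IV, y is a dichotomous DV coded 0/1.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=float)
y = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1], dtype=float)

# For a 0/1 variable the mean equals the share of 1s: E(Y) = P(Y = 1).
mean_y = y.mean()
share_of_ones = np.count_nonzero(y == 1) / y.size

# OLS fit of y on x, i.e. a linear probability model: y_hat = a + b * x.
# np.polyfit returns coefficients from highest degree to lowest.
b, a = np.polyfit(x, y, deg=1)
y_hat = a + b * x

# Flag fitted "probabilities" that fall below 0 or above 1.
out_of_range = (y_hat < 0) | (y_hat > 1)

print("mean of y:", mean_y)
print("share of ones:", share_of_ones)
print("fitted values:", np.round(y_hat, 3))
print("any fitted value outside [0, 1]:", bool(out_of_range.any()))
```

With this particular toy data, the fitted values at the smallest and largest x fall below 0 and above 1 respectively, which cannot be read as probabilities.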

This note was uploaded on 02/29/2012 for the course SOC 63993 taught by Professor Richard Williams during the Spring '11 term at Notre Dame.

