ho_logistic - Newsom Data Analysis II Fall 2008 1 Logistic...

Newsom
Data Analysis II, Fall 2008

Logistic Regression

Overview: Logistic and OLS Regression Compared

Logistic regression is an approach to prediction, like Ordinary Least Squares (OLS) regression. However, with logistic regression, the researcher is predicting a dichotomous outcome. This situation poses problems for the OLS assumption that the errors (residuals) are normally distributed. Instead, they are more likely to follow a logistic distribution. When using the logistic distribution, we need to make an algebraic conversion to arrive at our usual linear regression equation (which we have written as Y = B₀ + B₁X + e).

With logistic regression, there is no standardized solution printed. To make things more complicated, the unstandardized solution does not have the same straightforward interpretation as it does with OLS regression.

One other difference between OLS and logistic regression is that there is no R² to gauge the variance accounted for in the overall model (at least not one that has been agreed upon by statisticians). Instead, a chi-square test is used to indicate how well the logistic regression model fits the data.

Probability that Y = 1

Because the dependent variable is not continuous, the goal of logistic regression is a bit different: we are predicting the likelihood that Y is equal to 1 (rather than 0) given certain values of X. That is, if X and Y have a positive linear relationship, the probability that a person will have a score of Y = 1 will increase as values of X increase. So, we are stuck with thinking about predicting probabilities rather than scores on the dependent variable.

For example, we might try to predict whether or not small businesses will succeed or fail based on the number of years of experience the owner has in the field prior to starting the business.
We presume that people who have been selling widgets for many years and then open their own widget business will be more likely to succeed. That means that as X (the number of years of experience) increases, the probability that Y will be equal to 1 (success in the new widget business) will tend to increase.

Take a hypothetical example in which 50 small businesses were studied and the owners' experience ranges from 0 to 20 years. We could represent the tendency for the probability that Y = 1 to increase with a graph. To illustrate this, it is convenient to break years of experience up into categories (i.e., 0-4, 5-8, 9-12, 13-16, 17-20), but logistic regression does not require this. If we compute the mean score on Y (averaging the 0s and 1s) for each category of years of experience, we will get something like:

Yrs Exp    Average    Probability that Y=1
0-4        .17        .17
5-8        .40        .40
9-12       .50        .50
13-16      .56        .56
17-20      .96        .96
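The averaging step above can be sketched in Python. This is only an illustration: the records below are made-up values, not the handout's 50 businesses (the preview does not give the raw data or the group sizes), and the helper names category and proportion_by_category are hypothetical. The point is that the mean of a 0/1 variable within a group is the proportion of 1s in that group.

```python
from collections import defaultdict

# Hypothetical (years_of_experience, outcome) records, where outcome is
# 1 = success and 0 = failure. Made up for illustration; NOT the handout's data.
records = [
    (1, 0), (3, 0), (4, 1),
    (6, 0), (7, 1),
    (10, 1), (11, 0),
    (14, 1), (15, 1), (16, 0),
    (18, 1), (20, 1),
]

# Category boundaries taken from the handout's example.
bins = [(0, 4), (5, 8), (9, 12), (13, 16), (17, 20)]


def category(years):
    """Return the label of the experience category that contains `years`."""
    for low, high in bins:
        if low <= years <= high:
            return f"{low}-{high}"
    raise ValueError(f"years of experience out of range: {years}")


def proportion_by_category(rows):
    """Average the 0/1 outcomes within each experience category."""
    groups = defaultdict(list)
    for years, outcome in rows:
        groups[category(years)].append(outcome)
    return {label: sum(ys) / len(ys) for label, ys in groups.items()}


props = proportion_by_category(records)
for label, p in props.items():
    print(f"{label:>5}  average of Y = {p:.2f}")
```

Because Y only takes the values 0 and 1, each average can be read directly as the estimated probability that Y = 1 for owners in that category, which is why the two columns of the table are identical.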
If we graph this, it looks like the following:

[Figure: average of Y (probability of success), on a scale from .0 to 1.0, plotted against years-of-experience category (X): 0-4, 5-8, 9-12, 13-16, 17-20]

Notice an S-shaped curve. This is typical when we are plotting the average (or expected) values of Y by different values of X
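The S-shaped pattern is the reason for the algebraic conversion mentioned in the overview. As a minimal sketch, assuming that conversion is the standard logit (log-odds) transformation, ln(p / (1 - p)), the Python below applies it to the category proportions from the table. The function names logit and inverse_logit are illustrative, and the preview does not show the handout's own formula.

```python
import math

# Proportions from the handout's table (average of Y in each experience category).
proportions = {
    "0-4": 0.17,
    "5-8": 0.40,
    "9-12": 0.50,
    "13-16": 0.56,
    "17-20": 0.96,
}


def logit(p):
    """Log-odds of a probability p, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(z):
    """Logistic function: maps any real number back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Convert each category's probability to the log-odds scale.
log_odds = {label: logit(p) for label, p in proportions.items()}

for label, p in proportions.items():
    print(f"{label:>5}  p = {p:.2f}  log-odds = {log_odds[label]:+.2f}")
```

On the log-odds scale a probability of .50 maps to 0, and the values are no longer confined between 0 and 1, which is what makes a linear equation in X usable. Applying inverse_logit to a predicted log-odds value returns it to the probability scale.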