Logistic regression
is useful for situations in which you want to be able to predict the presence or absence of a characteristic or
outcome based on values of a set of predictor variables. It is similar to a linear regression model but is suited to models where
the dependent variable is dichotomous. Logistic regression coefficients can be used to estimate odds ratios for each of the
independent variables in the model. Logistic regression is applicable to a broader range of research situations than discriminant
analysis.
Example.
What lifestyle characteristics are risk factors for coronary heart disease (CHD)? Given a sample of patients measured
on smoking status, diet, exercise, alcohol use, and CHD status, you could build a model using the four lifestyle variables to
predict the presence or absence of CHD in a sample of patients. The model can then be used to derive estimates of the odds
ratios for each factor to tell you, for example, how much more likely smokers are to develop CHD than nonsmokers.
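The odds-ratio interpretation above can be illustrated with a minimal pure-Python sketch. This is not the SPSS procedure itself: it fits a one-predictor logistic model by plain gradient descent on toy, made-up smoker/CHD data, then exponentiates the coefficient to get the estimated odds ratio. The data and function names are illustrative assumptions, not output from any real study.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=1.0, epochs=20000):
    """Fit logistic regression by simple gradient descent (a sketch,
    not the iteratively reweighted least squares SPSS uses).
    X: list of feature lists; y: list of 0/1 outcomes.
    Returns [intercept, b1, ..., bk]."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)                 # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = yi - p                # gradient of the log-likelihood
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w

# Toy data: predictor is smoker (1) vs nonsmoker (0), outcome is CHD yes/no.
# 4 of 5 smokers have CHD (odds 4), 1 of 5 nonsmokers (odds 1/4).
X = [[1], [1], [1], [1], [0], [0], [0], [0], [1], [0]]
y = [ 1,   1,   1,   0,   0,   0,   1,   0,   1,   0]

w = fit_logistic(X, y)
odds_ratio = math.exp(w[1])   # exp(coefficient) estimates the odds ratio
```

Because the single binary predictor saturates the two cells, the fitted odds ratio reproduces the empirical one, (4/1) / (1/4) = 16: on these toy numbers, smokers' odds of CHD are about 16 times nonsmokers'.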
Data.
The dependent variable should be dichotomous. Independent variables can be interval level or categorical; if categorical,
they should be dummy or indicator coded (there is an option in the procedure to recode categorical variables automatically).
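Dummy (indicator) coding can be sketched in a few lines of plain Python. The helper below is hypothetical, not the procedure's automatic recoding; it expands a categorical variable into 0/1 columns, dropping a caller-chosen reference category so the columns are not linearly dependent with the intercept.

```python
def dummy_code(values, reference):
    """Expand a categorical variable into 0/1 indicator columns.
    The reference category gets all zeros and is omitted as a column,
    so each coefficient compares a category against the reference."""
    levels = [v for v in sorted(set(values)) if v != reference]
    return [[1 if v == lev else 0 for lev in levels] for v in values]

diet = ["poor", "average", "good", "poor", "good"]
coded = dummy_code(diet, reference="average")
# Columns correspond to the non-reference levels in sorted order:
# ["good", "poor"]; an "average" case codes as [0, 0].
```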
Assumptions.
Logistic regression does not rely on distributional assumptions in the same sense that discriminant analysis
does. However, your solution may be more stable if your predictors have a multivariate normal distribution. Additionally, as
with other forms of regression, multicollinearity among the predictors can lead to biased estimates and inflated standard errors.
The procedure is most effective when group membership is a truly categorical variable; if group membership is based on values
of a continuous variable (for example, “high IQ” versus “low IQ”), you should consider using linear regression to take
advantage of the richer information offered by the continuous variable itself.
Logistic Regression Variable Selection Methods
Method selection allows you to specify how independent variables are entered into the analysis. Using different methods, you
can construct a variety of regression models from the same set of variables.
Enter.
A procedure for variable selection in which all variables in a block are entered in a single step.
Forward Selection (Conditional).
Stepwise selection method with entry testing based on the significance of the score
statistic, and removal testing based on the probability of a likelihood-ratio statistic based on conditional parameter estimates.
Forward Selection (Likelihood Ratio).
Stepwise selection method with entry testing based on the significance of the score
statistic, and removal testing based on the probability of a likelihood-ratio statistic based on the maximum partial likelihood
estimates.
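The entry step that the forward methods share can be sketched as a greedy loop. This is a simplified illustration, not SPSS's implementation: `score` is a hypothetical stand-in for the score-statistic test (returning a p-value for adding a variable to the current model), the entry threshold is an assumption, and the sketch omits the removal testing that distinguishes the Conditional and Likelihood Ratio variants.

```python
def forward_select(candidates, score, p_enter=0.05):
    """Greedy forward selection sketch.
    `score(model, var)` is assumed to return the p-value for adding
    `var` to the current `model`; smaller means stronger evidence.
    Variables enter one at a time until none passes the threshold."""
    model = []
    remaining = list(candidates)
    while remaining:
        best = min(remaining, key=lambda v: score(model, v))
        if score(model, best) >= p_enter:
            break                      # no remaining variable qualifies
        model.append(best)
        remaining.remove(best)
    return model

# Toy p-values for illustration only (invented, not from real data):
fake_p = {"smoking": 0.001, "diet": 0.03, "exercise": 0.20, "alcohol": 0.40}
chosen = forward_select(fake_p, lambda model, v: fake_p[v])
# smoking and diet enter; exercise (p = 0.20) fails the 0.05 threshold.
```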