Logistic Regression in Stata
* Here's a description of how to do logistic regression, as well as ordinal & multinomial logit
regression, in Stata. The examples use the UCLA-ATS data set hsb2. For logistic
regression, use the dummy variable hsci (science score>60) as the outcome variable; if necessary,
create it yourself. For ordinal logistic regression you’ll create the outcome variable, as
described later on. For the multinomial logistic example we'll use as the outcome variable.
We'll begin with logistic regression, then we’ll do brief examples of ordinal & multinomial logit.
Rick Tardanico, March 2008.
* See ‘Explanatory variables in logistic regression.doc’
* Open the data set & examine its univariate characteristics
u hsb2, clear
hist read, norm
gr box read
* The dependent variable is hsci (science achievement score>=60).
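* If hsci is not already in the data set, it can be created along the following lines. This is
only a sketch: it assumes the science score is stored in a variable named 'science' and uses the
>=60 cutoff stated just above (change the operator if you want the strict >60 version).

```stata
* Sketch: create hsci only if it does not already exist.
* Assumes the science score is stored in a variable named science.
capture confirm variable hsci
if _rc {
    * The 'if !missing()' guard keeps missing scores missing rather than coding them 1.
    generate byte hsci = (science >= 60) if !missing(science)
    label variable hsci "science score >= 60"
}
tabulate hsci, missing
```

The 'if !missing(science)' qualifier matters because Stata treats missing values as larger than
any number, so without it a missing score would be coded as 1.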
* Numerically examine the pertinent univariate, bivariate & (insofar as possible)
multivariate distributions, consider possible transformations or other manipulations,
& perhaps save a new data set consisting of listwise (i.e. ‘complete’) observations
(using the commands ‘mark’ and ‘markout’ [see Long/Freese]).
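* A minimal sketch of the 'mark' and 'markout' step follows. The marker name (nomiss), the
variable list, and the file name (hsb2_complete) are illustrative assumptions, not part of the
original notes; substitute the variables you actually plan to model.

```stata
* Sketch: flag and keep listwise-complete observations.
* nomiss, the variable list, and hsb2_complete are hypothetical names.
mark nomiss                                      // nomiss = 1 for every observation
markout nomiss hsci read write math socst female // set to 0 if any listed variable is missing
tabulate nomiss                                  // how many observations are complete?
keep if nomiss == 1                              // drops incomplete observations from memory
save hsb2_complete, replace
```

Note that 'keep' changes the data in memory, so run this only once you are sure which variables
the models will use.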
* Select the explanatory variables
* The procedure we'll use, as outlined in Hosmer & Lemeshow, selects independent variables
based, first, on substantive & theoretical relevance and, second, on p-values<=.25 (a selection
the procedure later modifies in view of more complete sets of variables and the modelling of
nonlinearities).
Begin by testing the potential explanatory variables with
the dependent variable in 'mini' logit & logistic models.
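* One way to run the 'mini' models in a batch is sketched below. The candidate list is an
illustrative assumption, and the loop is a convenience rather than Hosmer & Lemeshow's own
code. With a single predictor, the model likelihood-ratio test that logit stores in e(p) is the
test of that predictor.

```stata
* Sketch: univariable screening against the p <= .25 guideline.
* The candidate variable list is illustrative; hsci must already exist.
foreach v of varlist read write math socst female {
    quietly logit hsci `v'
    display "`v'" _col(12) "LR test p-value = " %6.4f e(p)
}
```

Treat the output as a first pass only: substantive & theoretical relevance comes first, and a
variable that misses the cutoff may still belong in the model.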
* Note: 'logit' yields the logit coefficients, while 'logistic' yields odds ratios. Specifying
the option 'or' after logit makes logit display odds ratios. The Stata manual emphasizes that
the only distinction between 'logit' & 'logistic' is logit coefficients versus odds ratios. The
conclusions reached by the two approaches are identical.
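* To see the equivalence for yourself, the following sketch fits the same model three ways
(it assumes hsci and math are in memory, as in the commands below). The odds ratio is simply
the exponentiated logit coefficient.

```stata
* Sketch: the same model reported as coefficients and as odds ratios.
logit hsci math, nolog        // logit coefficients
display exp(_b[math])         // odds ratio computed by hand from the coefficient
logit hsci math, or nolog     // same model, odds ratios displayed
logistic hsci math, nolog     // same odds ratios via logistic
logistic, coef                // redisplay the logistic fit as coefficients
```

The z statistics, p-values, and log likelihood are the same in every version; only the scale on
which the estimates are displayed changes.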
ci hsci, binomial
* [other interval options: agresti, jeffreys, wilson; in Stata 14 & later the equivalent
syntax is 'ci proportions hsci']
scatlog hsci math, ci
logit hsci math, or nolog
estimates store full
logit hsci, or