For this half of the assignment, we will be using alcohol consumption (i.e., the “ALC”
variable) as the outcome instead of the predictor. We are specifically interested in the effect
of age on alcohol consumption. Use the same dataset as in Part I.
9. Based on your categorical “ALC” variable, create a dichotomous variable
“ALC_BIN” for light drinkers vs. moderate/heavy drinkers (ALC=0 vs. ALC=1, 2).
How many people are in each category of this new variable?
ALC_BIN    Number
0          4052
1          5144
Show your SAS code.
See Appendix.
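For readers without the Appendix, the recoding step can be sketched outside SAS. The following is an illustrative Python sketch, not the SAS code submitted for the assignment; the function name `dichotomize_alc` and the toy list `alc_values` are hypothetical and do not come from the assignment dataset.

```python
from collections import Counter


def dichotomize_alc(alc):
    """Map the three-level ALC code to ALC_BIN: 0 = light, 1 = moderate/heavy."""
    if alc not in (0, 1, 2):
        raise ValueError(f"unexpected ALC code: {alc!r}")
    return 0 if alc == 0 else 1


# Hypothetical toy data, NOT the assignment dataset.
alc_values = [0, 1, 2, 0, 2, 1, 0]

# Frequency table of the new variable, analogous to the table in Question 9.
alc_bin_counts = Counter(dichotomize_alc(a) for a in alc_values)
for level, count in sorted(alc_bin_counts.items()):
    print(level, count)
```

Applied to the real data, the two counts should sum to the sample size (4052 + 5144 = 9196 in the table above).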
10. Center age at 60 years old. Then run a logistic regression model for the dependent
variable “ALC_BIN” (light drinkers vs. moderate/heavy drinkers), adjusting for age
(continuous), male, and race. Use the first category as the reference group for all
categorical predictors.
Determine whether there is evidence to suggest that the effect of age
on dichotomous alcohol consumption is modified by sex.
No. The test of the null hypothesis β = 0 for the interaction term between age and sex yields
p = 0.118, so there is insufficient evidence to suggest that the effect of age on dichotomous alcohol
consumption is modified by sex.
Write down the model with the interaction term between sex and centered age, and
interpret the odds ratio of sex in the model.
logit(p) = β₀ + β₁(age - 60) + β₂(male) + β₃(black) + β₄(other) + β₅(age - 60)(male),
where p = probability of being a moderate/heavy drinker.
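The odds ratio for male = e^0.8269 = 2.286. Because the model includes the interaction between
male and centered age, this odds ratio applies at age 60 (where age - 60 = 0): holding race
constant, 60-year-old males have 2.286 times the odds of being a moderate/heavy drinker compared
to 60-year-old females.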
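The arithmetic behind this odds ratio, and the way the interaction term makes it depend on age, can be checked with a short Python sketch. The coefficient 0.8269 is the reported estimate for male; the interaction coefficient is not reported in this answer, so `BETA_INTERACTION` below is a hypothetical placeholder used only to show the mechanics.

```python
import math


def male_odds_ratio(age, beta_male, beta_interaction, center=60.0):
    """Male-vs-female odds ratio at a given age when the model contains
    a male main effect and a (age - center) * male interaction."""
    return math.exp(beta_male + beta_interaction * (age - center))


BETA_MALE = 0.8269          # reported estimate for male
BETA_INTERACTION = -0.005   # HYPOTHETICAL value; the real estimate is not reported

# At age 60 the interaction term vanishes, so the odds ratio is exp(0.8269).
or_at_60 = male_odds_ratio(60, BETA_MALE, BETA_INTERACTION)
# At any other age the odds ratio is shifted by the interaction term.
or_at_70 = male_odds_ratio(70, BETA_MALE, BETA_INTERACTION)

print(round(or_at_60, 3))
print(round(or_at_70, 3))
```

With age centered at 60, the exponentiated male coefficient can be read directly as the male-vs-female odds ratio for 60-year-olds, which is the practical reason for centering.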
11. Run a cumulative logistic regression for alcohol category (low vs. medium vs. high
drinkers) including the same variables as in the model of Question 10. Write down both
models for clogit(ALC = j), and interpret only the coefficients that differ by level of the
categorical ALC outcome.
clogit[Pr(ALC = 1)] = logit[Pr(ALC ≥ 1)] = β₀ + β₁(age - 60) + β₂(male) + β₃(black) + β₄(other) + β₅(age - 60)(male)
clogit[Pr(ALC = 2)] = logit[Pr(ALC ≥ 2)] = β₆ + β₁(age - 60) + β₂(male) + β₃(black) + β₄(other) + β₅(age - 60)(male)
Only the intercepts differ. Here, β₀ = -0.2998, which means that for a 60-year-old white female
the log odds of being a moderate or heavy drinker, compared to being a light drinker, are -0.2998.
β₆ = -1.3987, which means that for a 60-year-old white female the log odds of being a heavy
drinker, compared to being a moderate or light drinker, are -1.3987.
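To make the two intercepts more concrete, they can be converted from log odds to probabilities for the reference person (a 60-year-old white female, for whom every covariate term is zero). This is a minimal Python sketch that assumes, as the interpretation above implies, that the model is parameterized on Pr(ALC ≥ j); the helper names `expit` and `category_probabilities` are illustrative.

```python
import math


def expit(x):
    """Inverse logit: convert log odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def category_probabilities(intercepts, linear_predictor=0.0):
    """Category probabilities from a cumulative logit model.

    intercepts: intercepts for Pr(ALC >= 1), Pr(ALC >= 2), ... in that order.
    linear_predictor: the covariate part shared by all cumulative logits
                      (0 for the reference person).
    """
    cumulative = [expit(a + linear_predictor) for a in intercepts]
    upper = [1.0] + cumulative   # Pr(ALC >= 0) = 1, then Pr(ALC >= 1), ...
    lower = cumulative + [0.0]   # Pr(ALC >= 1), ..., then 0 past the top category
    return [u - l for u, l in zip(upper, lower)]


# Intercepts reported in the answer above.
probs = category_probabilities([-0.2998, -1.3987])
for label, p in zip(["light", "moderate", "heavy"], probs):
    print(label, round(p, 3))
```

Because the slopes are shared across the two cumulative logits, passing a nonzero `linear_predictor` shifts both cumulative probabilities by the same amount on the log odds scale, which is the proportional odds assumption.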
12. From the statistical perspective, decide whether to use the model in Question 10 with
the dichotomous “ALC_BIN” variable or to use the model in Question 11 with the
three-category variable. Justify the use of the statistical measure that you choose.
AIC for the model with the dichotomous (thresholded) outcome ALC_BIN = 11908.001.
AIC for the model with the three-category alcohol outcome = 18795.260.
