We are 95% confident that the interval covers the true β
–
If the interval does not cover 0, we are 95% confident the
true value of β is not zero
“^” (hat) stands for
estimated parameter
^
^

Example
•
Staessen et al. (1992) investigated the relationship between
lead exposure and kidney function.
•
Background
:
–
Heavy lead exposure can lead to kidney damage
–
Kidney function decreases with age
–
People accumulate small amounts of lead as they get older
•
Researchers wanted to know whether accumulation of lead
could explain some of the age-related decrease in kidney
function.
•
Researchers collected data on lead exposure, creatinine
clearance (measure of kidney function), age, and other
covariates on 965 men
38
Here, age is a
confounder

Example (cont’d)
39
Variable
Meaning
Units
X
1
Log
10
(serum
lead)
Serum led was in
࠵?
g/L
X
2
Age
Years
X
3
Body mass index
kg/m
2
X
4
Log(GGT)
Serum GGT was in U/L
X
5
Diuretics
1 = yes (took diuretics)
0 = no (never took diuretics)
Y
Creatinine clearance
ml per minute
Model
: Y
i
= α + β
1
X
i1
+ β
2
X
i2
+ β
3
X
i3
+ β
4
X
i4
+ β
5
X
i5
+
࠵?
i
Used log(serum lead) rather than serum lead, because they expected the
effect of lead to be multiplicative, i.e., a doubling of lead concentration
would lead to an equal effect on creatinine clearance, no matter the
starting value.

Example (cont’d)
40
•
Question:
after adjusting for effects of the other variables,
is there a linear relationship between the logarithm of lead
concentration and creatinine clearance?
•
Result:
Best-fit value of β
1
was -9.5 ml/min (95% CI: -18.1
to -0.9 ml/min)
–
I.e., after accounting for differences in other variables, an
increase in log(lead) of 1 unit is associated with a decrease
in creatinine clearance of 9.5 ml/min (95% CI: -18.1 to -0.9
ml/min)
–
Note: 1 unit increase in log(lead) means 10-fold increase in
lead concentration (since log base 10 was used)
•
Since the 95% CI does not include 0, we know that P<0.05.

R
2
: How well does the model fit the
data?
41
•
R
2
(coefficient of multiple determination)
- measures the
proportion of variability in the response variable that is
explained by the model (i.e., set of variables).
•
In the example, R
2
is 0.27.
•
This means that 27% of variation in creatinine clearance is
explained by the model.
The remaining 73% is explained
by other variables (not included in the study) and random
variation
•
Note:
can calculate
partial R
2
for
each variable that
quantifies the proportion of variation in the response
explained by each predictor after accounting for other
predictors

Intuitive Biostatistics
Harvey Motulsky
Copyright © 2018 Oxford University Press
Figure 37.2. The meaning of R
2
in multiple regression.
R
2
: How well does the model fit the data?
Scatter plot of measured creatinine clearance vs creatinine clearance
predicted by the model.
R
2
for the predicted and actual values is
identical the R
2
from the model.

R
2
vs adjusted R
2
43
•
R
2
is commonly used as a measure of fit in multiple
regression models.

#### You've reached the end of your free preview.

Want to read all 56 pages?

- Fall '19
- Regression Analysis, Harvey Motulsky