We are 95% confident that the interval covers the true β – If the interval does not cover 0, we are 95% confident the true value of β is not zero “^” (hat) stands for estimated parameter ^ ^
Example • Staessen et al. (1992) investigated the relationship between lead exposure and kidney function. • Background : – Heavy lead exposure can lead to kidney damage – Kidney function decreases with age – People accumulate small amounts of lead as they get older • Researchers wanted to know whether accumulation of lead could explain some of the age-related decrease in kidney function. • Researchers collected data on lead exposure, creatinine clearance (measure of kidney function), age, and other covariates on 965 men 38 Here, age is a confounder
Example (cont’d) 39 Variable Meaning Units X 1 Log 10 (serum lead) Serum led was in ࠵? g/L X 2 Age Years X 3 Body mass index kg/m 2 X 4 Log(GGT) Serum GGT was in U/L X 5 Diuretics 1 = yes (took diuretics) 0 = no (never took diuretics) Y Creatinine clearance ml per minute Model : Y i = α + β 1 X i1 + β 2 X i2 + β 3 X i3 + β 4 X i4 + β 5 X i5 + ࠵? i Used log(serum lead) rather than serum lead, because they expected the effect of lead to be multiplicative, i.e., a doubling of lead concentration would lead to an equal effect on creatinine clearance, no matter the starting value.
Example (cont’d) 40 • Question: after adjusting for effects of the other variables, is there a linear relationship between the logarithm of lead concentration and creatinine clearance? • Result: Best-fit value of β 1 was -9.5 ml/min (95% CI: -18.1 to -0.9 ml/min) – I.e., after accounting for differences in other variables, an increase in log(lead) of 1 unit is associated with a decrease in creatinine clearance of 9.5 ml/min (95% CI: -18.1 to -0.9 ml/min) – Note: 1 unit increase in log(lead) means 10-fold increase in lead concentration (since log base 10 was used) • Since the 95% CI does not include 0, we know that P<0.05.
R 2 : How well does the model fit the data? 41 • R 2 (coefficient of multiple determination) - measures the proportion of variability in the response variable that is explained by the model (i.e., set of variables). • In the example, R 2 is 0.27. • This means that 27% of variation in creatinine clearance is explained by the model. The remaining 73% is explained by other variables (not included in the study) and random variation • Note: can calculate partial R 2 for each variable that quantifies the proportion of variation in the response explained by each predictor after accounting for other predictors
Intuitive Biostatistics Harvey Motulsky Copyright © 2018 Oxford University Press Figure 37.2. The meaning of R 2 in multiple regression. R 2 : How well does the model fit the data? Scatter plot of measured creatinine clearance vs creatinine clearance predicted by the model. R 2 for the predicted and actual values is identical the R 2 from the model.
R 2 vs adjusted R 2 43 • R 2 is commonly used as a measure of fit in multiple regression models.
You've reached the end of your free preview.
Want to read all 56 pages?
- Fall '19
- Regression Analysis, Harvey Motulsky