EPI204 – Lab Session 1

The following SAS commands invoke the REG procedure and fit model (1) to the data:

    proc reg;
      model y = x;
    run;

where y is the response variable and x is the regressor variable. To fit a model to the data, you must specify the MODEL statement.

3.1 Using proc reg

Create a simple linear regression model for FVC among children 5-18 years old. Start by including only age as a covariate.

    proc reg data = restricted;
      model fvc = ageyrs;   /* FVC regressed on age only */
      title "Linear regression model";
    run;

Interpret the beta coefficient for age.

3.2 Multivariate regression

Now add more covariates to the model. Interpret the beta coefficient for sex.

    proc reg data = restricted;
      model fvc = ageyrs height sex wt;
      title "Linear regression model with covariates";
    run;

4. General Linear Models: The GLM procedure (PROC GLM)

The GLM procedure can also be used to fit linear regression models. Some notable coding differences between the two procedures that may affect the choice of one over the other are the following:

4.1 Parameter estimates output

PROC REG: By default, it outputs the parameter estimates, their standard errors, the test statistics, and the p-values.

PROC GLM: When the model includes categorical variables and the CLASS statement is used, the SOLUTION option must be specified in the MODEL statement to print the parameter estimates.

4.2 Categorical variables as independent variables

PROC REG: Indicator variables for each level of the categorical variable must be created.

PROC GLM: It can accommodate categorical variables as long as these variables are included in the CLASS statement. If the CLASS statement is not specified, SAS will treat the categorical variable as a continuous variable with values equal to the levels of the categorical variable.

4.3 Using proc reg and proc glm

Let's check whether race (as indicator variables) is an important predictor of FVC using proc reg.

    proc reg data = restricted;
      model fvc = ageyrs height sex wt black other;   /* white is the omitted level */
      title "REG: Linear regression model";
    run;

Let's check whether race (as a CLASS variable) is an important predictor of FVC using proc glm.

    proc glm data = restricted;
      class race;
      model fvc = ageyrs height sex wt race / solution;
      title "GLM: Linear regression model";
    run;

Are your coefficients the same? Why or why not?

                      Proc Reg Betas    Proc GLM Betas
    Intercept             -389.5            -401.5
    Ageyrs                   4.7               4.7
    Height                   3.5               3.5
    Sex                     32.7              32.7
    Wt                       1.7               1.7
    White/Race=1              NA              12.0
    Black/Race=2           -46.7             -34.7
    Other/Race=3           -12.0                NA

NA marks the reference category in each model: white in proc reg (no indicator variable was included for it) and other in proc glm.

SAS treats the highest category as the reference category for comparisons between levels of a categorical variable when you use a CLASS statement. Why? With a CLASS variable, the design matrix in proc glm is less than full rank, and SAS resolves this by setting the estimate for the last level in the sort order to zero. To change the reference category for race to "white" instead of "other", we need to re-order the data so that the white category appears last and use the order=data option in glm. Note that the coefficients for age, height, sex, and weight are the same in both models.
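The re-ordering step described above could be sketched as follows. This is only an illustration, under the assumption that race is coded 1 = white, 2 = black, 3 = other (as the table suggests); the output dataset name restricted_sorted is hypothetical and not part of the lab materials.

```sas
/* Sketch only: make white (race = 1) the reference level in PROC GLM.  */
/* Assumes race is coded 1 = white, 2 = black, 3 = other.               */
/* restricted_sorted is a hypothetical dataset name.                    */
proc sort data = restricted out = restricted_sorted;
  by descending race;      /* race = 1 (white) now appears last */
run;

/* ORDER=DATA orders CLASS levels by their first appearance in the data, */
/* so the last level encountered (white) becomes the reference.          */
proc glm data = restricted_sorted order = data;
  class race;
  model fvc = ageyrs height sex wt race / solution;
  title "GLM: white as the reference category";
run;
quit;
```

If both procedures use the same observations, the intercept and race estimates from this run should match the Proc Reg column of the table, since both models would then use white as the reference category.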