Chapter 9

F2011 STAT 410/510

Chapter 9 Building the regression model I: Model selection and validation

Surgical Unit Example: A hospital surgical unit was interested in predicting survival (Y) in patients undergoing a particular type of liver operation. A random selection of 108 patients was available for analysis. The predictor variables for the predictive regression model were:

X1: Blood clotting score
X2: Prognostic index
X3: Enzyme function test score
X4: Liver function test score
X5: Age in years
X6: Indicator variable for gender (0 = male; 1 = female)
X7 and X8: Indicator variables for history of alcohol use:

    Alcohol use   X7   X8
    None           0    0
    Moderate       1    0
    Severe         0    1

See Table 9.1 for potential predictor variables and response variable.

Because the researchers intended to validate the final model, the sample was split into a model-building set (first 54 patients) and a validation set (second 54 patients). For illustration, we use only the first four variables: Y ~ X1, X2, X3, X4

    proc reg data=surgical;
      /* regress survival on the first four predictors */
      model Y_Survival = X1_Blood_Clotting_Score X2_Prognostic_Index
                         X3_Enzyme_Test X4_Liver_Test;
      /* save residuals and predicted values in the data set results */
      output out=results r=residual p=phat;
      plot r.*p.;      /* residuals against predicted values */
      plot npp.*r.;    /* normal probability plot of the residuals */
    run;

The REG Procedure

Model: MODEL1
Dependent Variable: Y_Survival

Number of Observations Read   54
Number of Observations Used   54

Analysis of Variance

    Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model              4          5783681       1445920     27.40   <.0001
    Error             49          2585839         52772
    Corrected Total   53          8369521

    Root MSE         229.72207   R-Square   0.6910
    Dependent Mean   702.09259   Adj R-Sq   0.6658
    Coeff Var         32.71963

Parameter Estimates

    Variable                  Label                     DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
    Intercept                 Intercept                  1          -1279.24162        243.80838     -5.25     <.0001
    X1_Blood_Clotting_Score   X1_Blood Clotting Score    1             82.98825         26.40215      3.14     0.0028
    X2_Prognostic_Index       X2_Prognostic Index        1              8.34587          2.11971      3.94     0.0003
    X3_Enzyme_Test            X3_Enzyme Test             1             10.86961          1.92322      5.65     <.0001
    X4_Liver_Test             X4_Liver Test              1             49.34624         47.12592      1.05     0.3002

[Plot: residuals against predicted values for the fitted model Y_Survival = -1279.2 + 82.988 X1_Blood_Clotting_Score + 8.3459 X2_Prognostic_Index + 10.87 X3_Enzyme_Test + 49.346 X4_Liver_Test; N = 54, Rsq = 0.6910, AdjRsq = 0.6658, RMSE = 229.72]

The residual plot suggests both nonconstant error variance and curvature. That is, the variability of the residuals appears to increase as the magnitude of the predicted values increases. For residual and normal probability plots, see Figure 9.2. The normal probability plot suggests that the distribution of the residuals may not be normal.
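
As a check on the REG output, the summary statistics can be recomputed from the reported sums of squares. The worked arithmetic here assumes n = 54 observations and p = 5 estimated parameters (the intercept plus four slopes). The symbols SSR, SSE, SSTO, MSR, and MSE are notation introduced for this check and stand for the Model, Error, and Corrected Total sums of squares and the corresponding mean squares.

```latex
\begin{align*}
MSR &= \frac{SSR}{p-1} = \frac{5783681}{4} \approx 1445920 \\
MSE &= \frac{SSE}{n-p} = \frac{2585839}{49} \approx 52772 \\
F &= \frac{MSR}{MSE} = \frac{1445920}{52772} \approx 27.40 \\
\text{Root MSE} &= \sqrt{MSE} = \sqrt{52772} \approx 229.72 \\
R^2 &= 1 - \frac{SSE}{SSTO} = 1 - \frac{2585839}{8369521} \approx 0.6910 \\
R^2_{adj} &= 1 - \frac{n-1}{n-p} \cdot \frac{SSE}{SSTO}
           = 1 - \frac{53}{49} \cdot \frac{2585839}{8369521} \approx 0.6658 \\
\text{Coeff Var} &= 100 \cdot \frac{\text{Root MSE}}{\bar{Y}}
           = 100 \cdot \frac{229.72207}{702.09259} \approx 32.72
\end{align*}
```

Each value agrees with the printed output to the displayed precision.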
So, the investigator performed a logarithmic transformation of Y.

proc
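
A transformation of this kind might be set up as in the sketch below. This is an illustration and not the code from the notes: the data set name surgical_log and the variable name lnY are hypothetical, and the natural logarithm is an assumption, since the text says only "logarithmic". The sketch refits the same four-predictor model and repeats the two diagnostic plots so that the residual patterns can be compared with those of the untransformed fit.

```sas
/* Sketch only: surgical_log and lnY are hypothetical names, and the
   natural log is assumed because the notes say only "logarithmic". */
data surgical_log;
  set surgical;
  lnY = log(Y_Survival);   /* LOG() returns the natural logarithm in SAS */
run;

/* Refit the four-predictor model with the transformed response. */
proc reg data=surgical_log;
  model lnY = X1_Blood_Clotting_Score X2_Prognostic_Index
              X3_Enzyme_Test X4_Liver_Test;
  plot r.*p.;      /* residuals against predicted values */
  plot npp.*r.;    /* normal probability plot of the residuals */
run;
```

If a base-10 logarithm were intended instead, LOG10() would replace LOG(); the fitted coefficients would then differ by a constant factor, but the residual patterns would have the same shape.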
