# notes12 - Sections 7.1, 7.2, 7.4, & 7.6 Timothy Hanson...


Sections 7.1, 7.2, 7.4, & 7.6

Timothy Hanson

Department of Statistics, University of South Carolina

Stat 704: Data Analysis I

## Chapter 7 example: Body fat

n = 20 healthy females 25–34 years old.

- $x_1$ = triceps skinfold thickness (mm)
- $x_2$ = thigh circumference (cm)
- $x_3$ = midarm circumference (cm)
- $Y$ = body fat (%)

Obtaining $Y_i$, the percent of the body that is purely fat, requires immersing a person in water. We want to develop a model based on simple body measurements that avoids people getting wet.

## SAS code

    *******************************
    * Body fat data from Chapter 7
    *******************************;
    data body;
    input triceps thigh midarm bodyfat @@;
    cards;
    19.5 43.1 29.1 11.9  24.7 49.8 28.2 22.8  30.7 51.9 37.0 18.7  29.8 54.3 31.1 20.1
    19.1 42.2 30.9 12.9  25.6 53.9 23.7 21.7  31.4 58.5 27.6 27.1  27.9 52.1 30.6 25.4
    22.1 49.9 23.2 21.3  25.5 53.5 24.8 19.3  31.1 56.6 30.0 25.4  30.4 56.7 28.3 27.2
    18.7 46.5 23.0 11.7  19.7 44.2 28.6 17.8  14.6 42.7 21.3 12.8  29.5 54.4 30.1 23.9
    27.7 55.3 25.7 22.6  30.2 58.6 24.6 25.4  22.7 48.2 27.1 14.8  25.2 51.0 27.5 21.1
    ;
    proc sgscatter data=body; matrix bodyfat triceps thigh midarm; run;

## Scatterplot

[Figure: scatterplot matrix of bodyfat, triceps, thigh, and midarm.]

## Correlation coefficients

    proc corr data=body; var triceps thigh midarm; run;

    Pearson Correlation Coefficients, N = 20
    Prob > |r| under H0: Rho=0

                triceps      thigh     midarm
    triceps     1.00000    0.92384    0.45778
                            <.0001     0.0424
    thigh       0.92384    1.00000    0.08467
                 <.0001                0.7227
    midarm      0.45778    0.08467    1.00000
                 0.0424     0.7227

There is high correlation among the predictors. For example, r = 0.92 for triceps and thigh. These two variables are essentially carrying the same information. Maybe only one or the other is really needed.

In general, one predictor may be essentially perfectly predicted by the remaining predictors (a high “partial correlation”), and so would be unnecessary if the other predictors are in the model.
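One way to check whether a predictor is nearly determined by the others is to regress it on them. The following is a minimal sketch, not part of the original slides, and it assumes the `body` data set from the data step above has already been created. An R-square close to 1 in the output would suggest that triceps carries little information beyond thigh and midarm.

```sas
/* Sketch only: assumes the data set body from the data step above. */
proc reg data=body;
  /* Regress one predictor on the remaining two predictors.       */
  /* The R-square of this fit measures how well triceps is        */
  /* predicted by thigh and midarm together.                      */
  model triceps = thigh midarm;
run;
quit;
```

The same check can be repeated with thigh or midarm as the response to see which predictors are most redundant.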
## 7.1 Extra sums of squares

“Extra” sums of squares are defined as the difference in SSE between a model with some predictors and a larger model that adds additional predictors.

Fact: As predictors are added, the SSE can only decrease (it can never increase). The extra sum of squares is how much the SSE decreases.

Definition: Let $x_1, x_2, \ldots, x_k$ be predictors in a model. Then

$$SSR(x_{j+1}, \ldots, x_k \mid x_1, \ldots, x_j) = SSE(x_1, \ldots, x_j) - SSE(x_1, \ldots, x_j, x_{j+1}, \ldots, x_k),$$

the difference in the sums of squared errors between the reduced model and the full model. This is how much of the total variation SSTO is further explained by adding the new predictors $x_{j+1}, \ldots, x_k$ to a model that already contains $x_1, \ldots, x_j$.

## Example with k = 8 predictors

The predictors under consideration are $x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8$....
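Returning to the body fat data, the Section 7.1 definition of an extra sum of squares can be illustrated with a short sketch. This is not taken from the original slides. It assumes the `body` data set created earlier in these notes, and it treats triceps as the reduced model and all three predictors as the full model, which is a choice made here for illustration only. The label `thigh_midarm` is a hypothetical name.

```sas
/* Sketch only: assumes the data set body created earlier in these notes. */
proc reg data=body;
  /* Reduced model: the Error sum of squares in this fit is SSE(triceps). */
  model bodyfat = triceps;
  /* Full model: the Error sum of squares is SSE(triceps, thigh, midarm). */
  /* The ss1 option prints sequential (Type I) sums of squares, that is   */
  /* SSR(triceps), SSR(thigh | triceps), SSR(midarm | triceps, thigh).    */
  model bodyfat = triceps thigh midarm / ss1;
  /* F-test of dropping thigh and midarm from the full model. It is built */
  /* from the extra sum of squares SSR(thigh, midarm | triceps).          */
  thigh_midarm: test thigh = 0, midarm = 0;
run;
quit;
```

Subtracting the full-model Error sum of squares from the reduced-model one gives SSR(thigh, midarm | triceps). This should agree with the sum of the last two sequential sums of squares printed by `ss1`.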