# notes10 - Chapter 6: Multiple Regression (Timothy Hanson)


Chapter 6: Multiple Regression. Timothy Hanson, Department of Statistics, University of South Carolina. Stat 704: Data Analysis I.

## 6.7 CI for mean response and PI for new response

Let's construct a CI for the mean response corresponding to a set of values $\mathbf{x}_h = (1, x_{h1}, x_{h2}, \ldots, x_{hk})'$. We want to make inferences about

$$E(Y_h) = \mathbf{x}_h'\boldsymbol{\beta} = \beta_0 + \beta_1 x_{h1} + \cdots + \beta_k x_{hk}.$$
Some math... A point estimate is $\hat{Y}_h = \widehat{E(Y_h)} = \mathbf{x}_h'\mathbf{b}$. Then

$$E(\hat{Y}_h) = E(\mathbf{x}_h'\mathbf{b}) = \mathbf{x}_h' E(\mathbf{b}) = \mathbf{x}_h'\boldsymbol{\beta}.$$

Also

$$\operatorname{var}(\hat{Y}_h) = \operatorname{var}(\mathbf{x}_h'\mathbf{b}) = \mathbf{x}_h' \operatorname{cov}(\mathbf{b})\, \mathbf{x}_h = \sigma^2\, \mathbf{x}_h'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{x}_h.$$

So... a $100(1-\alpha)\%$ CI for $E(Y_h)$ is

$$\hat{Y}_h \pm t_{n-p}(1-\alpha/2)\sqrt{\mathrm{MSE}\; \mathbf{x}_h'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{x}_h},$$

and a $100(1-\alpha)\%$ prediction interval for a new response $Y_h = \mathbf{x}_h'\boldsymbol{\beta} + \epsilon_h$ is

$$\hat{Y}_h \pm t_{n-p}(1-\alpha/2)\sqrt{\mathrm{MSE}\,\left[1 + \mathbf{x}_h'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{x}_h\right]}.$$
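The two interval formulas above can be computed directly with matrix algebra. The following Python/NumPy sketch is illustrative and not part of the original SAS-based notes; the function name and toy usage are my own.

```python
import numpy as np
from scipy import stats

def mean_ci_and_pi(X, y, x_h, alpha=0.05):
    """CI for E(Y_h) and PI for a new response Y_h at covariate vector x_h.

    X   : (n, p) design matrix, first column all ones (intercept).
    y   : (n,) response vector.
    x_h : (p,) covariate vector for the new point, first entry 1.
    """
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y                    # least-squares estimate b
    resid = y - X @ b
    mse = resid @ resid / (n - p)            # MSE estimates sigma^2
    y_hat = x_h @ b                          # point estimate of E(Y_h)
    t = stats.t.ppf(1 - alpha / 2, n - p)    # t_{n-p}(1 - alpha/2)
    h = x_h @ XtX_inv @ x_h                  # x_h'(X'X)^{-1} x_h
    ci = (y_hat - t * np.sqrt(mse * h), y_hat + t * np.sqrt(mse * h))
    pi = (y_hat - t * np.sqrt(mse * (1 + h)), y_hat + t * np.sqrt(mse * (1 + h)))
    return y_hat, ci, pi
```

Note that the PI is always wider than the CI at the same `x_h`, since it carries the extra `1 +` term for the variability of the new error $\epsilon_h$.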

### Dwayne Studios

Say we want to estimate mean sales in cities with $x_1 = 65.4$ thousand people 16 or younger and per capita disposable income of $x_2 = 17.6$ thousand dollars. Now say we want a prediction interval for a new city with these covariates. We can add these covariates to the data step, with a missing value "." for sales, and ask SAS for the CI and PI.

```sas
data studio;
  input people16 income sales @@;
  label people16='16 & under (1000s)'
        income  ='Per cap. disp. income ($1000)'
        sales   ='Sales ($1000)';
  datalines;
68.5 16.7 174.4  45.2 16.8 164.4  91.3 18.2 244.2
47.8 16.3 154.6  46.9 17.3 181.6  66.1 18.2 207.5
49.5 15.9 152.8  52.0 17.2 163.2  48.9 16.6 145.4
38.4 16.0 137.2  87.9 18.3 241.9  72.8 17.1 191.1
88.4 17.4 232.0  42.9 15.8 145.3  52.5 17.8 161.1
85.7 18.4 209.7  41.3 16.5 146.4  51.7 16.3 144.0
89.6 18.1 232.6  82.7 19.1 224.1  52.3 16.0 166.5
65.4 17.6 .
;
proc reg data=studio;
  model sales=people16 income / clm cli alpha=0.05;
run;
```

Output (abridged):

```
                          Output Statistics

     Dependent Predicted   Std Error
Obs   Variable     Value Mean Predict     95% CL Mean       95% CL Predict    Residual

  1   174.4000  187.1841      3.8409  179.1146 195.2536  162.6910 211.6772    -12.7841
          ...et cetera...
 21   166.5000  157.0644      4.0792  148.4944 165.6344  132.4018 181.7270      9.4356
 22          .  191.1039      2.7668  185.2911 196.9168  167.2589 214.9490           .
```
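The SAS numbers for observation 22 can be reproduced by hand from the interval formulas of Section 6.7. This Python/NumPy sketch (my illustration, not part of the notes) refits the model to the 21 cities and computes the CI and PI at $(x_1, x_2) = (65.4, 17.6)$:

```python
import numpy as np
from scipy import stats

# Dwayne Studios data, transcribed from the SAS datalines above
# (people16, income, sales).
data = np.array([
    [68.5, 16.7, 174.4], [45.2, 16.8, 164.4], [91.3, 18.2, 244.2],
    [47.8, 16.3, 154.6], [46.9, 17.3, 181.6], [66.1, 18.2, 207.5],
    [49.5, 15.9, 152.8], [52.0, 17.2, 163.2], [48.9, 16.6, 145.4],
    [38.4, 16.0, 137.2], [87.9, 18.3, 241.9], [72.8, 17.1, 191.1],
    [88.4, 17.4, 232.0], [42.9, 15.8, 145.3], [52.5, 17.8, 161.1],
    [85.7, 18.4, 209.7], [41.3, 16.5, 146.4], [51.7, 16.3, 144.0],
    [89.6, 18.1, 232.6], [82.7, 19.1, 224.1], [52.3, 16.0, 166.5],
])
X = np.column_stack([np.ones(len(data)), data[:, 0], data[:, 1]])
y = data[:, 2]

n, p = X.shape                               # n = 21, p = 3, so df = 18
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
mse = np.sum((y - X @ b) ** 2) / (n - p)
t = stats.t.ppf(0.975, n - p)

x_h = np.array([1.0, 65.4, 17.6])            # the new city
y_hat = x_h @ b
h = x_h @ XtX_inv @ x_h
ci = (y_hat - t * np.sqrt(mse * h), y_hat + t * np.sqrt(mse * h))
pi = (y_hat - t * np.sqrt(mse * (1 + h)), y_hat + t * np.sqrt(mse * (1 + h)))
print(round(y_hat, 4), [round(v, 4) for v in ci], [round(v, 4) for v in pi])
```

The printed values should agree with the SAS output for observation 22 (predicted 191.1039, CL Mean 185.2911 to 196.9168, CL Predict 167.2589 to 214.9490) up to rounding.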
## 6.8 Checking model assumptions

The general linear model assumes the following:

1. A linear relationship between $E(Y)$ and the associated predictors $x_1, \ldots, x_k$.
2. The errors have constant variance.
3. The errors are normally distributed.
4. The errors are independent.

We estimate the unknown errors $\epsilon_1, \ldots, \epsilon_n$ with the residuals $e_1, \ldots, e_n$. Assumptions can be checked informally using plots and formally using tests.

Note: We can't check $E(\epsilon_i) = 0$ because $e_1 + \cdots + e_n = 0$, i.e. $\bar{e} = 0$, by construction (whenever the model includes an intercept).
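The note above, that residuals sum to zero by construction when the model has an intercept, is easy to verify numerically. A minimal Python sketch with simulated data (illustrative, not from the notes):

```python
import numpy as np

# Simulate a small regression with an intercept column in the design.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(30), rng.normal(size=30)])
y = 1.0 + 2.0 * X[:, 1] + rng.normal(size=30)

# Least-squares fit and residuals e_i = y_i - x_i'b.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ b

# With an intercept, the normal equations force sum(e_i) = 0, so the
# residual mean carries no information about E(eps_i).
print(abs(e.sum()))   # numerically ~0
```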

### Assumption 1: Linear mean

Scatterplots of $\{(x_{ij}, Y_i)\}_{i=1}^n$ for each predictor $j = 1, \ldots, k$.