Chapter 6 Multiple Regression
Timothy Hanson
Department of Statistics, University of South Carolina
Stat 704: Data Analysis I

6.1 Multiple regression models

We now add more predictors, linearly, to the model. For example, let's add one more to the simple linear regression model:

$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \epsilon_i,$$

with the usual $E(\epsilon_i) = 0$. For any $Y$ in this population with predictors $(x_1, x_2)$ we have

$$\mu(x_1, x_2) = E(Y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2.$$

The triple $(x_1, x_2, \mu(x_1, x_2)) = (x_1, x_2, \beta_0 + \beta_1 x_1 + \beta_2 x_2)$ describes a plane in $\mathbb{R}^3$ (p. 215).

Multiple regression models

Generally, for $k = p - 1$ predictors $x_1, \ldots, x_k$ our model is

$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + \epsilon_i, \tag{6.7}$$

with mean

$$E(Y_i) = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik}. \tag{6.8}$$

$\beta_0$ is mean response when all predictors equal zero (if this makes sense).
$\beta_j$ is the change in mean response when $x_j$ is increased by one unit but the remaining predictors are held constant.

We will assume normal errors:

$$\epsilon_1, \ldots, \epsilon_n \overset{iid}{\sim} N(0, \sigma^2).$$

Dwayne Portrait Studio data (Section 6.9)

Dwayne Portrait Studio is doing a sales analysis based on data from 21 cities.

$Y$ = sales (thousands of dollars) for a city
$x_1$ = number of people 16 years or younger (thousands)
$x_2$ = per capita disposable income (thousands of dollars)

Assume the linear model is appropriate. One way to check marginal relationships is through a scatterplot matrix. However, these are not infallible.

For these data, is $\beta_0$ interpretable?

$\beta_2$ is the change in the mean response for a thousand-dollar increase in disposable income, holding "number of people under 16 years old" constant.
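The "held constant" interpretation of a slope can be read directly off the mean function. As a short worked step for the two-predictor mean $\mu(x_1, x_2) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$, increasing $x_2$ by one unit while leaving $x_1$ fixed gives:

```latex
\begin{align*}
\mu(x_1, x_2 + 1) - \mu(x_1, x_2)
  &= \bigl[\beta_0 + \beta_1 x_1 + \beta_2 (x_2 + 1)\bigr]
   - \bigl[\beta_0 + \beta_1 x_1 + \beta_2 x_2\bigr] \\
  &= \beta_2 .
\end{align*}
```

The difference does not depend on the value at which $x_1$ is held, which is what makes $\beta_2$ a single number rather than a function of $x_1$. The same cancellation applies to any $\beta_j$ in the general $k$-predictor model.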
SAS code

    data studio;
    input people16 income sales @@;
    label people16='Number 16 and under (thousands)'
          income  ='Per capita disposable income ($1000)'
          sales   ='Sales ($1000)';
    datalines;
    68.5 16.7 174.4   45.2 16.8 164.4   91.3 18.2 244.2   47.8 16.3 154.6
    46.9 17.3 181.6   66.1 18.2 207.5   49.5 15.9 152.8   52.0 17.2 163.2
    48.9 16.6 145.4   38.4 16.0 137.2   87.9 18.3 241.9   72.8 17.1 191.1
    88.4 17.4 232.0   42.9 15.8 145.3   52.5 17.8 161.1   85.7 18.4 209.7
    41.3 16.5 146.4   51.7 16.3 144.0   89.6 18.1 232.6   82.7 19.1 224.1
    52.3 16.0 166.5
    ;
    proc sgscatter;
      matrix people16 income sales / diagonal=(histogram kernel);
    run;
    options nocenter;
    proc reg data=studio;
      model sales=people16 income / clb;  * clb gives 95% CI for betas;
    run;                                   * alpha=0.1 for 90% CI, etc.;

SAS output

    The REG Procedure

    Analysis of Variance

    Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model              2            24015         12008     99.10   <.0001
    Error             18       2180.92741     121.16263
    Corrected Total   20            26196

    Root MSE          11.00739   R-Square   0.9167
    Dependent Mean   181.90476   Adj R-Sq   0.9075
    Coeff Var          6.05118

    Parameter Estimates

    Variable    Label           DF    Estimate   Standard Error   t Value   Pr > |t|   95% Confidence Limits
    Intercept   Intercept        1   -68.85707         60.01695     -1.15     0.2663   -194.94801   57.23387
    people16    Number 16 and    1     1.45456          0.21178      6.87   ...
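Several entries in the output can be checked by hand from the others. The sketch below reads the intercept estimate as $-68.85707$ with standard error $60.01695$, and writes SSR, SSTO, MSR, MSE for the model and total sums of squares and the model and error mean squares (labels that do not appear in the output itself). The quantile $t(0.975; 18) \approx 2.1009$ is taken from a $t$ table, not from the output.

```latex
\begin{align*}
F &= \frac{MSR}{MSE} = \frac{12008}{121.16263} \approx 99.1, \\[4pt]
R^2 &= \frac{SSR}{SSTO} = \frac{24015}{26196} \approx 0.9167, \\[4pt]
\sqrt{MSE} &= \sqrt{121.16263} \approx 11.007, \\[4pt]
b_0 \pm t(0.975; 18)\, se(b_0)
  &= -68.857 \pm 2.1009\,(60.017) \approx (-194.95,\ 57.23).
\end{align*}
```

The interval for the intercept contains zero, which agrees with its reported p-value of 0.2663.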
This note was uploaded on 12/14/2011 for the course STAT 704 taught by Professor Staff during the Fall '11 term at South Carolina.