STAT 350 Lecture 28

Agenda
Today:
Topic 1: Multiple Linear Regression (11.4 and 11.5)
Topic 2: Tests Concerning Hypotheses About a Categorical Population (8.5)
Next Week: Three Review Lectures
Lab 7 due on Tue

Multiple Linear Regression Model
A general additive multiple regression model:
$Y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + e$
Y: dependent variable
$x_1, x_2, \ldots, x_k$: predictor variables
e: random deviation; Normal with mean 0 and standard deviation $\sigma$, independent

Example (#25)
A trucking company considered a multiple regression model for relating the dependent variable y = total daily travel time (hours) to the predictors $x_1$ = distance traveled (miles) and $x_2$ = number of deliveries made. Suppose that the model equation is
$Y = 0.800 + 0.060 x_1 + 0.900 x_2 + e$
Questions:
What is the mean value of travel time when 3 deliveries are made and the distance is 50 miles?
How to interpret $\beta_1 = 0.06$?
How to interpret $\beta_2 = 0.9$?

Example (#25)
Y = total daily travel time (hours)
$x_1$ = distance traveled (miles)
$x_2$ = number of deliveries
$Y = 0.800 + 0.060 x_1 + 0.900 x_2 + e$
Question: If $\sigma = 0.5$ hours, what is the probability that travel time will be at most 6 hours when 3 deliveries are made and the distance is 50 miles?

Example 11.12 on page 525
An experiment was carried out to assess the impact of the variables force, power, temperature, and time on the ball bond shear strength. The following data were collected (30 observations total; observations 1 to 5 are shown):

Observation  Force  Power  Temperature  Time  Strength
1            30     60     175          15    26.2
2            40     60     175          15    26.3
3            30     90     175          15    39.8
4            40     90     175          15    39.7
5            30     60     225          15    38.6

SAS code for multiple linear regression

SAS output for Example 11.12

Example: the estimated regression equation is
$\hat{y} = -37.48 + 0.2117 x_1 + 0.4983 x_2 + 0.1297 x_3 + 0.2583 x_4$
What are the values of $b_2$ and $s_{b_2}$?
(power term) $b_2 = 0.49833$, $s_{b_2} = 0.07019$

Interpretation of estimates
a is still the intercept.
$b_i$ is the estimated "slope" for $\beta_i$; it estimates how the mean of y changes when $x_i$ increases by 1 unit while the values of the other predictors remain unchanged.
$b_1 \approx 0.2$: if the force is increased by 1 (g), the ball bond shear strength will increase by about 0.2 (g) when power, temperature, and time remain the same.

11.5 Inferences about an individual $\beta_i$
Confidence intervals for the slopes
Of the form $b_i \pm (t\ \text{crit}) \, s_{b_i}$
df = n - k - 1, where k is the number of predictors in our model
For SLR, k = 1, so df = n - 2
For the GPA example, k = 3, so df = n - 3 - 1 = n - 4

Example on CI for $\beta_i$
CI for $\beta_i$: $b_i \pm (t\ \text{crit}) \, s_{b_i}$, df = n - k - 1
Example 11.14 on page 530
Based on the data from Example 11.12, the SAS output provided
$\hat{y} = -37.48 + 0.2117 x_1 + 0.4983 x_2 + 0.1297 x_3 + 0.2583 x_4$
with $b_2 = 0.49833$ and $s_{b_2} = 0.070191$.
Since n = 30 and k = 4, df = n - k - 1 = 25 and t* = 2.060 (for a 95% CI).
95% CI: $0.49833 \pm (2.060)(0.070191) \approx 0.498 \pm 0.145 = (0.353, 0.643)$

Hypothesis Testing for $\beta_i$
Hypothesis testing for $\beta_1$ (not the model utility test for MLR)
Procedure is similar to SLR:
Hypotheses: $H_0: \beta_1 = 0$, $H_a: \beta_1 \neq 0$
Test statistic: $t = b_1 / s_{b_1} = 0.21167 / 0.21057 = 1.01$
P-value = 0.3244 > 0.05, so do not reject $H_0$.
Conclusion is different from SLR: the force predictor does not appear to provide useful information beyond what is provided by the predictors power, temperature, and time.

Variable added last: t-tests
A t-test in MLR is similar to what it was in SLR.
Hypotheses: $H_0: \beta_1 = 0$ vs. $H_a: \beta_1 \neq 0$
The difference is that this tests the usefulness or significance of X1 AFTER X2 and X3 are already in the model.
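The arithmetic in the regression examples above can be checked with a few lines of Python. This is only a sketch using the standard library: the function names are made up for illustration, the coefficients and standard errors are the ones quoted on the slides, and the critical value t* = 2.060 is taken from the slide rather than computed.

```python
from statistics import NormalDist

# Example #25 (trucking): coefficients as given in the model equation.
ALPHA, BETA1, BETA2 = 0.800, 0.060, 0.900

def mean_travel_time(miles, deliveries):
    """Mean of Y at the given predictor values (the deviation e has mean 0)."""
    return ALPHA + BETA1 * miles + BETA2 * deliveries

def prob_at_most(hours, miles, deliveries, sigma):
    """P(Y <= hours), with Y Normal around the mean and standard deviation sigma."""
    return NormalDist(mean_travel_time(miles, deliveries), sigma).cdf(hours)

def slope_ci(b, s_b, t_crit):
    """Confidence interval b +/- t_crit * s_b for one regression slope."""
    margin = t_crit * s_b
    return b - margin, b + margin

def t_statistic(b, s_b):
    """Test statistic for H0: beta_i = 0."""
    return b / s_b

# Trucking example: 50 miles, 3 deliveries, sigma = 0.5 hours.
mean_y = mean_travel_time(50, 3)
p_at_most_6 = prob_at_most(6, 50, 3, sigma=0.5)
print(f"mean travel time = {mean_y:.2f} hours")   # 0.8 + 3.0 + 2.7 = 6.50
print(f"P(Y <= 6) = {p_at_most_6:.4f}")           # z = -1, about 0.1587

# Example 11.14: n = 30 observations, k = 4 predictors.
n, k = 30, 4
df = n - k - 1                                    # 25
lo, hi = slope_ci(0.49833, 0.070191, t_crit=2.060)  # t* read from a t table
print(f"df = {df}, 95% CI for beta_2 = ({lo:.3f}, {hi:.3f})")

# t-test for the force coefficient beta_1.
t1 = t_statistic(0.21167, 0.21057)
print(f"t = {t1:.2f}")                            # about 1.01
```

Without intermediate rounding the interval comes out as about (0.354, 0.643); the slide's lower limit of 0.353 results from rounding to 0.498 and 0.145 before subtracting.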
Coefficient of Determination
Coefficient of multiple determination: $R^2 = 1 - \frac{SSResid}{SSTo}$
Adjusted $R^2 = 1 - \frac{n - 1}{n - (k + 1)} \cdot \frac{SSResid}{SSTo}$
High values of $R^2$ and adjusted $R^2$ suggest that the fitted model is a useful one.

8.3 Tests Concerning Hypotheses About a Categorical Population
Examples of Categorical Data
Gender
Color or type of transmission of automobiles
Political party registration

Chi-square Test
Invented in 1900 by Karl Pearson

Example 1
A gambler is accused of using a loaded die, but he pleads innocent. A record has been kept of the last 60 throws. There is a disagreement about how to interpret the data, and a statistician is called in.

Value  Observed  Expected
1      4         10
2      6         10
3      17        10
4      16        10
5      8         10
6      9         10
Sum    60        60

Chi-square test idea: combine all these differences into one overall measure of the distance between the observed and expected values.
$\chi^2 = \sum (\text{observed} - \text{expected})^2 / \text{expected}$
$\chi^2$ has a chi-square distribution with degrees of freedom = #categories - 1.

Example 1
$\chi^2 = 6^2/10 + 4^2/10 + 7^2/10 + 6^2/10 + 2^2/10 + 1^2/10 = 14.2$
df = 5

Example 1
Hypotheses: $H_0: \pi_1 = \pi_2 = \pi_3 = \pi_4 = \pi_5 = \pi_6 = 1/6$; $H_a$: the die is not fair ($H_0$ is incorrect)
Test statistic: $\chi^2 = 14.2$, with df = 5
P-value < 0.015, so reject $H_0$ at $\alpha = 0.05$
Table VII (page 571): right-tail area = 0.015 when $\chi^2 = 14.09$

Chi-Squared Distributions
A family of distributions; each member is identified by its degrees of freedom.
No density to the left of zero
Positively skewed
More symmetric as df increases
All our chi-squared tests are upper-tailed

Ex 43 on page 381
Criminologists have long debated whether there is a relationship between weather and violent crime.
Researchers classified 1361 homicides according to season, resulting in the accompanying data. Do these data suggest that the homicide rate somehow depends on the season?

Season  Frequency
Winter  328
Spring  334
Summer  372
Fall    327

Solution to ex 43 on page 381
Hypotheses: $H_0: \pi_1 = \pi_2 = \pi_3 = \pi_4 = 0.25$; $H_a$: the homicide rate depends on the season

Season  Obs freq  Exp freq
Winter  328       340.25
Spring  334       340.25
Summer  372       340.25
Fall    327       340.25
Total   1361      1361

$\chi^2 = 12.25^2/340.25 + 6.25^2/340.25 + 31.75^2/340.25 + 13.25^2/340.25 = 4.035$
df = 3
Table VII: P-value > 0.1, so do not reject $H_0$; these data do not support the belief that there are different homicide rates in the different seasons.
...
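Both chi-square examples above (the die and the homicide counts) follow the same recipe, which can be sketched in Python as follows. The function names are hypothetical, and the upper-tail function is a standard-library stand-in for the Table VII lookup: it uses the finite closed-form series for integer degrees of freedom, which is an assumption of this sketch and not something shown in the lecture.

```python
import math

def chi_square_stat(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_upper_tail(x, df):
    """Right-tail area P(chi-square with df degrees of freedom > x), integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 1:
        # Odd df: erfc(sqrt(x/2)) + 2*Z(chi) * [chi + chi^3/3 + chi^5/(3*5) + ...]
        chi = math.sqrt(x)
        total, term = 0.0, chi
        for r in range(1, (df - 1) // 2 + 1):
            total += term
            term *= x / (2 * r + 1)
        z = math.exp(-x / 2) / math.sqrt(2 * math.pi)  # standard normal density at chi
        return math.erfc(chi / math.sqrt(2)) + 2 * z * total
    # Even df: exp(-x/2) * [1 + (x/2) + (x/2)^2/2! + ...], df/2 terms
    total, term = 0.0, 1.0
    for r in range(df // 2):
        total += term
        term *= (x / 2) / (r + 1)
    return math.exp(-x / 2) * total

# Example 1: 60 throws of a die, H0 says each face has probability 1/6.
die_obs = [4, 6, 17, 16, 8, 9]
die_exp = [sum(die_obs) / 6] * 6                  # 10 per face
die_stat = chi_square_stat(die_obs, die_exp)
die_p = chi_square_upper_tail(die_stat, df=len(die_obs) - 1)
print(f"die: chi-square = {die_stat:.1f}, p-value = {die_p:.4f}")

# Ex 43: 1361 homicides by season, H0 says each season has probability 0.25.
season_obs = [328, 334, 372, 327]
season_exp = [sum(season_obs) / 4] * 4            # 340.25 per season
season_stat = chi_square_stat(season_obs, season_exp)
season_p = chi_square_upper_tail(season_stat, df=len(season_obs) - 1)
print(f"seasons: chi-square = {season_stat:.3f}, p-value = {season_p:.3f}")
```

The die example gives a p-value of roughly 0.014 (below 0.015, so reject at 0.05), and the season example gives roughly 0.26 (above 0.1, so do not reject), in line with the table-based conclusions above.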
This note was uploaded on 02/06/2012 for the course STAT 350 taught by Professor Staff during the Spring '08 term at Purdue.
