STAT 252 Fall 2006

Solutions to the practice test are in the separate file Solutions to Practice Exam 3. Try working on all of the problems before you look at the solutions.

1) True/False. Answer true or false to the following statements.

a) We cannot include qualitative and quantitative predictors and a quadratic term for the quantitative predictor because it makes the model too complicated. FALSE

b) Suppose we have the hypothesized regression model relating average income to gender: E(y) = β₀ + β₁x₁, where y = income and x₁ = 1 if the person is a male and x₁ = 0 if the person is a female. Then the null hypothesis β₁ = 0 is equivalent to the null hypothesis that the average incomes of males and females are equal. TRUE

c) A 100(1 − α)% confidence interval for the mean value of y for a given value of x is always wider than a 100(1 − α)% prediction interval for a new value of y for a given value of x. FALSE

d) The interaction between the two independent variables x₁ and x₂ in the hypothesized model E(y) = β₀ + β₁x₁ + β₂x₂ + β₃x₁x₂ implies that the linear relationship between y and x₂ depends on x₁. TRUE

2) A realtor wanted to investigate the relationship between the sales price of a home and some
characteristics of the home. She first explored the relationship between price of the home (in $1000's) and size of the home (in 1000's of square feet). The MINITAB output based on a sample of 150 homes is given below:

The regression equation is
Home Price = 63.7 + 49.4 Size

Predictor    Coef  SE Coef      T      P
Constant   63.745    8.000   7.97  0.000
Size       49.375    4.177  11.82  0.000

s = 29.9456   R-Sq = 48.6%   R-Sq(adj) = 48.2%

Analysis of Variance

Source           DF      SS      MS       F      P
Regression        1  125271  125271  139.70  0.000
Residual Error  148  132717     897
Total           149  257989

a) Let y = home price ($) and x = Size. A 95% prediction interval for a new value of y for x = .85 is given by (45.80, 165.63). Interpret the prediction interval in the context of the problem.

With 95% confidence, we predict that the price of a home that is 850 square feet is between $45,800 and $165,630.

Now the realtor decides to regress the sales price of the home (in $1000's) on size of the home (in 1000's of square feet), the size of the lot that the home resides on (in acres), and the number of baths in the
home. Some MINITAB output is given below:

Predictor    Coef  SE Coef     T      P
Constant   57.980    6.403  9.05  0.000
Size       32.563    5.191  6.27  0.000
Acres      7.5126   0.8787  8.55  0.000
Baths      14.806    4.315  3.43  0.001

s = 23.6989   R-Sq = 68.2%   R-Sq(adj) = 67.6%

Analysis of Variance

Source           DF      SS     MS       F      P
Regression        3  175989  58663  104.45  0.000
Residual Error  146   81999    562
Total           149  257989

b) Letting y = sales price ($), x₁ = Size, x₂ = Acres, and x₃ = Baths, write the hypothesized model that relates average sales price to Size, Acres, and Baths.

E(y) = β₀ + β₁x₁ + β₂x₂ + β₃x₃

c) Based on the MINITAB output, write out the least squares prediction equation.

ŷ = 57.98 + 32.563x₁ + 7.5126x₂ + 14.806x₃

d) Interpret the estimated coefficient for the independent variable Acres.

The average price of a home is estimated to increase by $7,512.60 for each additional acre of
property, when fixing the square footage of the home and the number of baths in the home.

e) Conduct the appropriate test to determine if the number of baths in the home is positively related to the sales price when Acres and Area are held fixed. Use α = .01. What is the p-value for this test?

We are testing:
H₀: β₃ = 0 versus Hₐ: β₃ > 0

MINITAB reports the two-sided p-value, and the t statistic (3.43) is positive, so the p-value for this one-sided test is .001/2 = .0005. Since this is smaller than .01, we would reject the null hypothesis. Conclude at the .01 level that the number of baths in the home is positively related to the sales price (fixing Acres and Area).

f) Construct a 90% confidence interval for β₂ and interpret it.

β̂₂ ± t.05 · s(β̂₂) = 7.5126 ± 1.658(.8787) = (6.05572, 8.96948)

We are 90% confident that the true increase in average home price for each additional acre in the size of the lot is between $6,055.72 and $8,969.48, when fixing square footage and number of
bathrooms.

g) A 95% confidence interval for E(y) when x₁ = 2, x₂ = 1, and x₃ = 3 is given by (166.19, 183.88). Interpret this interval in the context of the problem.

We are 95% confident that the true average price of all homes that are 2000 square feet, sit on 1 acre lots, and have 3 bathrooms is between $166,190 and $183,880.

h) Interpret the R-Sq(adj) value.

67.6% of the variation in home price can be explained by area of the home, acres of property, and number of baths in the home when adjusting for sample size and number of independent
variables in the model.

i) Conduct the appropriate test to determine if including the additional predictors Acres and Baths in the model with only Size as a predictor significantly contributes to the prediction of sales price. Use α = .01.

We are testing:
H₀: β₂ = β₃ = 0 versus Hₐ: At least one βᵢ ≠ 0, i = 2, 3

Test statistic: F = [(132,717 − 81,999)/2] / [81,999/(150 − 4)] = 45.15, where 132,717 is the residual sum of squares for the Size-only model, 81,999 is the residual sum of squares for the model with all three predictors, and 2 is the number of coefficients being tested.

Critical value: F.01 = 4.79

Since 45.15 > 4.79, reject the null hypothesis.
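The arithmetic behind this nested-model test statistic can be checked with a short Python sketch. The function name `partial_f` and its argument names are illustrative and do not come from the exam; the inputs are the residual sums of squares read from the two MINITAB outputs, and the critical value 4.79 is copied from the solution rather than computed.

```python
def partial_f(sse_reduced, sse_complete, n, k_complete, k_reduced):
    """Nested-model F statistic comparing a reduced model to a complete model.

    k_complete and k_reduced count predictors, excluding the intercept.
    """
    num_df = k_complete - k_reduced   # number of coefficients being tested
    den_df = n - (k_complete + 1)     # error df of the complete model
    numerator = (sse_reduced - sse_complete) / num_df
    denominator = sse_complete / den_df
    return numerator / denominator, num_df, den_df


# Residual sums of squares from the two ANOVA tables:
# Size only (reduced) versus Size, Acres, Baths (complete).
f_stat, num_df, den_df = partial_f(
    132717, 81999, n=150, k_complete=3, k_reduced=1
)

F_CRITICAL = 4.79  # tabled value quoted in the solution, not computed here
reject_h0 = f_stat > F_CRITICAL
print(f"F = {f_stat:.2f} on ({num_df}, {den_df}) df; reject H0: {reject_h0}")
```

The same helper applies to any pair of nested models, provided the reduced model's predictors are a subset of the complete model's.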
Conclude at the .01 level that the additional predictors Acres and Baths significantly contribute to the prediction of sales price of the home.

3) An article used multiple regression to predict annual rainfall levels in California. Data on the annual precipitation y (Precip, measured in inches), altitude x₁ (Altitude, measured in feet above sea level), and x₂ (Near Coast), a dummy variable that indicates whether the station is close to the Pacific coast (1 if the station is within 10 miles, 0 otherwise), were collected from 30 meteorological stations throughout California. Some regression output from MINITAB is given below:

Regression Analysis: Precip versus Altitude, Near Coast, Alt_Coast

Predictor        Coef   SE Coef      T      P
Constant        9.104     3.644   2.50  0.019
Altitude     0.004011  0.001288   3.11  0.004
Near Coast      30.95     11.58   2.67  0.013
Alt_Coast     -0.1525    0.1700  -0.90  0.378

S = 13.6048   R-Sq = 39.9%   R-Sq(adj) = 33.0%

Analysis of Variance

Source          DF      SS     MS  F  P
Regression       3  3199.3      ?  ?  ?
Residual Error   ?  4812.4  185.1
Total           29  8011.7

a) Write the hypothesized model that relates mean annual precipitation to altitude, closeness
to the coast, and an interaction between altitude and closeness to the coast.

E(y) = β₀ + β₁x₁ + β₂x₂ + β₃x₁x₂

b) Write out the least squares prediction equation for the hypothesized model in part (a) using the MINITAB output.

ŷ = 9.1 + .004x₁ + 30.95x₂ − .153x₁x₂

c) What is the estimated slope of the line for the regression of rainfall amount on altitude for locations near the coast? Interpret it in the context of the problem.

Near the coast means x₂ = 1, so ŷ = 9.1 + .004x₁ + 30.95(1) − .153x₁(1) = 40.05 − .149x₁.

So the estimated slope is −.149. For each additional foot in altitude of the station, average measured rainfall is estimated to decrease by .149 of an inch when the location is near the coast.

d) Construct a 95% confidence interval for the beta coefficient corresponding to the
variable Altitude, and interpret the interval.

df = n − (k + 1) = 30 − (3 + 1) = 26, so t.025 = 2.056

β̂₁ ± t.025 · s(β̂₁) = .004 ± 2.056(.0013) = (.0013, .0067)

Interpretation: Fixing the location of the station (near or far from the coast), we are 95% confident that the increase in average precipitation for each additional foot in altitude is between .0013 and .0067 of an inch.

e) Predict the amount of rainfall for a station that is 150 feet above sea level and that is not near the Pacific Ocean.

ŷ = 9.1 + .004(150) + 30.95(0) − .153(150)(0) = 9.7 inches

f) Conduct the appropriate test to determine if the linear relationship between amount of
precipitation and altitude differs depending on whether or not a station is close to the coast. Use α = .01.

H₀: β₃ = 0
Hₐ: β₃ ≠ 0

p-value = .378 > .01, so do not reject the null hypothesis. We cannot conclude at the .01 level that the linear relationship between amount of precipitation and altitude differs depending on whether or not a station is close to the coast.

g) Conduct the appropriate test to determine if the model is useful for predicting the amount of precipitation. Use α = .01.

H₀: β₁ = β₂ = β₃ = 0
Hₐ: At least one βᵢ ≠ 0, i = 1, 2, 3

F = MSR/MSE = MSR/185.1, where MSR = 3199.3/3 = 1066.433, so F = 1066.433/185.1 = 5.76
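
The entries marked ? in the ANOVA table follow from the sums of squares and the degrees of freedom. A minimal Python sketch of that bookkeeping is given here; the helper name `complete_anova` and the dictionary keys are hypothetical, chosen only for illustration, and the p-value is left out because it needs an F distribution table or a statistics library.

```python
def complete_anova(ss_regression, ss_error, n, k):
    """Fill in df, mean squares, and the overall F for a model with k predictors."""
    df_regression = k
    df_error = n - (k + 1)
    ms_regression = ss_regression / df_regression
    ms_error = ss_error / df_error
    return {
        "df_regression": df_regression,
        "df_error": df_error,
        "MSR": ms_regression,
        "MSE": ms_error,
        "F": ms_regression / ms_error,
        # R-Sq is the share of total variation explained by the regression.
        "R-Sq": ss_regression / (ss_regression + ss_error),
    }


# Sums of squares from the rainfall output: 30 stations, 3 predictors.
table = complete_anova(3199.3, 4812.4, n=30, k=3)
for name, value in table.items():
    print(f"{name}: {value:.4g}")
```

Working from the unrounded mean square error gives the same F to two decimals as using the rounded 185.1 from the output.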

The critical value is F.01 = 4.64. Since 5.76 > 4.64, reject the null hypothesis. Conclude at the .01 level that altitude, closeness to the coast, and the interaction between the two variables together significantly contribute to the prediction of rainfall amount.

4) Draw a scatterplot describing the bivariate relationship between x and y that fits the given description. Note: Use at least 10 points when constructing your scatterplot.

a) A relationship with a very influential data point (make sure to label the influential point
clearly).

[Scatterplot]

b) A relationship with a data point that has high leverage, but is not very influential (make sure to label the high leverage point clearly).

[Scatterplot]
...