STAT 326 LAB 4 DUE IN LAB
1. NAME: ________ Group #: ________
2. NAME: ________ Group #: ________
3. NAME: ________ Group #: ________
4. NAME: ________ Group #: ________
LEGOs. In this example we consider a linear model aimed at predicting the original listing price of different LEGO sets based on the number of pieces in the LEGO set and the number of figures (not including animals) in the LEGO set. This dataset consists of information taken from 37 randomly selected LEGO sets.

JMP column label reference:
- Number: This is the LEGO box reference number.
- Original Price (\(y\)): This is the original listing price of the LEGO sets. Notice the prices are in multiples of 5 dollars minus 1 cent. Example: 9.99, 14.99, 19.99, etc.
- Number of Pieces (\(x_1\)): This is the number of LEGO pieces in the set as recorded on the LEGO box.
- Figures (\(x_2\)): This is the number of human LEGO figures (not animals) in a set as seen on the LEGO box.

For all your final calculations, please round to the nearest hundredth (cents).

1. (Model 1) First fit a simple linear regression line predicting the original price based on the number of pieces in the LEGO set. Record the estimated equation in the space below. Also record the \(R^2\) value and interpret this value in the context of this example.

Estimated equation: \(\hat{y} = 1.227 + 0.1033x\)
- \(x\) is the number of pieces
- \(\hat{y}\) is the predicted price given the number of pieces.
Interpretation of the \(R^2\): 85.16 percent of the variability in the price of LEGO sets is explained by the linear model with the number of pieces as the explanatory variable.

2. (Model 2) Next fit a simple linear regression line predicting the original price based on the number of figures in the LEGO set. Record the estimated equation in the space below. Also record the \(R^2\) value and interpret this value in the context of this example.

Estimated equation: \(\hat{y} = 10.905 + 9.2973x\)
- \(x\) is the number of figures
- \(\hat{y}\) is the predicted price given the number of figures.
Interpretation of the \(R^2\): 36.89 percent of the variability in the price of LEGO sets is explained by the linear model with the number of figures as the explanatory variable.

3. Create a scatterplot of the original price (y axis) by the number of pieces. We have already created symbols for you: an A is displayed for sets with 1 figure, a B for sets with 2 figures, a C for sets with 3 figures, and a D for sets with 4 or more figures. Save this plot for your own records. In the space provided, discuss any characteristics you observe. You should discuss the 4 features of the scatterplot (between original price and number of pieces). Further, you should discuss what you observe in the original price point for sets with different numbers of figures (specifically, your discussion should relate to differences in prices even for LEGO sets with a similar number of pieces).
There is a strong positive linear relationship between the number of pieces and the price of LEGO sets. There are no obvious outliers in this example. After accounting for the number of pieces, prices also appear to be higher in sets with more figures. For example, the sets with 4 or more figures (coded with a D in the plot) tend to have higher prices compared to sets with 1 or 2 figures (coded with an A or B in the plot), even for sets with a similar number of pieces.

4. For the remainder of the lab we will use a multiple regression model to describe the original price of LEGO sets. Let \(y\) be the original price of LEGO sets, \(x_1\) the number of pieces in the LEGO set and \(x_2\) the number of figures in the LEGO set. Here we will assume the population model:

\(\mu_y = \beta_0 + \beta_1 x_1 + \beta_2 x_2\)

In the space below provide an interpretation of the population intercept in the context of this problem. Include the interpretation even if the interpretation doesn’t make much sense from a practical perspective.

\(\beta_0\) is the mean price of LEGO sets with no pieces and no figures. You should not have any numbers related to the estimated model here with your solution. Remember this is a population parameter interpretation.

5. Explain why the interpretation of the population intercept is not very practical for this example.
Hopefully it is clear that a LEGO set would not exist if there were no figures and no LEGO pieces. Again, you should not have any output numbers referenced in your answer here. This discussion is related to the population intercept.

6. (Model 3) Use the Fit Model utility in JMP to fit the estimated multiple regression model. Save the column of residuals and the column of predicted values. Record the estimated model in the space below.

\(\hat{y} = -5.5267 + 0.0907x_1 + 4.127x_2\)

7. (Model 3) Interpret the estimate for the slope associated with Number of Pieces.

For every additional piece we predict the price of a LEGO set to increase by 9.07 (or just 9) cents, assuming the number of LEGO figures does not change.

8. (Model 3) Interpret the estimate for the slope associated with Figures.

For each additional figure we predict the price of a LEGO set to increase by $4.13, assuming the number of LEGO pieces remains the same. NOTE: In practice a regular LEGO piece would technically have to be replaced with a figure piece for the number of pieces in the set to actually remain constant.

9. (Model 3) Assume that LEGO is considering offering a new LEGO set with 5 figures and 400 pieces. Use your estimated model to help propose a good price for the LEGO set.
\(\hat{y} = -5.5267 + 0.0907 \times 400 + 4.127 \times 5 = 51.39\)

This prediction suggests a proposed price of $51.39. As a side note: the original prices of the sets appear to be grouped by $5 minus a penny. If we took this into account, it is likely a marketer would suggest a price of either $49.99 or $54.99.

10. (Model 3) Assume two LEGO sets have the same number of figures. How much more would you expect to pay for the set with 200 additional pieces?

\(200 \times 0.0907 = 18.14\)
We would predict to pay $18.14 more for a set with 200 additional pieces.

11. (Model 3) As a consumer, would you prefer to pay for a LEGO set with a positive residual or a negative residual? Explain your answer.

I would prefer to pay for a LEGO set with a negative residual. This would mean the actual price I paid would be lower than the predicted price based on the number of pieces and the number of figures. This would mean I would get more pieces for my money. NOTE: One may argue they would hope to get better pieces by paying more per piece (with a positive residual). As long as the argument supports the direction you choose, this is reasonable.
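
The Model 3 arithmetic in the answers above (the prediction for a 400-piece, 5-figure set, the 200-piece difference, and the sign of a residual) can be checked with a short Python sketch. It uses the rounded coefficients recorded in the worksheet. The function names `predict_price` and `bracket_price_points`, and the $49.99 listing price used to illustrate a residual, are assumptions made for this example and are not part of the lab or the JMP output.

```python
# Coefficients of the estimated Model 3 as recorded in the worksheet
# (rounded from the JMP parameter estimates).
B0, B_PIECES, B_FIGURES = -5.5267, 0.0907, 4.127


def predict_price(pieces, figures):
    """Predicted original price (dollars) from the estimated Model 3."""
    return B0 + B_PIECES * pieces + B_FIGURES * figures


def bracket_price_points(price):
    """Return the '$5 minus a penny' prices at or below and just above `price`.

    Works in integer cents to avoid floating-point surprises.
    """
    cents = round(price * 100)
    lower = (cents + 1) // 500 * 500 - 1
    return lower / 100, (lower + 500) / 100


predicted = predict_price(pieces=400, figures=5)
print(round(predicted, 2))              # 51.39
print(bracket_price_points(predicted))  # (49.99, 54.99)

# Same number of figures, 200 more pieces: only the pieces slope matters.
extra = predict_price(600, 5) - predict_price(400, 5)
print(round(extra, 2))                  # 18.14

# Hypothetical listing price, used only to illustrate the sign of a residual.
hypothetical_actual = 49.99
residual = hypothetical_actual - predicted  # residual = actual - predicted
print(residual < 0)                     # True: priced below the prediction
```

With the unrounded JMP estimates the same prediction comes out at roughly $51.40, so the rounding of the coefficients moves the answer by about a cent.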
[Figure: JMP Fit Group with two scatterplot panels, each with a fitted line. Left panel: Bivariate Fit of Original Price By Number of Pieces (x axis: Number of Pieces, 0 to 800). Right panel: Bivariate Fit of Original Price By Figures (x axis: Figures, 0 to 9). y axis in both panels: Original Price, 0 to 100.]

Bivariate Fit of Original Price By Number of Pieces (Model 1)

Linear Fit
Original Price = 1.2276593 + 0.1033135*Number of Pieces

Summary of Fit
RSquare                      0.851632
RSquare Adj                  0.847393
Root Mean Square Error       8.598023
Mean of Response             35.53054
Observations (or Sum Wgts)   37

Analysis of Variance
Source     DF   Sum of Squares   Mean Square   F Ratio    Prob > F
Model       1        14851.779       14851.8   200.9006   <.0001*
Error      35         2587.410          73.9
C. Total   36        17439.189

Parameter Estimates
Term               Estimate    Std Error   t Ratio   Prob>|t|
Intercept          1.2276593   2.802688     0.44     0.6641
Number of Pieces   0.1033135   0.007289    14.17     <.0001*

Bivariate Fit of Original Price By Figures (Model 2)

Linear Fit
Original Price = 10.905033 + 9.2973856*Figures

Summary of Fit
RSquare                      0.368942
RSquare Adj                  0.350911
Root Mean Square Error       17.73225
Mean of Response             35.53054
Observations (or Sum Wgts)   37

Analysis of Variance
Source     DF   Sum of Squares   Mean Square   F Ratio    Prob > F
Model       1         6434.042       6434.04   20.4624    <.0001*
Error      35        11005.147        314.43
C. Total   36        17439.189

Parameter Estimates
Term        Estimate    Std Error   t Ratio   Prob>|t|
Intercept   10.905033   6.175259    1.77      0.0861
Figures     9.2973856   2.055336    4.52      <.0001*

Response Original Price
Summary of Fit
RSquare                      0.911706
RSquare Adj                  0.906512
Root Mean Square Error       6.729599
Mean of Response             35.53054
Observations (or Sum Wgts)   37

Analysis of Variance
Source     DF   Sum of Squares   Mean Square   F Ratio    Prob > F
Model       2        15899.414       7949.71   175.5386   <.0001*
Error      34         1539.775         45.29
C. Total   36        17439.189

Parameter Estimates
Term               Estimate    Std Error   t Ratio   Prob>|t|
Intercept          -5.526762   2.604655    -2.12     0.0412*
Number of Pieces   0.0907332   0.006276    14.46     <.0001*
Figures            4.127177    0.858099     4.81     <.0001*

...
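
As a check on how the Summary of Fit values relate to the ANOVA tables, the quantities can be recomputed from the sums of squares and degrees of freedom reported in the JMP output above. This is a minimal sketch. The dictionary labels and the function name `summary_of_fit` are illustrative assumptions; only the numbers come from the output.

```python
from math import sqrt

# (SS_model, SS_error, df_model, df_error) copied from the ANOVA tables above.
anova = {
    "Model 1 (pieces)": (14851.779, 2587.410, 1, 35),
    "Model 2 (figures)": (6434.042, 11005.147, 1, 35),
    "Model 3 (pieces + figures)": (15899.414, 1539.775, 2, 34),
}


def summary_of_fit(ss_model, ss_error, df_model, df_error):
    """Recompute R^2, adjusted R^2, RMSE and the F ratio from sums of squares."""
    ss_total = ss_model + ss_error
    df_total = df_model + df_error
    mse = ss_error / df_error  # mean square error
    return {
        "r2": ss_model / ss_total,
        "adj_r2": 1 - mse / (ss_total / df_total),
        "rmse": sqrt(mse),
        "f_ratio": (ss_model / df_model) / mse,
    }


results = {name: summary_of_fit(*row) for name, row in anova.items()}
for name, fit in results.items():
    print(name, {key: round(value, 4) for key, value in fit.items()})
```

The recomputed values agree with the reported RSquare, RSquare Adj, Root Mean Square Error and F Ratio to the precision shown. Adding Figures to the pieces-only model raises RSquare from about 0.85 to about 0.91.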


- Fall '08
- GENSCHEL
- Statistics