STAT 326 LAB 6


Group Number:
NAMES:
NAMES:

In this lab, you will use the JMP utilities Fit Y by X and Fit Model.

1. LEGO. In this example we will consider models aimed at predicting the original price for different LEGO sets. The variables in this example include the following.

The Response variable
- Numerical: $y$ = Original Price (in dollars) for a LEGO set.

Explanatory Variables
- Numerical: $x_1$ = Pieces: Number of LEGO pieces in the set.
- Categorical: Type. This is the type of LEGO set. Levels for Type include: Duplo (D), Traditional (T).

In this example we will use LEGO set type Duplo as the baselevel.

(a) Create side by side boxplots of the Original Price of LEGO sets by the Type of LEGO set. Comment on the similarities and/or differences in the original price by LEGO set type.

Go to Fit Y by X. Put Type into the X, Factor box and put Original Price into the Y, Response box.

There is a lot of overlap between the prices of the two types of LEGO sets. The spread of prices for the Traditional sets is larger than the spread of prices for the Duplo sets.

(b) Create a scatterplot of the Number of Pieces versus the Original Price of different LEGO sets. The symbols on the graph distinguish between the different types of LEGO sets: D represents Duplo sets and T represents Traditional sets.

i. Based on this scatterplot, comment on the relationship between number of pieces and the original price of LEGO sets for each of the 2 types of LEGO sets.

For each level of LEGO Type there is a strong positive linear relationship between number of pieces and the price of LEGO sets.

ii. Clearly explain why the scatterplot suggests including an interaction between number of pieces and type of LEGO set.

The apparent slope of the relationship between number of pieces and original price is steeper for the Duplo sets than for the Traditional sets. Because the two trends are not parallel, the effect of number of pieces on price appears to depend on the type of set.

(c) Assume Duplo is the baselevel for LEGO set Type.
Define the indicator variable needed in the space below. Create an appropriate column in JMP for this indicator variable.

Select the column Type. Go to Cols. Select Utilities. Select Make Indicator Columns. Change the setting so the new column is numerical: click on the red chart next to the word Traditional on the left side of the data table and make sure the Continuous option is selected.

$x_T = 1$ if the set is of type Traditional and $0$ if Duplo.

(d) Write out the population model for the original price of LEGO sets based on a linear relationship between price and number of pieces and different intercepts for each LEGO set type. Additionally we will assume the relationship between number of pieces and price depends on the Type of LEGO set. (This model should include an intercept and three additional beta coefficients, p = 3. If you have questions please discuss with groups around you and then ask your TA.)

$\mu_y = \beta_0 + \beta_1 x_1 + \beta_2 x_T + \beta_3 x_1 x_T$

(e) Use the JMP utility Fit Model to find the estimate of the model from part 1(d). To include any cross product terms in your estimated model, highlight both variables required for the cross product and click “Cross” under Construct Model Effects. If you hold down the Ctrl key you will be able to select two different variables simultaneously. Be sure to include all first order terms in the estimated model as well.

Reminder: Do not put the categorical column from JMP labeled “Type” into your Fit Model. By default this will create variables with a 1, 0, -1 coding. Instead include the indicator variable column you have created in your model (along with any cross product terms).

For this example: Go to Analyze, Fit Model and put Original Price in the Y box. Select “Number of Pieces” and click Add. Select “Traditional” and click Add. Hold down the Ctrl key and select both “Number of Pieces” and “Traditional”. While both “Traditional” and “Number of Pieces” are selected, click “Cross” (right below Add).
Before clicking Run, go to Model Specification and uncheck the option Center Polynomials.

i. Write out the full prediction equation.

$\hat{y} = 9.986 + 0.379 x_1 - 11.739 x_T - 0.268 x_1 x_T$

ii. Use this output to write the simplified prediction equation for the linear relationship between the Price and Number of Pieces for Traditional LEGO sets. Show all your work.

$\hat{y} = 9.986 + 0.379 x_1 - 11.739 \cdot 1 - 0.268 x_1 \cdot 1$
$\hat{y} = (9.986 - 11.739) + (0.379 - 0.268) x_1$
$\hat{y} = -1.753 + 0.111 x_1$

iii. Use this output to write the simplified prediction equation for the linear relationship between the Price and Number of Pieces for Duplo LEGO sets. Show all your work.

$\hat{y} = 9.986 + 0.379 x_1 - 11.739 \cdot 0 - 0.268 x_1 \cdot 0$
$\hat{y} = 9.986 + 0.379 x_1$

(f) Use your answers to parts 1(e)ii and 1(e)iii to explain what it means for the relationship between number of pieces and price to depend on the LEGO set type.

The linear relationships between number of pieces and price have different estimated slopes (0.379 for Duplo, 0.111 for Traditional). The slope of the relationship depends on the LEGO set type.

(g) Check the assumption of the form of the model. To answer this question we create a scatterplot with predicted price on the x axis and residuals on the y axis. In the space below comment on the validity of this assumption.

We see positive and negative residuals across the range of predicted values. This suggests we have no violation of the assumption related to the form of the model.

(h) Check the assumption of constant variance. To answer this question we create a scatterplot with predicted price on the x axis and residuals on the y axis. In the space below comment on the validity of this assumption.

We see a similar spread of the residuals across the range of predicted values. This suggests we have no concern with the constant variance assumption.

(i) Check the assumption of normally distributed errors. To answer this question we create a normal quantile plot of the residuals. In the space below comment on the validity of this assumption.
The residuals appear to follow the straight line in the normal quantile plot. Furthermore, all the observations are between the upper and lower bands. We do not have any concern with the normality assumption.

(j) Is the multiple regression model proposed in part 1(d) useful in modeling the original price of LEGO sets? Show a full hypothesis test to justify your answer.

$H_0: \beta_1 = \beta_2 = \beta_3 = 0$ vs $H_a: \beta_i \neq 0$ for at least one $i$ in $\{1, 2, 3\}$
F Ratio: 170.71
p-value: < 0.0001
Reject the null hypothesis. There is statistically significant evidence to suggest this model is helpful in describing the original price of LEGO sets.

(k) Complete the R2 interpretation:

91.1 percent of the variability in the price of LEGO sets is explained by the model with explanatory variables of number of LEGO pieces and Type of LEGO set.

(l) Based on this example, complete the interpretation of the RMSE:

Approximately 95 percent of the actual original prices of these LEGO sets will be within 14.86 dollars (2 x RMSE = 2 x 7.431) of the corresponding predicted LEGO set price.

(m) Conduct a hypothesis test to determine if the relationship between Number of Pieces and the price of LEGO sets depends on the LEGO set Type.

$H_0: \beta_3 = 0$ vs $H_a: \beta_3 \neq 0$
t Ratio: -7.06
p-value: < 0.0001
Reject the null hypothesis. There is statistically significant evidence to suggest the relationship between Number of Pieces and the price of LEGO sets depends on the LEGO set Type.

2. A conceptual example: Assume you are a manager for a large cereal manufacturer. This manufacturer has recently developed a new brand of cereal called “Morning Chipper”. Your marketing team has proposed 4 different visuals for the outside of the box (Design A, Design B, Design C, Design D). Your goal is to determine which box design would be best for sales even after accounting for the price of the cereal. You propose to model the sales (response variable, in number of boxes sold) based linearly on the price (explanatory variable, in dollars) and the box design type.
Additionally you want this model to account for the possibility that the relationship between price and sales depends on the box design. Assume design version D as the baselevel.

(a) Define all variables needed to write out the population model. This requires defining all numerical variables and all indicator variables.

$x_1$ is the price of the box of cereal.
$x_A$ is 1 if Design A and 0 if not Design A.
$x_B$ is 1 if Design B and 0 if not Design B.
$x_C$ is 1 if Design C and 0 if not Design C.

(b) Clearly write out the full population model.

$\mu_y = \beta_0 + \beta_1 x_1 + \beta_2 x_A + \beta_3 x_B + \beta_4 x_C + \beta_5 x_1 x_A + \beta_6 x_1 x_B + \beta_7 x_1 x_C$

(c) Write out the simplified version of the population model for Design version A.

$\mu_y = \beta_0 + \beta_1 x_1 + \beta_2 \cdot 1 + \beta_3 \cdot 0 + \beta_4 \cdot 0 + \beta_5 x_1 \cdot 1 + \beta_6 x_1 \cdot 0 + \beta_7 x_1 \cdot 0$
$\mu_y = \beta_0 + \beta_1 x_1 + \beta_2 + \beta_5 x_1$
$\mu_y = (\beta_0 + \beta_2) + (\beta_1 + \beta_5) x_1$

(d) Write out the simplified version of the population model for Design version D.

$\mu_y = \beta_0 + \beta_1 x_1 + \beta_2 \cdot 0 + \beta_3 \cdot 0 + \beta_4 \cdot 0 + \beta_5 x_1 \cdot 0 + \beta_6 x_1 \cdot 0 + \beta_7 x_1 \cdot 0$
$\mu_y = \beta_0 + \beta_1 x_1$

(e) Assume you collect a random sample of 250 observations. Based on the model described above, report the error degrees of freedom for this example.

Error degrees of freedom: n - p - 1 = 250 - 7 - 1 = 242

Response Original Price

Summary of Fit
  RSquare                       0.911051
  RSquare Adj                   0.905714
  Root Mean Square Error        7.431279
  Mean of Response              36.89741
  Observations (or Sum Wgts)    54

Analysis of Variance
  Source     DF   Sum of Squares   Mean Square   F Ratio    Prob > F
  Model       3        28281.342       9427.11   170.7071   <.0001*
  Error      50         2761.196         55.22
  C. Total   53        31042.537

Parameter Estimates
  Term                           Estimate     Std Error   t Ratio   Prob>|t|
  Intercept                      9.9860168    2.571364       3.88   0.0003*
  Number of Pieces               0.3791072    0.037551      10.10   <.0001*
  Traditional                    -11.73928    3.622042      -3.24   0.0021*
  Number of Pieces*Traditional   -0.26825     0.037977      -7.06   <.0001*

[Figure: Distributions, Normal Quantile Plot of the residuals]

[Figure: Bivariate Fit of Residual Original Price By Pred Formula Original Price. Points are marked D (Duplo) and T (Traditional). x axis: Pred Formula Original Price; y axis: Residual Original Price.]
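The arithmetic behind several answers in this lab can be re-derived from the JMP output above. The following is a minimal Python sketch, not part of the original lab (which uses JMP only). The numbers are copied from the Parameter Estimates and Analysis of Variance tables; the function and variable names (line_for_type, error_df, and so on) are illustrative assumptions.

```python
import math

# Parameter estimates copied from the JMP output (Problem 1).
B0 = 9.9860168    # Intercept
B1 = 0.3791072    # Number of Pieces
B2 = -11.73928    # Traditional (indicator x_T)
B3 = -0.26825     # Number of Pieces * Traditional


def line_for_type(x_t):
    """Return (intercept, slope) of the fitted price-vs-pieces line.

    x_t is the indicator: 0 for Duplo (baselevel), 1 for Traditional.
    """
    return B0 + B2 * x_t, B1 + B3 * x_t


def error_df(n, p):
    """Error degrees of freedom for a model with p slope terms."""
    return n - p - 1


duplo_intercept, duplo_slope = line_for_type(0)
trad_intercept, trad_slope = line_for_type(1)

# Sums of squares and degrees of freedom copied from the ANOVA table.
ss_model, ss_error, ss_total = 28281.342, 2761.196, 31042.537
df_model, df_error = 3, 50

ms_model = ss_model / df_model      # Mean Square for Model
ms_error = ss_error / df_error      # Mean Square for Error
f_ratio = ms_model / ms_error       # overall F test, part 1(j)
r_square = ss_model / ss_total      # part 1(k)
rmse = math.sqrt(ms_error)          # Root Mean Square Error
two_rmse = 2 * rmse                 # half-width used in part 1(l)

print("Duplo line:       intercept %.3f, slope %.3f" % (duplo_intercept, duplo_slope))
print("Traditional line: intercept %.3f, slope %.3f" % (trad_intercept, trad_slope))
print("F ratio %.4f, R-square %.4f, 2*RMSE %.2f" % (f_ratio, r_square, two_rmse))
print("Error df, Problem 1:", error_df(54, 3))
print("Error df, Problem 2:", error_df(250, 7))
```

Setting the indicator to 0 or 1 inside line_for_type mirrors the substitution done by hand in parts 1(e)ii and 1(e)iii, and error_df applies the same n - p - 1 rule to both problems.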