Stat 371 Assignment 3 Solution

1. The purpose of this question is to review some of the tools used to assess the fit of a model and to look for outliers in the data set. To do so, we use an artificial example with 50 observations on a response variate $y$ and two explanatory variates $x_1$ and $x_2$. The data are stored in the file ass3newq1.txt, available on the course web page. The goal of the investigation is to predict $y$ when $x_1 = 15$, $x_2 = 15$. Start by fitting the model $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + r$.

a) Use plots of the estimated residuals and qq plots of the standardized residuals to determine if other terms (e.g. squares and products) or a transformation are needed. To create a nice format for your plots, you might like to use the R code par(mfrow=c(n,m)), where you select integers n and m. This creates an n x m array for the next nm plots you create.

The four diagnostic plots from fitting the model $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + r$ are the estimated residuals versus the fitted values, the qq plot of the estimated residuals, the leverages and the studentized residuals. The qq plot suggests that there are too many large and small estimated residuals for the sample to be from a normal distribution. We also see some very large and small studentized residuals. We need a different model.
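A minimal R sketch of how these four plots might be produced. The data here are simulated stand-ins, not the contents of ass3newq1.txt, and the object names (dat, fit1, res, lev, stud) are my own. It also assumes that "standardized" corresponds to R's rstandard() and "studentized" to rstudent(); the course may define these differently.

```r
# Simulated stand-in data: NOT the course data in ass3newq1.txt.
set.seed(371)
n  <- 50
x1 <- runif(n, 5, 25)
x2 <- runif(n, 5, 25)
y  <- 20 + x1 - 2 * x2 + 0.25 * x1 * x2 + rnorm(n, sd = 1.5)
dat <- data.frame(y, x1, x2)

# First-order model: y = b0 + b1*x1 + b2*x2 + r
fit1 <- lm(y ~ x1 + x2, data = dat)

res  <- residuals(fit1)   # estimated residuals
lev  <- hatvalues(fit1)   # leverages h_ii (diagonal of the hat matrix)
stud <- rstudent(fit1)    # externally studentized residuals (assumed definition)

# 2 x 2 array for the next four plots
par(mfrow = c(2, 2))
plot(fitted(fit1), res, xlab = "Fitted values", ylab = "Estimated residuals")
abline(h = 0, lty = 2)
qqnorm(rstandard(fit1), main = "QQ plot of standardized residuals")
qqline(rstandard(fit1))
plot(lev, type = "h", xlab = "Observation", ylab = "Leverage")
plot(stud, type = "h", xlab = "Observation", ylab = "Studentized residual")
```

Because the simulated response includes an $x_1 x_2$ term that fit1 omits, the residual plots should show the kind of lack of fit discussed above, though the details will differ from the real data.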

After some experimentation, I fit the model with added terms for $x_1^2$, $x_2^2$ and $x_1 x_2$. With the exception of one point with large $h_{ii}$ and one (or two) large studentized residuals, the four plots described above suggest a reasonable fit. The summary for the model is

    Call:
    lm(formula = y ~ x1 + x2 + x11 + x22 + x12)

    Residuals:
        Min      1Q  Median      3Q     Max
    -3.5698 -0.9150 -0.2668  1.1260  3.4817

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) 19.23044    3.09586   6.212 1.65e-07 ***
    x1           0.86260    0.36600   2.357  0.02295 *
    x2          -1.97900    0.34593  -5.721 8.67e-07 ***
    x11          0.03154    0.01170   2.695  0.00992 **
    x22          0.01683    0.01182   1.424  0.16141
    x12          0.24473    0.01469  16.664  < 2e-16 ***
    ---
    Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

    Residual standard error: 1.58 on 44 degrees of freedom
    Multiple R-Squared: 0.9912,     Adjusted R-squared: 0.9902
    F-statistic: 990.3 on 5 and 44 DF,  p-value: < 2.2e-16

b) Decide on a final form for the model.

Based on the above summary and plots, I decided to drop $x_2^2$ from the model. The resulting summary and plots suggest that $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_1 x_2 + r$ is a reasonable model.
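The steps in a) and b) can be sketched in R as follows. This is an illustration on simulated stand-in data, not the course data. The column names x11, x22 and x12 follow the lm call in the summary above. The object names (full, reduced, new, pred) and the 95% prediction interval are assumptions of mine, since the solution does not say how the prediction at $x_1 = 15$, $x_2 = 15$ is reported.

```r
# Simulated stand-in data: NOT the course data in ass3newq1.txt.
set.seed(371)
n  <- 50
x1 <- runif(n, 5, 25)
x2 <- runif(n, 5, 25)
y  <- 20 + x1 - 2 * x2 + 0.03 * x1^2 + 0.25 * x1 * x2 + rnorm(n, sd = 1.5)
dat <- data.frame(y, x1, x2)

# Squares and product, named as in the summary output
dat$x11 <- dat$x1^2
dat$x22 <- dat$x2^2
dat$x12 <- dat$x1 * dat$x2

# Full second-order model, then the model with x22 dropped
full    <- lm(y ~ x1 + x2 + x11 + x22 + x12, data = dat)
reduced <- update(full, . ~ . - x22)

print(summary(reduced)$coefficients)
print(anova(reduced, full))   # F-test for the single dropped term x22

# Predict y at x1 = 15, x2 = 15; the derived columns must be supplied too
new  <- data.frame(x1 = 15, x2 = 15, x11 = 15^2, x12 = 15 * 15)
pred <- predict(reduced, newdata = new, interval = "prediction", level = 0.95)
print(pred)
```

With only one term dropped, the F-test from anova() is equivalent to the t-test for x22 in the full model's summary, which in the output above has a p-value of 0.16141.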

This note was uploaded on 05/12/2011 for the course STAT 371 taught by Professor Ahmed during the Fall '09 term at Waterloo.
