101B_hw_4_answers_W12steven

Statistics 101B Homework four Answers Winter 2012

Question one. Problem one from chapter three 3.4 Exercises

Answer to question one.

a) The analyst makes the mistake of believing that a high r-square indicates a good fit of the data to the model. In fact, a good fit of the data to the model is required *before* we analyze r-square.

b) The plot of standardized residuals against distance shows a marked departure from linearity. So, no, the straight-line regression model does not fit well. The problem also asks you to describe carefully how the model can be improved; you won't be graded on your answers to that part. In fact, the model can be improved by adding a new term of distance^2. But we haven't talked about that yet.

Question two. Problem three from chapter three 3.4 Exercises

a) predicted log(AdRevenue) = 4.67 + 0.53*log(Circulation)

Some transformation is clearly needed: the untransformed variables do not follow a linear trend, and the errors do not appear normally distributed. The scatterplots also show that transforming only one of the variables (either log(x) or log(y)) does not work. However, transforming both variables seems to provide a decent model:

• the trend appears to be linear, as evidenced by a residual plot that shows no trend.
• the qq-plot suggests that the residuals are (for the most part) Normally distributed, with one or two exceptional points in the left tail [observations 60, 64].
• there is perhaps a tendency towards increasing variance for larger predicted values, but this trend is slight or, at least, slighter than under any other set of transformations.

b) The results of the predict command are returned in log units:

    > predict(LogLog.model, newdata=data.frame(Circulation=c(0.5, 20)), interval="prediction")
           fit      lwr      upr
    1 4.308227 3.947855 4.668600
    2 6.258752 5.885815 6.631689

So for a circulation of 0.5 million we predict exp(4.308227) = 74.3, lwr = 51.8, upr = 106.5.
For a circulation of 20 million, fit = 522.6, lwr = 359.9, upr = 758.8.
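The back-transformation in part b) can be checked with a few lines of R. This is only a sketch: the numbers are copied by hand from the predict output quoted in the answer, and the names log_pred, orig_pred and cols are made up for illustration. Because exp() is increasing, exponentiating the endpoints of the log-scale prediction interval gives the endpoints on the original scale.

```r
# Log-scale predictions copied from the predict() output in part b).
# The object names here are illustrative, not from the original answer.
log_pred <- data.frame(
  Circulation = c(0.5, 20),
  fit = c(4.308227, 6.258752),
  lwr = c(3.947855, 5.885815),
  upr = c(4.668600, 6.631689)
)

# Back-transform fit and both interval endpoints with exp(),
# then round to one decimal place as in the written answer.
cols <- c("fit", "lwr", "upr")
orig_pred <- log_pred
orig_pred[, cols] <- round(exp(log_pred[, cols]), 1)

print(orig_pred)
```

With the fitted model available, the same result would presumably come from wrapping the predict call in exp() directly.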
c) Weaknesses in model: The condition of constant variance is possibly not satisfied. There are some potentially influential/high leverage points: 4, 8, 49 stand out.

    > quality=influence.measures(LogLog.model)
    > summary(quality)

If you didn't run this pair of commands, but relied only on the fourth diagnostic plot provided by plot(LogLog.model), you'd have seen observations 4 and 49 flagged for potential investigation.
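For anyone who wants to see how these diagnostics behave without the homework data, here is a minimal sketch on simulated data. Everything about the data is an assumption: the data frame sim, the sample size, the noise level, and the model formula log(AdRevenue) ~ log(Circulation) (inferred from the fitted equation in part a) are stand-ins, so the observations flagged here have nothing to do with observations 4, 8 and 49 in the real data set. The 2p/n leverage cutoff is a common rule of thumb, not something stated in the answer.

```r
# Simulated stand-in data: NOT the magazine data set from the exercise.
set.seed(1)
n <- 70                                   # arbitrary sample size
sim <- data.frame(Circulation = exp(rnorm(n, mean = 0, sd = 1)))
sim$AdRevenue <- exp(4.67 + 0.53 * log(sim$Circulation) + rnorm(n, sd = 0.18))

# Assumed model form: log response regressed on log predictor.
sim.model <- lm(log(AdRevenue) ~ log(Circulation), data = sim)

# Tabulate the standard influence measures and print the flagged cases.
quality <- influence.measures(sim.model)
summary(quality)

# Leverage (hat) values; rule of thumb flags h > 2 * p / n, with p = 2 here.
h <- hatvalues(sim.model)
high_leverage <- which(h > 2 * length(coef(sim.model)) / n)
print(high_leverage)

# The fourth default diagnostic plot is residuals vs. leverage (which = 5).
plot(sim.model, which = 5)
```

The summary of influence.measures reports cases flagged by any of several measures, which is one reason it can list more observations than the residuals-versus-leverage plot labels.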