(a) Graph airplane count ($Y$) versus helicopter count ($X$), and draw in the estimated
regression line.
(b) Carry out two $t$-tests regarding the slope $\beta_1$: the usual test of $H_0: \beta_1 = 0$ and also a
test of $H_0: \beta_1 = 1$. Which test seems more relevant to this study?
(c) Run the following code in R:
lm1=lm(air~heli)
new=data.frame(heli=seq(28, 88, 0.5))
pred.PI = predict(lm1, new, level=0.95, interval="prediction")
pred.CI = predict(lm1, new, level=0.95, interval="confidence")
n=10; x.mean=mean(heli); sxx=sum((heli-x.mean)^2);
half.width=summary(lm1)$sigma*sqrt(2*qf(0.95, 2, n-2))*
sqrt(1/n+(new$heli-x.mean)^2/sxx)
band.CI=cbind(pred.CI[,1]-half.width, pred.CI[,1]+half.width)
plot(c(28, 88), range(air, pred.PI, pred.CI, band.CI), type="n",
     xlab="Manatee Counts from Helicopter", ylab="From Airplane")
points(heli, air)
abline(lm1)
lines(new$heli, pred.PI[,2], lty=1, col="red")
lines(new$heli, pred.PI[,3], lty=1, col="red")
lines(new$heli, pred.CI[,2], lty=2, col="blue")
lines(new$heli, pred.CI[,3], lty=2, col="blue")
lines(new$heli, band.CI[,1], lty=3, col="green")
lines(new$heli, band.CI[,2], lty=3, col="green")
Explain what is plotted in the figure (you don’t need to include this figure in
your homework). Explain why the blue intervals are shorter than the red intervals.
Explain why the blue intervals are shorter than the green intervals.
(d) If the helicopter count really were accurate, and airplane observers counted no
imaginary manatees (although they might miss some real ones), the relation between
these two counts should be a regression through the origin (because when $X = 0$,
we should have $Y = 0$ too). Conduct a regression of airplane count on helicopter
count by excluding the intercept, and graph the result. Is the slope in this graph
significantly different from 1?
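
A minimal R sketch of the tests in parts (b) and (d), under stated assumptions: summary() reports only the test of $H_0: \beta_1 = 0$, so a test against 1 has to be computed from the estimate and its standard error. The helper name slope_test is hypothetical, and the heli and air values below are simulated placeholders so that the code runs on its own; they are not the manatee counts from the study.

```r
# Hypothetical helper (not part of the assignment): two-sided t-test of
# H0: coefficient == beta0 for one term of a fitted lm object.
slope_test <- function(fit, term, beta0 = 0) {
  tab  <- coef(summary(fit))              # coefficient table of the fit
  est  <- tab[term, "Estimate"]
  se   <- tab[term, "Std. Error"]
  tval <- (est - beta0) / se
  df   <- df.residual(fit)                # n - 2 with intercept, n - 1 without
  pval <- 2 * pt(abs(tval), df, lower.tail = FALSE)
  c(estimate = est, std.error = se, t = tval, df = df, p.value = pval)
}

# ASSUMPTION: simulated placeholder data, used only so the sketch runs.
# Replace these two vectors with the real helicopter and airplane counts.
set.seed(1)
heli <- round(runif(10, 28, 88))
air  <- round(0.8 * heli + rnorm(10, sd = 4))

lm1 <- lm(air ~ heli)                     # model with intercept, as in part (c)
slope_test(lm1, "heli", beta0 = 0)        # the usual test, same t as summary(lm1)
slope_test(lm1, "heli", beta0 = 1)        # test of H0: beta_1 = 1

lm0 <- lm(air ~ 0 + heli)                 # regression through the origin, part (d)
slope_test(lm0, "heli", beta0 = 1)        # is the slope different from 1?
```

To graph the through-the-origin fit, abline(a = 0, b = coef(lm0)) draws the fitted line on an existing scatter plot of air against heli.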
14. Old Faithful Geyser Data. The data set faithful gives information about eruptions of
the Old Faithful geyser in Yellowstone National Park, Wyoming, USA. Variables are
eruptions: the eruption time in minutes;
waiting: the waiting time to the next eruption in minutes.
Fit a linear regression model to predict waiting from eruptions.
(a) What are the estimated slope and intercept?
(b) Give an interpretation of the estimated slope. Construct a 99% CI for the slope.
(c) Report the residual standard error and the $R^2$.
(d) Display the fitted regression line on the scatter plot of the data points. Remember
to add proper labels for the $x$ and $y$ axes.
(e) Construct a 95% interval estimate for the average waiting time to the next eruption
for individuals who arrive at the end of an eruption which lasts 250 seconds.
(f) Mike has just arrived at the end of an eruption which lasted 250 seconds. Give a
95% interval estimate for the time Mike will have to wait for the next eruption.
(Hint: you first need to explain why the estimated interval in this case is a PI, instead
of a CI.)
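
Parts (e) and (f) differ only in the interval type passed to predict(). The following is a minimal sketch, assuming the built-in faithful data frame and converting 250 seconds to 250/60 minutes because eruptions is recorded in minutes; the object names fit, new, ci_mean, and pi_new are illustrative choices, not part of the assignment.

```r
# Simple linear regression of waiting time on eruption duration
fit <- lm(waiting ~ eruptions, data = faithful)

coef(fit)                                 # estimated intercept and slope
confint(fit, "eruptions", level = 0.99)   # 99% CI for the slope
summary(fit)$sigma                        # residual standard error
summary(fit)$r.squared                    # R^2

# ASSUMPTION: 250 seconds is converted to minutes to match the units of eruptions
new <- data.frame(eruptions = 250 / 60)

# Interval for the mean waiting time after eruptions of this length (part e)
ci_mean <- predict(fit, new, interval = "confidence", level = 0.95)

# Interval for one new waiting time after such an eruption (part f)
pi_new <- predict(fit, new, interval = "prediction", level = 0.95)

ci_mean
pi_new
```

Both intervals are centered at the same fitted value; the prediction interval is wider because it also accounts for the variability of a single new observation around the regression line.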
15. Continue with the Old Faithful Geyser Data. From the scatter plot, we can see that the
data form two clusters. We divide the data into two groups based on eruptions < 3 or
not. Fit a linear regression model for each group.
(a) Report the estimated slope and intercept for each group.
(b) Display the fitted regression lines on the scatter plot of the data points. You should
use different line types for these two lines, and also add a legend to explain what
the two line types mean.
(c) Report the corresponding $R^2$ for each group. Compare them with the $R^2$ from the
regression model using all the data. Any explanation for such a big discrepancy?
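
One way to set up the two-group analysis is sketched below, assuming the built-in faithful data frame. The names short, fit_short, fit_long, fit_all, and r2, as well as the axis and legend text, are illustrative choices rather than anything prescribed by the problem.

```r
# Split the observations at eruptions < 3 (minutes)
short <- faithful$eruptions < 3

fit_all   <- lm(waiting ~ eruptions, data = faithful)
fit_short <- lm(waiting ~ eruptions, data = faithful[short, ])
fit_long  <- lm(waiting ~ eruptions, data = faithful[!short, ])

# Estimated intercept and slope for each group, one column per group
cbind(short = coef(fit_short), long = coef(fit_long))

# R^2 of the pooled fit next to the two within-group fits
r2 <- c(all   = summary(fit_all)$r.squared,
        short = summary(fit_short)$r.squared,
        long  = summary(fit_long)$r.squared)
r2

# Scatter plot with one line type per group and a legend
plot(waiting ~ eruptions, data = faithful,
     xlab = "Eruption time (minutes)",
     ylab = "Waiting time to next eruption (minutes)")
abline(fit_short, lty = 1)
abline(fit_long,  lty = 2)
legend("topleft", legend = c("eruptions < 3", "eruptions >= 3"), lty = c(1, 2))
```

Note that abline() extends each fitted line across the whole plotting region; to draw each line only over its own group, lines() with the fitted values of each model could be used instead.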
Hint