# This last lab will discuss the tests that arise in the context of
# ANOVA and regression, namely the F-test, the t-test for the
# regression coefficients, and the Confidence Interval (CI) and the
# Prediction Interval (PI) applied to the predicted value.
# 1) Let's see how we do the F-test introduced for doing 1-way
# (or 1-factor) ANOVA in Ch 9. Recall that the main question is whether
# k means are all equal. I.e.,
# H0: mu1 = mu2 = ... = muk,
# H1: At least two of the mu's are different.
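# Before using aov() below, it may help to see the F statistic built by
# hand. The following is a minimal sketch on simulated data (hypothetical
# numbers, not the 9_1_dat.txt data used in this lab): it computes
# F = MSTr/MSE directly and checks it against R's ANOVA table.

```r
# Sketch: the one-way ANOVA F statistic computed by hand (simulated data).
set.seed(1)
x <- factor(rep(1:3, each = 10))                  # k = 3 groups, 10 obs each
y <- rnorm(30, mean = c(5, 5, 7)[as.numeric(x)])  # group 3 has a shifted mean
grand <- mean(y)
ni <- tapply(y, x, length)                        # group sizes
mi <- tapply(y, x, mean)                          # group means
SSTr <- sum(ni * (mi - grand)^2)                  # between-group (treatment) SS
SSE  <- sum((y - mi[as.numeric(x)])^2)            # within-group (error) SS
k <- nlevels(x); n <- length(y)
F.byhand <- (SSTr / (k - 1)) / (SSE / (n - k))    # MSTr / MSE
F.aov <- anova(lm(y ~ x))["x", "F value"]         # same statistic from R
```

# The two numbers agree, which is the point: aov()/anova() is just
# organizing these sums of squares into a table.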
# Here we will reproduce Table 9.1, on page 411.
# Note that the data in 9_1_dat.txt are entered in a form that is consistent
# with what I was saying about ANOVA and regression being similar, i.e.,
# 1st column is x and 2nd column is y.
dat <-
read.table("http://www.stat.washington.edu/marzban/390/9_1_dat.txt", header=TRUE)
aov.1 <- aov(Vibration ~ as.factor(Brand), data=dat)
summary(aov.1)
# Make sure you compare the output here
# with what we got on page 13 of lecture 28.
# You can skip this commented block, but note that similar results can be
# obtained from general linear models (glm), which constitute a generalization
# of linear regression.
#
glm.1 <- glm(Vibration ~ as.factor(Brand), data=dat)
#
aov.2 <- anova(glm.1)
#
aov.2
# Given the really small p-value (0.00018), we reject the null in favor of the
# alternative. I.e., at least 2 of the means are statistically different
# from the rest. Which two? Section 9.3 shows how to identify the ones
# that are statistically equivalent, but we skipped it. Visually,
# you can look at the following boxplots. This plot is a better
# version of what the book calls the "effects plot" on page 416.
# It's better because it shows not just the mean, but the 5number
# summary at each level of x.
boxplot(Vibration ~ Brand, data=dat)
# This allows for a visual comparison of the distribution of the
# 5 populations. The p-value told us that at least 2 of the means are
# different. It's evident, for example, that the population means of
# brand 2 and 5 are probably different.
# The following performs Tukey's method (section 9.3) for identifying the
# different means. Although we skipped it, the results are easy to interpret.
# It gives CIs and p-values for pairwise tests of population means.
# Recall, if the CI does NOT include zero, then we "conclude" that the
# two means being tested are different.
library(stats)
tuk.1 <- TukeyHSD(aov.1, conf.level=0.99)
tuk.1
# Study this output! The lower bound (lwr) and upper bound (upr) are
# given for the difference in the mean of two pops. These values are
# affected by conf.level in TukeyHSD(). Then, look at the p-values in
# the last column; they test the H1 that the two means are different.
# At alpha=0.01, it's evident that the means of brand 1 and 2 are different.
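# Rather than eyeballing the printed table, you can pull the significant
# pairs out of the TukeyHSD object programmatically. A sketch on simulated
# data follows (in the lab you would apply the same idea to aov.1; the
# list element is named after the grouping factor):

```r
# Sketch: extracting the pairs that TukeyHSD flags as different.
set.seed(3)
g <- factor(rep(1:3, each = 12))                  # 3 groups, 12 obs each
y <- rnorm(36, mean = c(0, 0, 2)[as.numeric(g)])  # only group 3 is shifted
tuk <- TukeyHSD(aov(y ~ g), conf.level = 0.99)
tab <- tuk$g                                      # matrix: diff, lwr, upr, p adj
sig <- rownames(tab)[tab[, "p adj"] < 0.01]       # pairs declared different
```

# Here sig picks out the comparisons involving group 3, matching what the
# CIs show: those are the only intervals that exclude zero.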
# Spring '08