STAT 352
Categorical covariates
Data
; i labels level of covariate.
Additive Model:
Alternative form of same model:
Possible linear restrictions on
's.
or
Example
Consider a small version of the car mileage example on assignment 3. Imagine we
have only th
STAT 352
Interaction effects
If a model contains terms for two covariates U and V,
then we define the UV interaction term to be the product UV,
that is, we define a new column of the design matrix which is
the product of the U column and the V column.
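As a small illustration of building the product column, here is a minimal sketch in Python. The function name add_interaction, the column positions and the numbers in the design matrix are all hypothetical; they are not taken from the course data.

```python
def add_interaction(rows, u_index, v_index):
    """Return a copy of each design-matrix row with the product U*V appended."""
    return [row + [row[u_index] * row[v_index]] for row in rows]


# Hypothetical design matrix: columns are intercept, U, V (made-up values).
design = [
    [1.0, 2.0, 3.0],
    [1.0, 0.5, 4.0],
    [1.0, -1.0, 2.0],
]

# The new last column is the elementwise product of the U and V columns.
with_uv = add_interaction(design, 1, 2)
for row in with_uv:
    print(row)
```

The original rows are left untouched; the interaction column is simply one more column that the least squares fit treats like any other covariate.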
Something analogous can be done if one of the
STAT 352
Diagnostics
In addition to the residual plots already discussed there are a number of formal
statistical procedures available for diagnosing problems with the fitted model.
Problems with individual data points
An individual point may be a Y "outl
STAT 352
Goodness-of-fit: Pure Error Sum of Squares
If, for each (or at least sufficiently many)
combination of covariates in a data set, there are
several observations, we can carry out an extra
sum of squares F-test to see if our regression
model is ade
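For reference, the usual form of this extra sum of squares (lack-of-fit) F statistic can be written as follows. The symbols are assumptions introduced here, not notation from the notes: $n$ is the number of observations, $p$ the number of parameters in the regression model, $c$ the number of distinct covariate combinations, $\mathrm{SSE}$ the error sum of squares of the regression model and $\mathrm{SSPE}$ the pure error sum of squares.

```latex
F = \frac{(\mathrm{SSE} - \mathrm{SSPE}) / (c - p)}{\mathrm{SSPE} / (n - c)}
```

Under the usual normal-error assumptions, and if the regression model is adequate, this statistic is compared with the $F_{c-p,\,n-c}$ distribution.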
STAT 352
Goodness-of-fit: Pure Error Sum of Squares, An Example
Plaster hardness data of Lecture 12.
9 different covariate patterns: 3 levels of SAND and 3 levels of FIBRE.
Two ways to compute pure error sum of squares:
o Create new variable with 9 levels
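The idea of treating each covariate pattern as one level of a new variable can be sketched in Python as follows. This is only an illustration: the function name pure_error_ss and the tiny data set are hypothetical, and the numbers are not the plaster hardness data.

```python
from collections import defaultdict


def pure_error_ss(patterns, y):
    """Pure error sum of squares and its degrees of freedom.

    patterns: one hashable covariate pattern per observation
    y:        the corresponding responses
    """
    groups = defaultdict(list)
    for pattern, value in zip(patterns, y):
        groups[pattern].append(value)

    ss = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        # Squared deviations from the mean of this covariate pattern.
        ss += sum((v - mean) ** 2 for v in values)

    df = len(y) - len(groups)  # n minus number of distinct patterns
    return ss, df


# Made-up example: 5 observations, 3 distinct (SAND, FIBRE) patterns.
patterns = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 0)]
y = [1.0, 3.0, 2.0, 2.0, 5.0]
ss, df = pure_error_ss(patterns, y)
print(ss, df)
```

Patterns with a single observation contribute nothing to the sum of squares and nothing to the degrees of freedom, which is why replication is needed for the test.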
STAT 352
Fitting Interactions: Two Way ANOVA
Influence of SCHOOL, REGION on STAY
data scenic;
infile 'scenic.dat' firstobs=2;
input Stay Age Risk Culture Chest Beds
School Region Census Nurses Facil;
proc glm data=scenic;
class school region ;
model Stay
STAT 352
The SCENIC data set, continued
See Lecture 18 for plots of the data and Lecture 19 for our first analysis.
We have found that STAY, CULTURE and CHEST are significant and that we must
retain one of the three variables BEDS, NURSES and CENSUS which
STAT 351
The SCENIC data set
The data set is described in the Appendix of the text. Here I reproduce a page of pairwise scatter plots for all variables except the categorical variables Region and School.
It is evident from the plot that, as expected, seve
STAT 352
The SCENIC data set, continued
See Lecture 18 for plots of the data. Here we fit several models and discuss
interpretation of the coefficients.
I begin by thinking about how the variables should influence Risk. It seems natural
that risk of infec
STAT 352
Quadratic forms, Diagonalization and Eigenvalues
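The function
$x^T A x = \sum_{i,j} a_{ij} x_i x_j$
is a quadratic form. The coefficient of a cross product term like $x_i x_j$ (with $i \ne j$)
is $a_{ij} + a_{ji}$, so
the function is unchanged if each of $a_{ij}$ and $a_{ji}$
is replaced by their average. In
other words we might as well assume th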
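The claim that averaging the two cross product coefficients leaves the quadratic form unchanged is easy to check numerically. The following is a minimal sketch; the function names quad_form and symmetrize, the matrix a and the vector x are hypothetical choices made for illustration.

```python
def quad_form(a, x):
    """Evaluate sum over i, j of a[i][j] * x[i] * x[j]."""
    n = len(x)
    return sum(a[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def symmetrize(a):
    """Replace a[i][j] and a[j][i] by their average."""
    n = len(a)
    return [[(a[i][j] + a[j][i]) / 2 for j in range(n)] for i in range(n)]


# A deliberately non-symmetric coefficient matrix (made-up values).
a = [[1.0, 4.0],
     [2.0, 3.0]]
x = [1.0, 2.0]

a_sym = symmetrize(a)
print(quad_form(a, x), quad_form(a_sym, x))
```

Both evaluations agree, since only the sum of the two off-diagonal coefficients enters the function.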
STAT 352
Instructions: This is an open book test. You
may use notes, text, other books and a
calculator. Your presentations of statistical
analysis will be marked for clarity of
explanation. I expect you to explain what
assumptions you are making and to c