Stat 373 – Ch. 4 - 1
Chapter 4 Assessing Model Fit
To this point, we have built, fit and used a model for a given set of data without
questioning any of the underlying assumptions. In this chapter, we examine the problem
of model fit. Are the assumptions reasonably well met and, if not, what do we do about
it?
In fitting the model
$$y = \hat{\beta}_0 \mathbf{1} + \hat{\beta}_1 x_1 + \cdots + \hat{\beta}_p x_p + \hat{r}$$
and using the corresponding estimators
to construct formal statistical procedures, we are making a number of assumptions about
the underlying probability model
$$Y = \beta_0 \mathbf{1} + \beta_1 x_1 + \cdots + \beta_p x_p + R, \qquad R \sim N(0, \sigma^2 I)$$
For example, we are assuming that:
• the mean vector $E(Y)$ is the specified linear function of the explanatory variates
• the residuals are Gaussian and independent, with constant standard deviation for each
unit in the sample
We can assess these assumptions in several ways. If the sample contains units for which
the explanatory variates are identical, we can use ANOVA to assess the fit. We can also
add extra terms (squares, cross products, etc.) to the proposed model and test whether the
additional terms have significant effects. If they do not, we have greater confidence in the
form of the mean function in the original model.
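As an illustration of the ANOVA idea when some units share the same explanatory variate, the sketch below splits the residual sum of squares from a straight-line fit into a pure-error part (variation among replicates) and a lack-of-fit part, and forms the usual F ratio. It assumes a single explanatory variate and exact replicates; the data and the function names `fit_line` and `lack_of_fit` are made up for this example and do not come from the notes.

```python
from collections import defaultdict


def fit_line(x, y):
    """Least-squares intercept and slope for a straight-line model."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


def lack_of_fit(x, y):
    """Split the residual sum of squares into pure error and lack of fit."""
    b0, b1 = fit_line(x, y)
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

    # Group responses that share the same value of the explanatory variate.
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)

    # Pure error: variation of replicates about their own group mean.
    ss_pe = 0.0
    for ys in groups.values():
        mean = sum(ys) / len(ys)
        ss_pe += sum((yi - mean) ** 2 for yi in ys)

    n, m = len(x), len(groups)
    df_lof, df_pe = m - 2, n - m      # two parameters in the line
    ss_lof = ss_res - ss_pe           # what the line fails to explain
    f_stat = (ss_lof / df_lof) / (ss_pe / df_pe)
    return {"ss_res": ss_res, "ss_pe": ss_pe, "ss_lof": ss_lof,
            "df_lof": df_lof, "df_pe": df_pe, "f": f_stat}


# Hypothetical data: three replicates at each of four x values,
# generated with a curved mean so the straight line should fit poorly.
x = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
y = [1.1, 0.9, 1.0, 3.9, 4.2, 4.0, 9.1, 8.8, 9.0, 16.2, 15.9, 16.0]
result = lack_of_fit(x, y)
print(result)
```

A large F ratio relative to the F distribution with (`df_lof`, `df_pe`) degrees of freedom suggests the straight-line mean function is inadequate. The p-value is left out here because the Python standard library has no F distribution.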
Looking at the Estimated Residuals
We also assess fit by looking for patterns that would be unusual if the model is “true”. If
we find such patterns, we are suspicious about the assumptions underlying the model.
This approach to assessing fit is informal and subjective – we need to be careful not to
over-interpret the plots looking for patterns.
The estimated residuals, the components of the vector $\hat{r}$, are derived from the given
model.
$$\hat{r} = y - \hat{\mu} = y - X\hat{\beta}$$
The corresponding estimator $\tilde{r} = Y - X\tilde{\beta} = (I - H)R$ is a linear combination of the
components of $R$ and hence, according to the model, $\tilde{r} \sim N(0, \sigma^2 (I - H))$. Recall that
$H = X (X^t X)^{-1} X^t$ depends only on $X$. We also know that $\hat{r}$ and $\hat{\mu}$ are orthogonal and,
according to the model, $\tilde{r}$ and $\tilde{\mu}$ are independent.
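These properties of the residuals follow from the matrix $H = X(X^tX)^{-1}X^t$ being symmetric and idempotent. The derivation below is a short sketch, writing the model in matrix form as $Y = X\beta + R$ with $\hat{\mu} = Hy$ and $\tilde{\mu} = HY$. The matrix form and the use of $\operatorname{Var}$ and $\operatorname{Cov}$ are notation assumed here for convenience.

```latex
\begin{align*}
H^2 &= X(X^tX)^{-1}X^tX(X^tX)^{-1}X^t = X(X^tX)^{-1}X^t = H, \qquad H^t = H, \\
\hat{\mu}^t \hat{r} &= (Hy)^t (I-H)\,y = y^t (H - H^2)\, y = 0, \\
\tilde{r} &= (I-H)Y = (I-H)(X\beta + R) = (I-H)R \quad \text{since } HX = X, \\
\operatorname{Var}(\tilde{r}) &= (I-H)\,\sigma^2 I\,(I-H)^t = \sigma^2 (I-H), \\
\operatorname{Cov}(\tilde{\mu}, \tilde{r}) &= H\,\sigma^2 I\,(I-H)^t = \sigma^2 (H - H^2) = 0.
\end{align*}
```

Because $\tilde{\mu}$ and $\tilde{r}$ are jointly Gaussian under the model, zero covariance implies independence.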
If we plot the individual components, the estimated residual $\hat{r}_i$ versus the fitted value
$\hat{\mu}_i$ for $i = 1, \ldots, n$, we should see a plot with no obvious patterns.
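To make the plotted quantities concrete, here is a minimal sketch that computes the fitted values and estimated residuals for a straight-line model with one explanatory variate. It also checks numerically that the residuals sum to zero and are orthogonal to the fitted values. The data and the function name `residuals_and_fitted` are hypothetical, chosen only for illustration.

```python
def residuals_and_fitted(x, y):
    """Return (fitted values, estimated residuals) for a least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx              # estimated slope
    b0 = y_bar - b1 * x_bar       # estimated intercept
    fitted = [b0 + b1 * xi for xi in x]
    resid = [yi - mi for yi, mi in zip(y, fitted)]
    return fitted, resid


# Made-up data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
fitted, resid = residuals_and_fitted(x, y)

# Orthogonality of the residual vector and the fitted-value vector.
dot = sum(m * r for m, r in zip(fitted, resid))
print("sum of residuals:", sum(resid))
print("residuals . fitted:", dot)

# The (fitted, residual) pairs are the points of the residual plot.
for m, r in sorted(zip(fitted, resid)):
    print(f"{m:8.3f}  {r:+.3f}")
```

Both printed checks should be zero up to rounding error. The listed pairs can be passed to any plotting tool to draw the scatterplot of residuals against fitted values.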
Adapted from
Stat 371 Course Notes
© R.J. MacKay University of Waterloo, 2005