Section 3
Danqing Xu
1/27/2017

Section Outline
I
Deletion residuals
I
Cook’s distance
I
Logistic Regression R Example

Case Deletion in Linear Regression
Using the notation from the last section, a subscript (i) means
“with the ith case deleted,” for examples:
I
ˆ
β
(
i
)
is the estimate of
β
computed without case
i
I
X
(
i
)
is the
(
n
-
1
)
×
p
matrix obtained from
X
by deleting the
i
th row
I
Y
(
i
)
is the
(
n
-
1
)
×
1 column vector obtained from
Y
by
deleting the
i
th element
In particular, then
ˆ
β
(
i
)
=
X
T
(
i
)
X
(
i
)
-
1
X
T
(
i
)
Y
(
i
)

Deleted Residual
If we let
I
y
i
denote the observed response for the
i
th case, and
I
ˆ
y
j
(
i
)
denote the predicted response for the
j
th case based on
the estimated model with the
i
th case deleted
then the
i
th deleted residual is defined as:
d
i
=
y
i
-
ˆ
y
i
(
i
)

Studentized Residual
I
Deleted residuals depend on the units of measurement just as
the ordinary residuals do. We can solve this problem though by
dividing each deleted residual by an estimate of its standard
deviation. That’s where “studentized residuals” come into play.
I
The studentized residual is defined as
t
i
=
y
i
-
ˆ
y
i
(
i
)
ˆ
σ
(
i
)
1
+
x
T
i
X
T
(
i
)
X
(
i
)
-
1
x
i
where
x
T
i
(dimension 1
×
p
) is the
i
th row of
X
matrix
(dimension
n
×
p
)
I
*A statistic divided by its estimated standard deviation is
usually called a
studentized statistic
, in honor of W.S.Gosset,
who first wrote about the t-distribution using the pseudonym
Student.


6

