3.5 The Partial Autocorrelation Function
Consider again the problem of forecasting $X_{T+1}$ from the observations $X_T, X_{T-1}, \ldots, X_2, X_1$. Denoting, as before, the best linear predictor by
$$P_T X_{T+1} = a_1 X_T + a_2 X_{T-1} + \cdots + a_{T-1} X_2 + a_T X_1,$$
we can express $X_{T+1}$ as
$$X_{T+1} = P_T X_{T+1} + Z_{T+1} = a_1 X_T + a_2 X_{T-1} + \cdots + a_{T-1} X_2 + a_T X_1 + Z_{T+1},$$
where $Z_{T+1}$ denotes the forecast error, which is uncorrelated with $X_T, \ldots, X_1$.
We can now ask whether $X_1$ contributes to the forecast of $X_{T+1}$ after controlling for $X_T, X_{T-1}, \ldots, X_2$ or, equivalently, whether $a_T$ is equal to zero. Thus, $a_T$ can be viewed as a measure of the importance of the additional information provided by $X_1$. It is referred to as the \emph{partial autocorrelation}.
In the case of an AR($p$) process, all the information useful for forecasting $X_{T+1}$, $T > p$, is contained in the last $p$ observations, so that $a_T = 0$. In the case of an MA process, the observations $X_T, \ldots, X_1$ can be used to retrieve the unobserved $Z_T, Z_{T-1}, \ldots, Z_{T-q+1}$. As $Z_t$ is an infinite weighted sum of past $X_t$'s, every new observation contributes to recovering the $Z_t$'s. Thus, the partial autocorrelation $a_T$ is not zero. Taking $T$ successively equal to 0, 1, 2, etc., we get the partial autocorrelation function (PACF).
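These cutoff properties can be checked numerically. The following sketch (a minimal illustration, not from the text; the function name `pacf_ols` is mine) estimates $a_T$ by regressing $X_t$ on its last $T$ lags and keeping the coefficient on the most distant lag, for a simulated AR(1) process whose PACF should vanish beyond lag 1.

```python
import numpy as np

def pacf_ols(x, max_lag):
    """Partial autocorrelations a_T: the coefficient on the most distant
    lag when x[t] is regressed on its last T lags, for T = 1..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    out = []
    for T in range(1, max_lag + 1):
        # Design matrix: columns x[t-1], ..., x[t-T] aligned with y = x[t].
        X = np.column_stack([x[T - k - 1 : len(x) - k - 1] for k in range(T)])
        y = x[T:]
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        out.append(coef[-1])  # coefficient on the T-th (most distant) lag
    return np.array(out)

# Simulate an AR(1) with coefficient 0.8: its PACF should cut off after lag 1.
rng = np.random.default_rng(0)
n = 5000
z = rng.standard_normal(n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.8 * x[t - 1] + z[t]

pacf = pacf_ols(x, 3)
```

With 5000 observations the lag-1 estimate is close to the autoregressive coefficient 0.8, while the higher-order partial autocorrelations are statistically indistinguishable from zero.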
We can, however, interpret the above equation as a regression equation. From the Frisch-Lovell-Waugh Theorem (see Davidson and MacKinnon, 1993), we can obtain $a_T$ by a two-stage procedure. In a first stage, project (regress) $X_{T+1}$ on $X_T, \ldots, X_2$ and take the residual. Similarly, project (regress) $X_1$ on $X_T, \ldots, X_2$ and take the residual. The coefficient $a_T$ is then obtained by projecting (regressing) the first residual on the second. Stationarity implies that this is nothing but the correlation coefficient between the two residuals.
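The two-stage equivalence is an exact algebraic identity of least squares, which a short numerical check makes concrete. In the sketch below (synthetic data; the variable names and the choice $T = 5$ are mine), the coefficient on $X_1$ from the full regression coincides with the coefficient from the two-stage residual regression.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 400, 5  # n draws of (X_1, ..., X_T, X_{T+1}); T = 5 is illustrative

# Synthetic data: any full-rank design works, since the Frisch-Lovell-Waugh
# result is an algebraic identity, not a statistical approximation.
X = rng.standard_normal((n, T))                       # columns: X_T, ..., X_1
y = X @ np.array([0.5, 0.3, 0.2, 0.1, 0.4]) + 0.5 * rng.standard_normal(n)

# Full regression of X_{T+1} on X_T, ..., X_1: a_T is the coefficient on X_1.
coef_full, *_ = np.linalg.lstsq(X, y, rcond=None)
a_T_direct = coef_full[-1]

def residual(a, B):
    """Residual from regressing the vector a on the columns of B."""
    coef, *_ = np.linalg.lstsq(B, a, rcond=None)
    return a - B @ coef

# Two-stage procedure: purge both X_{T+1} and X_1 of the controls
# X_T, ..., X_2, then regress the first residual on the second.
W = X[:, :-1]                 # the controls X_T, ..., X_2
r_y = residual(y, W)          # X_{T+1} purged of the controls
r_1 = residual(X[:, -1], W)   # X_1 purged of the controls
a_T_fwl = (r_1 @ r_y) / (r_1 @ r_1)
```

The two estimates agree up to floating-point precision, which is exactly what the theorem asserts.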