D.S.G. POLLOCK: FILTERS FOR ECONOMETRIC DATA
Wiener–Kolmogorov Filtering of Stationary Sequences
The classical theory of linear filtering was formulated independently by Norbert
Wiener (1941) and Andrei Nikolaevich Kolmogorov (1941) during the Second
World War. They were both considering the problem of how to target radar-assisted anti-aircraft guns on incoming enemy aircraft.
The theory has found widespread application in analog and digital signal
processing and in telecommunications in general. Also, it has provided a basic
technique for the enhancement of recorded music.
The classical theory assumes that the data sequences are generated by
stationary stochastic processes and that these are of sufficient length to justify
the assumption that they constitute doubly-infinite sequences.
For econometrics, the theory must be adapted to cater to short trended
sequences. Then, Wiener–Kolmogorov filters can be used to extract trends from
economic data sequences and to generate seasonally adjusted data.
Consider a vector $y$ with a signal component $\xi$ and a noise component $\eta$:
$$y = \xi + \eta. \tag{1}$$
These components are assumed to be independently normally distributed with
zero means and with positive-definite dispersion matrices. Then,
$$E(\xi) = 0, \quad D(\xi) = \Omega_\xi, \quad E(\eta) = 0, \quad D(\eta) = \Omega_\eta \quad\text{and}\quad C(\xi, \eta) = 0. \tag{2}$$
A consequence of the independence of $\xi$ and $\eta$ is that
$$D(y) = \Omega_\xi + \Omega_\eta \quad\text{and}\quad C(\xi, y) = D(\xi) = \Omega_\xi. \tag{3}$$
The signal component is estimated by a linear transformation $x = \Psi_x y$
of the data vector that suppresses the noise component. Usually, the signal
comprises low-frequency elements and the noise comprises elements of higher
frequencies.
The Minimum Mean-Squared Error Estimator
The principle of linear minimum mean-squared error estimation indicates that
the error $\xi - x$ in representing $\xi$ by $x$ should be uncorrelated with the data in
$y$:
$$\begin{aligned} 0 &= C(\xi - x, y) = C(\xi, y) - C(x, y) \\ &= C(\xi, y) - \Psi_x C(y, y) \\ &= \Omega_\xi - \Psi_x(\Omega_\xi + \Omega_\eta). \end{aligned} \tag{4}$$
The solution is $\Psi_x = \Omega_\xi(\Omega_\xi + \Omega_\eta)^{-1}$ and the estimate of the signal is
$$x = \Psi_x y = \Omega_\xi(\Omega_\xi + \Omega_\eta)^{-1} y. \tag{5}$$
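The corresponding estimate of the noise component $\eta$ is
$$\begin{aligned} h = \Psi_h y &= \Omega_\eta(\Omega_\xi + \Omega_\eta)^{-1} y \\ &= \{I - \Omega_\xi(\Omega_\xi + \Omega_\eta)^{-1}\} y. \end{aligned} \tag{6}$$
It will be observed that $\Psi_x + \Psi_h = I$ and, therefore, that $x + h = y$.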
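The estimators of (5) and (6) can be checked numerically. What follows is only a minimal sketch, assuming NumPy; the dispersion matrices are arbitrary positive-definite matrices generated for illustration, and the helper `random_spd` and all variable names are hypothetical rather than taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 6

def random_spd(n, rng):
    """Return an arbitrary symmetric positive-definite matrix of order n."""
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

Omega_xi = random_spd(T, rng)    # D(xi), dispersion of the signal (illustrative)
Omega_eta = random_spd(T, rng)   # D(eta), dispersion of the noise (illustrative)
Omega_y = Omega_xi + Omega_eta   # D(y), as in equation (3)

# Psi_x = Omega_xi (Omega_xi + Omega_eta)^{-1}.  Because Omega_y and Omega_xi
# are symmetric, solving Omega_y X = Omega_xi gives X = Psi_x', so we transpose.
Psi_x = np.linalg.solve(Omega_y, Omega_xi).T
Psi_h = np.linalg.solve(Omega_y, Omega_eta).T

y = rng.standard_normal(T)       # an arbitrary data vector
x = Psi_x @ y                    # signal estimate, equation (5)
h = Psi_h @ y                    # noise estimate, equation (6)

# Orthogonality condition (4): Omega_xi - Psi_x (Omega_xi + Omega_eta) = 0
orthogonality_gap = Omega_xi - Psi_x @ Omega_y
```

Using `np.linalg.solve` instead of forming an explicit inverse is a design choice for numerical stability; the two filter matrices should sum to the identity, so that `x + h` reproduces `y`.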
Conditional Expectations
In deriving the estimator, we might have used the formula for conditional expectations. In the case of two linearly related scalar random variables $\xi$ and $y$,
the conditional expectation of $\xi$ given $y$ is
$$E(\xi \mid y) = E(\xi) + \frac{C(\xi, y)}{V(y)}\{y - E(y)\}. \tag{7}$$
In the case of two vector quantities, this becomes
$$E(\xi \mid y) = E(\xi) + C(\xi, y)D^{-1}(y)\{y - E(y)\}. \tag{8}$$
By setting $C(\xi, y) = \Omega_\xi$ and $D(y) = \Omega_\xi + \Omega_\eta$, as in (3), and by setting $E(\xi) = E(y) = 0$, we get the expression that is to be
found under (5):
$$x = \Omega_\xi(\Omega_\xi + \Omega_\eta)^{-1} y.$$
The Difference Operator and Polynomial Regression
The lag operator $L$, which is commonly defined in respect of a doubly-infinite
sequence $x(t) = \{x_t;\ t = 0, \pm 1, \pm 2, \ldots\}$, has the effect that $Lx(t) = x(t-1)$.
The (backwards) difference operator $\nabla = 1 - L$ has the effect that $\nabla x(t) = x(t) - x(t-1)$. It serves to reduce a constant function to zero and to reduce a
linear function to a constant. The second-order or twofold difference operator
$$\nabla^2 = 1 - 2L + L^2$$
is effective in reducing a linear function to zero.
A difference operator $\nabla^d$ of order $d$ is commonly employed in the context
of an ARIMA$(p, d, q)$ model to reduce the data to stationarity. Then, the differenced data can be modelled by an ARMA$(p, q)$ process. In such circumstances,
the difference operator takes the form of a matrix transformation.
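The effects of the difference operators on constant and linear functions can be illustrated with a short sketch. It assumes NumPy, whose `np.diff` function computes the differences x(t) − x(t − 1) over a finite sample; the sample length and the coefficients are arbitrary choices made for illustration.

```python
import numpy as np

t = np.arange(10, dtype=float)       # a short time index, chosen arbitrarily
constant = np.full_like(t, 3.0)      # a constant function
linear = 2.0 + 0.5 * t               # a linear function with slope 0.5

d_constant = np.diff(constant)       # first differences of the constant
d_linear = np.diff(linear)           # first differences of the linear function
d2_linear = np.diff(linear, n=2)     # twofold differences of the linear function
```

Each application of the operator shortens a finite sample by one element, which is why `d2_linear` has two fewer elements than `linear`.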
Figure 1. The squared gain of the difference operator, which has a zero at zero
frequency, and the squared gain of the summation operator, which is unbounded at
zero frequency.
The Matrix Difference Operator
The matrix analogue of the second-order difference operator in the case of
$T = 5$, for example, is given by
$$\nabla_5^2 = \begin{bmatrix} Q_*' \\ Q' \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -2 & 1 & 0 & 0 & 0 \\ 1 & -2 & 1 & 0 & 0 \\ 0 & 1 & -2 & 1 & 0 \\ 0 & 0 & 1 & -2 & 1 \end{bmatrix}. \tag{9}$$
The first two rows, which form $Q_*'$ and which do not produce true differences, are liable to be
discarded.
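The matrix of (9) can be built for any sample size by squaring the matrix version of 1 − L. The following is a minimal sketch, assuming NumPy; the function name `second_difference_matrix` and the variable names are hypothetical.

```python
import numpy as np

def second_difference_matrix(T):
    """Return the T x T matrix analogue of (1 - L)^2."""
    L = np.eye(T, k=-1)        # lag matrix: units on the first subdiagonal
    D = np.eye(T) - L          # matrix analogue of the operator 1 - L
    return D @ D               # I - 2L + L^2

T = 5
nabla2 = second_difference_matrix(T)
Q_star_t = nabla2[:2, :]       # first two rows (Q_*'), which are not true differences
Q_t = nabla2[2:, :]            # remaining T - 2 rows (Q')

# Q' annihilates any linear function of the time index.
t = np.arange(T, dtype=float)
linear = 1.5 + 0.25 * t        # arbitrary intercept and slope
differenced_linear = Q_t @ linear
```

Retaining only the rows of `Q_t` gives a transformation that maps a linear trend to a zero vector of T − 2 elements.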
The difference operator nullifies data elements at zero frequency and it
severely attenuates those at the adjacent frequencies. This is a disadvantage
when the low-frequency elements are of primary interest. Another way of
detrending the data is to fit a polynomial trend by least-squares regression and
to take the residual sequence as the detrended data.
Polynomial Regression
Using the matrix $Q$ defined above, we can represent the vector of the ordinates
of a linear trend line interpolated through the data sequence as
$$x = y - Q(Q'Q)^{-1}Q'y. \tag{10}$$
The vector of the residuals is
$$e = Q(Q'Q)^{-1}Q'y. \tag{11}$$
Observe that this vector contains exactly the same information as the
differenced vector $g = Q'y$. However, whereas the low-frequency structure of
the data is invisible in the periodogram of the latter, it is entirely visible in
the periodogram of the residuals.
Figure 2. The quarterly series of the logarithms of consumption in the U.K., for
the years 1955 to 1994, together with a linear trend interpolated by least-squares
regression.
Figure 3. The periodogram of the trended logarithmic data.
Figure 4. The periodogram of the differenced logarithmic consumption data.
Figure 5. The periodogram of the residual sequence obtained from the linear detrending of the logarithmic consumption data.
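That equation (10) reproduces an ordinary least-squares trend line can be checked numerically. The sketch below assumes NumPy and uses a synthetic series, not the U.K. consumption data of the figures; all variable names are hypothetical.

```python
import numpy as np

T = 40
t = np.arange(T, dtype=float)
rng = np.random.default_rng(1)
# Synthetic trended data, generated only for illustration.
y = 10.0 + 0.01 * t + 0.05 * rng.standard_normal(T)

# Q' holds the last T - 2 rows of the matrix analogue of (1 - L)^2.
L = np.eye(T, k=-1)
D = np.eye(T) - L
Q = (D @ D)[2:, :].T                         # Q has T rows and T - 2 columns

# Equation (11): residuals e = Q (Q'Q)^{-1} Q'y; equation (10): trend x = y - e.
e = Q @ np.linalg.solve(Q.T @ Q, Q.T @ y)
x = y - e

# Comparison: ordinary least-squares regression on an intercept and time.
Z = np.column_stack([np.ones(T), t])
beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
x_ols = Z @ beta
```

The two trend vectors should agree because the columns of `Q` span the orthogonal complement of the space of linear functions of time. Differencing the residuals with `Q.T` also recovers the differenced vector g, which is the sense in which the two contain the same information.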
Filters for Short Trended Sequences
Applying $Q'$ to the equation $y = \xi + \eta$, representing the trended data, gives
$$Q'y = Q'\xi + Q'\eta = \delta + \kappa = g. \tag{12}$$
The vectors of the expectations and the dispersion matrices of the differenced
vectors are
$$\begin{aligned} E(\delta) &= 0, & D(\delta) &= \Omega_\delta = Q'D(\xi)Q, \\ E(\kappa) &= 0, & D(\kappa) &= \Omega_\kappa = Q'D(\eta)Q. \end{aligned} \tag{13}$$
The difficulty of estimating the trended vector $\xi = y - \eta$ directly is that some
starting values or initial conditions are required in order to define the value at
time $t = 0$. However, since $\eta$ is from a stationary mean-zero process, it requires
only zero-valued initial conditions. Therefore, the starting-value problem can
be circumvented by concentrating on the estimation of $\eta$.
The conditional expectation of $\eta$, given the differenced data $g = Q'y$, is
provided by the formula
$$\begin{aligned} h = E(\eta \mid g) &= E(\eta) + C(\eta, g)D^{-1}(g)\{g - E(g)\} \\ &= C(\eta, g)D^{-1}(g)g, \end{aligned} \tag{14}$$
where the second equality follows in view of the zero-valued expectations.
Within this expression, there are
$$D(g) = \Omega_\delta + Q'\Omega_\eta Q \quad\text{and}\quad C(\eta, g) = \Omega_\eta Q. \tag{15}$$
Putting these details into (14) gives the following estimate of $\eta$:
$$h = \Omega_\eta Q(\Omega_\delta + Q'\Omega_\eta Q)^{-1}Q'y. \tag{16}$$
Putting this into the equation $x = y - h$ gives
$$x = y - \Omega_\eta Q(\Omega_\delta + Q'\Omega_\eta Q)^{-1}Q'y. \tag{17}$$
Figure 6. The gain of the Hodrick–Prescott lowpass filter with a smoothing parameter set to 100, 1,600 and 14,400.
The Leser (H–P) Filter
We now consider two specific cases of the Wiener–Kolmogorov filter. First,
there is the Leser or Hodrick–Prescott (H–P) filter. This can be derived from
a model that supposes that the signal is generated by an integrated (second-order) random walk and that the noise is from a white-noise process.
The random walk process is reduced to a white-noise process $\delta(t)$ by taking
twofold differences. Thus, $(1 - L)^2\xi(t) = \delta(t)$, and the corresponding equation
for the sample is $Q'\xi = \delta$. Accordingly, the filter is derived by setting
$$D(\eta) = \Omega_\eta = \sigma_\eta^2 I, \quad D(\delta) = \Omega_\delta = \sigma_\delta^2 I \quad\text{and}\quad \lambda = \frac{\sigma_\eta^2}{\sigma_\delta^2} \tag{18}$$
within (17) to give
$$x = y - Q(\lambda^{-1}I + Q'Q)^{-1}Q'y. \tag{19}$$
Here, $\lambda$ is the so-called smoothing parameter. It will be observed that, as
$\lambda \to \infty$, the vector $x$ tends to that of a linear function interpolated into the
data by least-squares regression, which is represented by equation (10):
$$x = y - Q(Q'Q)^{-1}Q'y.$$
Figure 7. The gain of the lowpass Butterworth filters of orders n = 6 and
n = 12 with a nominal cutoff point of 2π/3 radians.
Figure 6 depicts the frequency response of the lowpass H–P filter for various
values of the smoothing parameter λ. The innermost profile corresponds to the
highest value of the parameter, and it represents a filter that transmits only
the data elements of lowest frequency.
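The finite-sample filters of equations (17) and (19) can be sketched in a few lines. The code below assumes NumPy and a synthetic data series; the function names `second_difference_Q`, `wk_trend` and `hp_trend` are hypothetical, and dense matrices are used for clarity rather than efficiency.

```python
import numpy as np

def second_difference_Q(T):
    """Return Q (T rows, T - 2 columns) such that Q'y holds the twofold differences."""
    L = np.eye(T, k=-1)
    D = np.eye(T) - L
    return (D @ D)[2:, :].T

def wk_trend(y, Omega_delta, Omega_eta):
    """Equation (17): x = y - Omega_eta Q (Omega_delta + Q' Omega_eta Q)^{-1} Q'y."""
    Q = second_difference_Q(len(y))
    g = Q.T @ y
    h = Omega_eta @ Q @ np.linalg.solve(Omega_delta + Q.T @ Omega_eta @ Q, g)
    return y - h

def hp_trend(y, lam):
    """Equation (19): x = y - Q (lam^{-1} I + Q'Q)^{-1} Q'y."""
    Q = second_difference_Q(len(y))
    h = Q @ np.linalg.solve(np.eye(Q.shape[1]) / lam + Q.T @ Q, Q.T @ y)
    return y - h

# Synthetic trended series, for illustration only.
rng = np.random.default_rng(2)
T = 60
t = np.arange(T, dtype=float)
y = 10.0 + 0.02 * t + 0.1 * np.sin(t / 6.0) + 0.05 * rng.standard_normal(T)

lam = 1600.0
x_hp = hp_trend(y, lam)
# The same trend from (17), with sigma_delta^2 = 1 and sigma_eta^2 = lam.
x_wk = wk_trend(y, np.eye(T - 2), lam * np.eye(T))

# A very large smoothing parameter should approach the least-squares line of (10).
x_big = hp_trend(y, 1e12)
Z = np.column_stack([np.ones(T), t])
x_lin = Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
```

For long series, a banded or sparse solver would be the more natural choice, since λ⁻¹I + Q′Q is pentadiagonal; the dense version is kept here only because it mirrors the formulae directly.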
For all values of λ, the response of the H–P filter shows a gradual transition
from the pass band, which corresponds to the frequencies that are transmitted
by the filter, to the stop band, which corresponds to the frequencies that are
impeded.
Often, there is a requirement for a more rapid transition as well as a need
to control the location in frequency where the transition occurs. These needs
can be served by the Butterworth filter, which is more amenable to adjustment.
The Butterworth Filter
The Butterworth filter can be derived from an heuristic model in which the
signal and the noise are generated by processes that are described, respectively,
by the equations $(1 - L)^2\xi(t) = (1 + L)^n\zeta(t)$ and $(1 - L)^2\eta(t) = (1 - L)^n\varepsilon(t)$,
where $\zeta(t)$ and $\varepsilon(t)$ are mutually independent white-noise processes.
The filter that is appropriate to short trended sequences can be represented
by the equation
$$x = y - \lambda\Sigma Q(M + \lambda Q'\Sigma Q)^{-1}Q'y. \tag{20}$$
Here, the matrices are
$$\Sigma = \{2I_T - (L_T + L_T')\}^{n-2} \quad\text{and}\quad M = \{2I_T + (L_T + L_T')\}^{n}, \tag{21}$$
where $L_T$ is a matrix of order $T$ with units on the first subdiagonal; and it can
be verified that
$$Q'\Sigma Q = \{2I_T - (L_T + L_T')\}^{n}. \tag{22}$$
Figure 7 shows the frequency response of the Butterworth filter for various
values of $n$ and for a specific cutoff frequency, which is determined by the
parameter $\lambda$. The greater the value of $n$, the more rapid is the transition from
pass band to stop band.
This note was uploaded on 03/02/2012 for the course EC 7087 taught by Professor D.S.G. Pollock during the Fall '11 term at Queen Mary, University of London.