FILTERING MACROECONOMIC DATA
By D.S.G. Pollock
University of Leicester
Email: stephen pollock@sigmapi.unet.com
This chapter sets forth the theory of linear filtering together with an accompanying frequency-domain analysis. It employs the classical Wiener–Kolmogorov theory in describing some of the filters that are used by econometricians. This theory, which was developed originally in reference to stationary stochastic processes defined on a doubly-infinite index set, is adapted to cater to short nonstationary sequences. An alternative methodology of filtering is also described. This operates in the frequency domain, by altering the amplitudes of the trigonometrical functions that are the elements of the Fourier decomposition of the detrended data.

1. Introduction
The purpose of a ﬁlter is to remove unwanted components from a stream of data
so as to enhance the clarity of the components of interest. In many engineering
applications and in some econometric applications, there is a single component
of interest, described as the signal, to which a component has been added that
can be described as the noise.
A complete separation of the signal and the noise is possible only if they
reside in separate frequency bands. If they reside in overlapping frequency
bands, then their separation is bound to be tentative. The signal typically
comprises elements of low frequency and the noise comprises elements of higher
frequencies. Filters are, therefore, designed by engineers with reference to their
frequency-selective properties.
In econometric applications, some additional components must be taken
into account. The foremost of these is the trend, which may be deﬁned as an
underlying trajectory of the data that cannot be synthesised from trigonometrical functions alone. It is diﬃcult to give a more speciﬁc deﬁnition, which
may account for the wide variety of procedures that have been proposed for
extracting trends from the economic data. A business cycle component might
also be extracted from the data; but this is often found in combination with
the trend.
Another component that is commonly present, if it has not been removed
already by the providers of the economic data, is a pattern of seasonal ﬂuctuations. In this case, given that the ﬂuctuations reside in limited frequency
bands, it is easier to provide a speciﬁc deﬁnition of the seasonal component,
albeit that there is still scope for alternative deﬁnitions.
Notwithstanding the ill-defined nature of these components, econometricians have tended to adopt particular models for the trend and for the seasonal
fluctuations. The trend is commonly modelled by a first-order random walk
with drift, which is an accumulation of a white-noise sequence of independently
and identically distributed random variables. The drift occurs when the variables have a nonzero mean, with a positive mean giving rise to an upward drift.
Second-order processes involving a double accumulation of white noise are also
used to model the trend.
Econometricians commonly model the seasonal fluctuations by autoregressive moving-average processes in which the autoregressive operator contains
complex roots with moduli of unity and with arguments that correspond to the
fundamental seasonal frequency and to the harmonically related frequencies.
The moving-average operator is usually instrumental in confining the effects of
these roots to the vicinities of the seasonal frequencies.
Given a complete statistical specification of the processes generating the
data, it is possible to derive the filters that provide the minimum mean-squared
error estimates of the various components. This approach has been followed
in the TRAMO–SEATS program of Caporello and Maravall (2005) for decomposing an econometric data sequence into its components. An account of their
methods will be given in the penultimate section of this chapter.
The structural time series methodology that has been incorporated in the
STAMP computer package of Koopman et al. (2000) follows a similar approach. The STAMP program employs the Kalman filter, which is discussed
elsewhere in this Handbook in the chapter by Tommaso Proietti. This powerful
and all-encompassing method is capable of dealing with nonstationary data
processes, provided that there are models to describe them.
Whereas the model-based approach to filtering has led to some refined
computer programs that can often be used automatically to process the data,
there are circumstances in which a significant mismatch occurs between the
data and the models. Then, some alternative methods must be pursued which
can be adapted more readily to reflect the properties of the data. An aim
of this chapter is to describe some methods that meet this requirement.
In this chapter, we shall also employ some statistical models of the processes underlying the data. However, these will be heuristic models rather
than models that purport to be realistic. Their purpose is to enable us to
derive filters that are endowed with whatever are the appropriate frequency-selective capabilities. Thus, the specifications of the resulting filters are to be
determined flexibly in the light of the properties of the data.
In deriving these filters, we use an extension of the time-honoured Wiener–Kolmogorov
principle, which is intended to provide minimum mean-squared
error estimates of the components whenever these are truly described by the
models. The original Wiener–Kolmogorov theory was based on the assumption that the data are generated by stationary stochastic processes. Therefore,
we have to adapt the theory to cater to nonstationary processes.
An alternative methodology will also be described that approaches the
matter of frequency selection in a direct manner that does not depend on any
models of the data. The resulting procedures, which employ what may be
described as frequency-domain filters, perform the essential operations upon the
trigonometrical functions that are the elements of the Fourier decomposition of
the detrended data.
An advantage of these ﬁlters is that they enable one to separate elements
that are at adjacent frequencies. Such sharp divisions of the frequency contents of the data cannot be achieved by the time-domain filters, which operate
directly on the data, without incurring severe problems of numerical instability.
Some mathematical results must be provided in order to support the analysis of ﬁltering. Some of these results will be presented at the outset in the
sections that follow this introduction. Other results will be dispersed throughout the text. We shall begin with some basic deﬁnitions.
Linear Time Invariant Filters
Whenever we form a linear combination of successive elements of a discrete-time signal $x(t) = \{x_t;\ t = 0, \pm 1, \pm 2, \ldots\}$, we are performing an operation that
is described as linear filtering. In the case of a linear time-invariant filter, such
an operation can be represented by the equation

$$ y(t) = \sum_j \psi_j x(t-j). \tag{1} $$

To assist in the algebraic manipulation of such equations, we may convert the
data sequences $x(t)$ and $y(t)$ and the sequence of filter coefficients $\{\psi_j\}$ into
power series or polynomials. By associating $z^t$ to each element $y_t$ and by
summing the sequence, we get

$$ \sum_t y_t z^t = \sum_t \Bigl\{\sum_j \psi_j x_{t-j}\Bigr\} z^t \quad\text{or}\quad y(z) = \psi(z)x(z), \tag{2} $$

where

$$ x(z) = \sum_t x_t z^t, \quad y(z) = \sum_t y_t z^t \quad\text{and}\quad \psi(z) = \sum_j \psi_j z^j. \tag{3} $$

The convolution operation of equation (1) becomes an operation of polynomial
multiplication in equation (2). We are liable to describe the z-transform $\psi(z)$
of the filter coefficients as the transfer function of the filter.
For a treatise on the z transform, see Jury (1964).
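The equivalence of the convolution of (1) and the polynomial multiplication of (2) can be checked numerically. The following is a minimal sketch in Python; the coefficient values and the function names `convolve` and `filter_direct` are illustrative assumptions, and the input is taken to be zero outside its recorded range.

```python
def convolve(psi, x):
    """Coefficients of the polynomial product psi(z) * x(z)."""
    out = [0.0] * (len(psi) + len(x) - 1)
    for j, p in enumerate(psi):
        for k, v in enumerate(x):
            out[j + k] += p * v      # the term psi_j x_k contributes to z^(j+k)
    return out


def filter_direct(psi, x):
    """y_t = sum_j psi_j x_{t-j}, with x_t taken to be zero outside its range."""
    n = len(psi) + len(x) - 1
    return [
        sum(psi[j] * x[t - j] for j in range(len(psi)) if 0 <= t - j < len(x))
        for t in range(n)
    ]


psi = [0.25, 0.5, 0.25]        # hypothetical filter coefficients
x = [1.0, 2.0, 0.0, -1.0]      # hypothetical input sequence
y_poly = convolve(psi, x)      # coefficients of y(z) = psi(z) x(z)
y_direct = filter_direct(psi, x)
```

Both lists hold the same values, element by element, which is what equation (2) asserts.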
The Impulse Response
The sequence $\{\psi_j\}$ of the filter's coefficients constitutes its response, on the
output side, to an input in the form of a unit impulse. If the sequence is finite,
then $\psi(z)$ is described as a moving-average filter or as a finite impulse-response
(FIR) filter. When the filter produces an impulse response of an indefinite
duration, it is called an infinite impulse-response (IIR) filter. The filter is said
to be causal or backward-looking if none of its coefficients is associated with
a negative power of $z$. In that case, the filter is available for real-time signal
processing.
Causal Filters
A practical filter, which is constructed from a limited number of components of hardware or software, must be capable of being expressed in terms of a
finite number of parameters. Therefore, linear IIR filters which are causal will
invariably entail recursive equations of the form

$$ \sum_{j=0}^{p} \phi_j y_{t-j} = \sum_{j=0}^{q} \theta_j x_{t-j}, \quad\text{with}\quad \phi_0 = 1, \tag{4} $$

of which the z-transform is

$$ \phi(z)y(z) = \theta(z)x(z), \tag{5} $$

wherein $\phi(z) = \phi_0 + \phi_1 z + \cdots + \phi_p z^p$ and $\theta(z) = \theta_0 + \theta_1 z + \cdots + \theta_q z^q$ are finite-degree polynomials. The leading coefficient of $\phi(z)$ may be set to unity without
loss of generality; and thus the output sequence $y(t)$ in equation (4) becomes
a function not only of past and present inputs but also of past outputs, which
are described as feedback.

The recursive equation may be assimilated to the equation under (2) by
writing it in rational form:

$$ y(z) = \frac{\theta(z)}{\phi(z)}x(z) = \psi(z)x(z). \tag{6} $$

On the condition that the filter is stable, the expression $\psi(z)$ stands for the
series expansion of the ratio of the polynomials.
The stability of a rational transfer function $\theta(z)/\phi(z)$ can be investigated
via its partial-fraction decomposition, which gives rise to a sum of simpler
transfer functions that can be analysed readily. If the degree of the numerator
$\theta(z)$ exceeds that of the denominator $\phi(z)$, then long division can be used to
obtain a quotient polynomial and a remainder that is a proper rational function.
The quotient polynomial will correspond to a stable transfer function; and the
remainder will be the subject of the decomposition.
Assume that $\theta(z)/\phi(z)$ is a proper rational function in which the denominator is factorised as

$$ \phi(z) = \prod_{j=1}^{r} (1 - z/\lambda_j)^{n_j}, \tag{7} $$

where $n_j$ is the multiplicity of the root $\lambda_j$, and where $\sum_j n_j = p$ is the degree of
the polynomial. Then, the so-called Heaviside partial-fraction decomposition
is

$$ \frac{\theta(z)}{\phi(z)} = \sum_{j=1}^{r}\sum_{k=1}^{n_j} \frac{c_{jk}}{(1 - z/\lambda_j)^k}; \tag{8} $$

and the task is to find the series expansions of the partial fractions. The stability
of the transfer function depends upon the convergence of these expansions. For
this, the necessary and sufficient condition is that $|\lambda_j| > 1$ for all $j$, which is
to say that all of the roots of the denominator polynomial must lie outside the
unit circle in the complex plane.

Figure 1. The impulse response of the transfer function $\theta(z)/\phi(z)$ with
$\phi(z) = 1.0 - 1.2728z + 0.81z^2$ and $\theta(z) = 1.0 + 0.75z$.
The expansions of a pair of partial fractions with conjugate complex roots
will combine to produce a sinusoidal sequence. The expansion of a partial
fraction containing a root of multiplicity n will be equivalent to the n-fold
autoconvolution of the expansion of a simple fraction containing the root.
It is helpful to represent the roots of the denominator polynomial, which
are described as the poles of the transfer function, together with the roots of
the numerator polynomial, which are described as the zeros, by showing their
locations graphically within the complex plane.
It is more convenient to represent the poles and zeros of θ(z −1 )/φ(z −1 ),
which are the reciprocals of those of θ(z )/φ(z ). For a stable and invertible
transfer function, these must lie within the unit circle. This recourse has been
adopted for Figure 2, which shows the pole–zero diagram for the transfer function that gives rise to Figure 1.
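The stability condition and the impulse response can be examined numerically for the transfer function of Figure 1. The sketch below, in Python, finds the roots of the quadratic denominator, takes their reciprocals to obtain the poles of the pole–zero diagram, and generates the impulse response by feeding a unit impulse through the recursion of (4). The function names are illustrative assumptions, not part of the original text.

```python
import cmath

phi = [1.0, -1.2728, 0.81]    # denominator coefficients phi_0, phi_1, phi_2
theta = [1.0, 0.75]           # numerator coefficients theta_0, theta_1


def quadratic_roots(c0, c1, c2):
    """Roots of c0 + c1*z + c2*z**2 = 0."""
    disc = cmath.sqrt(c1 * c1 - 4.0 * c2 * c0)
    return ((-c1 + disc) / (2.0 * c2), (-c1 - disc) / (2.0 * c2))


def impulse_response(theta, phi, n):
    """First n coefficients of the series expansion of theta(z)/phi(z)."""
    psi = []
    for t in range(n):
        acc = theta[t] if t < len(theta) else 0.0
        for j in range(1, min(t, len(phi) - 1) + 1):
            acc -= phi[j] * psi[t - j]       # feedback from past outputs
        psi.append(acc / phi[0])
    return psi


roots = quadratic_roots(*phi)                  # roots lambda_j of phi(z)
stable = all(abs(r) > 1.0 for r in roots)      # all roots outside the unit circle
poles = [1.0 / r for r in roots]               # poles of theta(1/z)/phi(1/z)
psi = impulse_response(theta, phi, 16)
```

With these coefficients, `stable` is true and the poles have a modulus of 0.9 with arguments close to ±π/4, in agreement with the caption of Figure 2.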
The Series Expansion of a Rational Transfer Function
The method of finding the coefficients of the series expansion can be illustrated by the second-order case:

$$ \frac{\theta_0 + \theta_1 z}{\phi_0 + \phi_1 z + \phi_2 z^2} = \psi_0 + \psi_1 z + \psi_2 z^2 + \cdots. \tag{9} $$

We rewrite this equation as

$$ \theta_0 + \theta_1 z = \bigl\{\phi_0 + \phi_1 z + \phi_2 z^2\bigr\}\bigl\{\psi_0 + \psi_1 z + \psi_2 z^2 + \cdots\bigr\}. \tag{10} $$

Figure 2. The pole–zero diagram corresponding to the transfer function of Figure
1. The poles are conjugate complex numbers with arguments of ±π/4 and with a
modulus of 0.9. The single real-valued zero has the value of −0.75.

The following table assists us in multiplying together the two polynomials:

$$
\begin{array}{c|cccc}
 & \psi_0 & \psi_1 z & \psi_2 z^2 & \cdots \\ \hline
\phi_0 & \phi_0\psi_0 & \phi_0\psi_1 z & \phi_0\psi_2 z^2 & \cdots \\
\phi_1 z & \phi_1\psi_0 z & \phi_1\psi_1 z^2 & \phi_1\psi_2 z^3 & \cdots \\
\phi_2 z^2 & \phi_2\psi_0 z^2 & \phi_2\psi_1 z^3 & \phi_2\psi_2 z^4 & \cdots
\end{array} \tag{11}
$$

By performing the multiplication on the RHS of equation (10), and by equating
the coefficients of the same powers of $z$ on the two sides, we find that

$$
\begin{aligned}
\theta_0 &= \phi_0\psi_0, & \psi_0 &= \theta_0/\phi_0, \\
\theta_1 &= \phi_0\psi_1 + \phi_1\psi_0, & \psi_1 &= (\theta_1 - \phi_1\psi_0)/\phi_0, \\
0 &= \phi_0\psi_2 + \phi_1\psi_1 + \phi_2\psi_0, & \psi_2 &= -(\phi_1\psi_1 + \phi_2\psi_0)/\phi_0, \\
&\;\;\vdots & &\;\;\vdots \\
0 &= \phi_0\psi_n + \phi_1\psi_{n-1} + \phi_2\psi_{n-2}, & \psi_n &= -(\phi_1\psi_{n-1} + \phi_2\psi_{n-2})/\phi_0.
\end{aligned} \tag{12}
$$

Bidirectional (Noncausal) Filters
A two-sided symmetric filter in the form of

$$ \psi(z) = \theta(z^{-1})\theta(z) = \psi_0 + \psi_1(z^{-1} + z) + \cdots + \psi_m(z^{-m} + z^m) \tag{13} $$

is often employed in smoothing the data or in eliminating its seasonal components. The advantage of such a filter is the absence of a phase effect. That is
to say, no delay is imposed on any of the components of the signal.

The so-called Cramér–Wold factorisation, which sets $\psi(z) = \theta(z^{-1})\theta(z)$,
and which must be available for any properly-designed filter, provides a straightforward way of explaining the absence of a phase effect. The factorisation gives
rise to two equations (i) $q(z) = \theta(z)y(z)$ and (ii) $x(z) = \theta(z^{-1})q(z)$. Thus, the
transformation of (1) can be broken down into two operations:

$$ \text{(i)}\quad q_t = \sum_j \theta_j y_{t-j} \quad\text{and}\quad \text{(ii)}\quad x_t = \sum_j \theta_j q_{t+j}. \tag{14} $$

The first operation, which runs in real time, imposes a time delay on every
component of $x(t)$. The second operation, which works in reversed time, imposes an equivalent reverse-time delay on each component. The reverse-time
delays, which are advances in other words, serve to eliminate the corresponding
real-time delays.

If $\psi(z)$ corresponds to an FIR filter, then the processed sequence $x(t)$ may
be generated via a single application of the two-sided filter $\psi(z)$ to the signal
$y(t)$, or it may be generated in two operations via the successive applications
of $\theta(z)$ to $y(z)$ and $\theta(z^{-1})$ to $q(z) = \theta(z)y(z)$. The question of which of these
techniques has been used to generate $x(t)$ in a particular instance should be a
matter of indifference.
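The indifference between the two techniques can be verified on a small example. The following Python sketch applies the two passes of (14) and then the single two-sided filter of (13) to the same data. The coefficients in `theta`, the data in `y` and the helper `value` are hypothetical, and the data are taken to be zero outside the recorded range so that no end effects arise.

```python
theta = [0.5, 0.3, 0.2]                      # hypothetical theta_0, theta_1, theta_2
y = [0.0, 1.0, 4.0, 2.0, -1.0, 3.0, 0.0]     # data, zero outside this range


def value(seq, t):
    """Element t of a sequence stored from index 0, and zero elsewhere."""
    return seq[t] if 0 <= t < len(seq) else 0.0


m = len(theta) - 1

# psi_k = sum_j theta_j theta_{j+k}: the coefficients of theta(1/z) theta(z), k = 0..m
psi = [sum(theta[j] * theta[j + k] for j in range(m + 1 - k)) for k in range(m + 1)]

# Pass (i), forwards in time: q_t = sum_j theta_j y_{t-j}
q = [sum(theta[j] * value(y, t - j) for j in range(m + 1)) for t in range(len(y) + m)]

# Pass (ii), backwards in time: x_t = sum_j theta_j q_{t+j}
ts = range(-m, len(y) + m)
x_two_pass = [sum(theta[j] * value(q, t + j) for j in range(m + 1)) for t in ts]

# Single application of the symmetric filter: x_t = sum_k psi_|k| y_{t-k}
x_one_pass = [
    sum(psi[abs(k)] * value(y, t - k) for k in range(-m, m + 1)) for t in ts
]
```

The lists `x_two_pass` and `x_one_pass` coincide up to rounding error, and the weights of the single filter are symmetric about the current time, which is why no net delay results.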
The final species of linear filter that may be used in the processing of
economic time series is a symmetric two-sided rational filter of the form

$$ \psi(z) = \frac{\theta(z^{-1})\theta(z)}{\phi(z^{-1})\phi(z)}. \tag{15} $$

Such a filter must, of necessity, be applied in two separate passes running
forwards and backwards in time and described, respectively, by the equations

$$ \text{(i)}\quad \phi(z)q(z) = \theta(z)y(z) \quad\text{and}\quad \text{(ii)}\quad \phi(z^{-1})x(z) = \theta(z^{-1})q(z). \tag{16} $$

Such filters represent a most effective way of processing economic data in pursuance of a wide range of objectives.
The Response to a Sinusoidal Input
One must also consider the response of the transfer function to a simple sinusoidal signal. Any ﬁnite data sequence can be expressed as a sum of discretely
sampled sine and cosine functions with frequencies that are integer multiples
of a fundamental frequency that produces one cycle in the period spanned by
the sequence. The finite sequence may be regarded as a single cycle within an
infinite sequence, which is the periodic extension of the data.
Figure 3. The Argand diagram showing a complex
number $\lambda = \alpha + i\beta$ and its conjugate $\lambda^* = \alpha - i\beta$.

Consider, therefore, the consequences of mapping the perpetual signal
sequence $\{x_t = \cos(\omega t)\}$ through the transfer function with the coefficients
$\{\psi_0, \psi_1, \ldots\}$. The output is

$$ y(t) = \sum_j \psi_j \cos(\omega[t - j]). \tag{17} $$

By virtue of the trigonometrical identity $\cos(A - B) = \cos A\cos B + \sin A\sin B$,
this becomes

$$
\begin{aligned}
y(t) &= \Bigl\{\sum_j \psi_j\cos(\omega j)\Bigr\}\cos(\omega t) + \Bigl\{\sum_j \psi_j\sin(\omega j)\Bigr\}\sin(\omega t) \\
&= \alpha\cos(\omega t) + \beta\sin(\omega t) = \rho\cos(\omega t - \theta).
\end{aligned} \tag{18}
$$

Observe that using the trigonometrical identity to expand the final expression
of (18) gives $\alpha = \rho\cos(\theta)$ and $\beta = \rho\sin(\theta)$. Therefore,

$$ \rho^2 = \alpha^2 + \beta^2 \quad\text{and}\quad \theta = \tan^{-1}\frac{\beta}{\alpha}. \tag{19} $$

Also, if $\lambda = \alpha + i\beta$ and $\lambda^* = \alpha - i\beta$ are conjugate complex numbers, then $\rho$
would be their modulus. This is illustrated in Figure 3.
It can be seen, from (18), that the transfer function has a twofold eﬀect
upon the signal. First, there is a gain eﬀect, whereby the amplitude of the
sinusoid is increased or diminished by the factor ρ. Then, there is a phase
eﬀect, whereby the peak of the sinusoid is displaced by a time delay of θ/ω
periods. The frequency of the output is the same as the frequency of the input,
which is a fundamental feature of all linear dynamic systems.
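The gain and the phase effect can be computed directly from (18) and (19). The Python sketch below does so for a hypothetical three-term causal filter and confirms that the filtered cosine is ρ cos(ωt − θ). The coefficient values and the frequency are assumptions, and `atan2` is used in place of the plain inverse tangent so that the angle falls in the correct quadrant.

```python
import math

psi = [0.5, 0.3, 0.2]      # hypothetical causal filter coefficients
omega = math.pi / 6        # input frequency in radians per sampling interval

alpha = sum(p * math.cos(omega * j) for j, p in enumerate(psi))
beta = sum(p * math.sin(omega * j) for j, p in enumerate(psi))
rho = math.hypot(alpha, beta)        # gain
theta = math.atan2(beta, alpha)      # phase displacement in radians
delay = theta / omega                # time delay in sampling periods


def output(t):
    """Filtered value y_t = sum_j psi_j cos(omega (t - j))."""
    return sum(p * math.cos(omega * (t - j)) for j, p in enumerate(psi))


# Discrepancy between the filtered sequence and rho * cos(omega t - theta)
errors = [abs(output(t) - rho * math.cos(omega * t - theta)) for t in range(-20, 21)]
```

The entries of `errors` are zero up to rounding error, and `delay` is positive for this filter, since each of its coefficients weights a present or a past value of the input.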
Figure 4. The values of the function cos{(11/8)πt} coincide with those
of its alias cos{(5/8)πt} at the integer points {t = 0, ±1, ±2, . . .}.

Observe that the response of the transfer function to a sinusoid of a particular frequency is akin to the response of a bell to a tuning fork. It gives very
limited information regarding the characteristics of the system. To obtain full
information, it is necessary to excite the system over a full range of frequencies.
Aliasing and the Shannon–Nyquist Sampling Theorem
In a discrete-time system, there is a problem of aliasing whereby signal
frequencies (i.e. angular velocities) in excess of π radians per sampling interval
are confounded with frequencies within the interval [0, π]. To understand this,
consider a cosine wave of unit amplitude and zero phase with a frequency $\omega$ in
the interval $\pi < \omega < 2\pi$ that is sampled at unit intervals. Let $\omega^* = 2\pi - \omega$.
Then, for integer values of $t$,

$$
\begin{aligned}
\cos(\omega t) &= \cos\{(2\pi - \omega^*)t\} \\
&= \cos(2\pi t)\cos(\omega^* t) + \sin(2\pi t)\sin(\omega^* t) \\
&= \cos(\omega^* t);
\end{aligned} \tag{20}
$$

which indicates that $\omega$ and $\omega^*$ are observationally indistinguishable. Here,
$\omega^* \in [0, \pi]$ is described as the alias of $\omega > \pi$.
The maximum frequency in discrete data is π radians per sampling interval
and, as the Shannon–Nyquist sampling theorem indicates, aliasing is avoided
only if there are at least two observations in the time that it takes the signal
element of highest frequency to complete a cycle. In that case, the discrete
representation will contain all of the available information on the system.
The consequences of sampling at an insuﬃcient rate are illustrated in Figure 4. Here, a rapidly alternating cosine function is mistaken for one of less
than half the true frequency.
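The coincidence that is depicted in Figure 4 is easily confirmed. The following Python sketch compares the two cosine functions at the integer sampling points and at a single intermediate point; the variable names are illustrative assumptions.

```python
import math

omega = 11 * math.pi / 8        # frequency above the Nyquist limit of pi
alias = 2 * math.pi - omega     # its alias, 5*pi/8, within [0, pi]

# Largest discrepancy between the two functions at the integer points
integer_gap = max(
    abs(math.cos(omega * t) - math.cos(alias * t)) for t in range(-16, 17)
)

# Discrepancy at a point midway between two sampling instants
half_step_gap = abs(math.cos(omega * 0.5) - math.cos(alias * 0.5))
```

The value of `integer_gap` is zero up to rounding error, whereas `half_step_gap` is large. The functions differ between the sampling points, but the sampled data cannot reveal this.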
The sampling theorem is attributable to several people, but it is most
commonly attributed to Shannon (1949, 1989), albeit that Nyquist (1928) discovered the essential results at an earlier date.
Figure 5. The spectral density function of the ARMA(2, 1) process
y(t) = 1.2728y(t − 1) − 0.81y(t − 2) + ε(t) + 0.75ε(t − 1) with V{ε(t)} = 1.

The Frequency Response of a Linear Filter
The frequency response of a linear filter $\psi(z)$ is its response to the set of
sinusoidal inputs of all frequencies $\omega$ that fall within the Nyquist interval [0, π].
This entails the squared gain of the filter, defined by

$$ \rho^2(\omega) = \psi_\alpha^2(\omega) + \psi_\beta^2(\omega), \tag{21} $$

where

$$ \psi_\alpha(\omega) = \sum_j \psi_j\cos(\omega j) \quad\text{and}\quad \psi_\beta(\omega) = \sum_j \psi_j\sin(\omega j), \tag{22} $$

and the phase displacement, defined by

$$ \theta(\omega) = \mathrm{Arg}\{\psi(\omega)\} = \tan^{-1}\{\psi_\beta(\omega)/\psi_\alpha(\omega)\}. \tag{23} $$

It is convenient to replace the trigonometrical functions of (22) by the
complex exponential functions

$$ e^{i\omega j} = \cos(\omega j) + i\sin(\omega j) \quad\text{and}\quad e^{-i\omega j} = \cos(\omega j) - i\sin(\omega j), \tag{24} $$

which enable the trigonometrical functions to be expressed as

$$ \cos(\omega j) = \frac{1}{2}\{e^{i\omega j} + e^{-i\omega j}\} \quad\text{and}\quad \sin(\omega j) = \frac{i}{2}\{e^{-i\omega j} - e^{i\omega j}\}. \tag{25} $$

Setting $z = \exp\{-i\omega\}$ in $\psi(z)$ gives

$$ \psi(e^{-i\omega}) = \psi_\alpha(\omega) - i\psi_\beta(\omega), \tag{26} $$

which we shall write hereafter as $\psi(\omega) = \psi(e^{-i\omega})$.

The squared gain of the filter, previously denoted by $\rho^2(\omega)$, is the square
of the complex modulus:

$$ |\psi(\omega)|^2 = \psi_\alpha^2(\omega) + \psi_\beta^2(\omega), \tag{27} $$

which is obtained by setting $z = \exp\{-i\omega\}$ in $\psi(z^{-1})\psi(z)$.
The Spectrum of a Stationary Stochastic Process
Consider a stationary stochastic process $y(t) = \{y_t;\ t = 0, \pm 1, \pm 2, \ldots\}$
defined on a doubly-infinite index set. The generic element of the process can
be expressed as $y_t = \sum_j \psi_j\varepsilon_{t-j}$, where $\varepsilon_t$ is an element of a sequence $\varepsilon(t)$ of
independently and identically distributed random variables with $E(\varepsilon_t) = 0$ and
$V(\varepsilon_t) = \sigma^2$ for all $t$.

The autocovariance generating function of the process is

$$ \sigma^2\psi(z^{-1})\psi(z) = \gamma(z) = \{\gamma_0 + \gamma_1(z^{-1} + z) + \gamma_2(z^{-2} + z^2) + \cdots\}. \tag{28} $$

The following table assists us in forming the product $\gamma(z) = \sigma^2\psi(z^{-1})\psi(z)$:

$$
\begin{array}{c|cccc}
 & \psi_0 & \psi_1 z & \psi_2 z^2 & \cdots \\ \hline
\psi_0 & \psi_0^2 & \psi_0\psi_1 z & \psi_0\psi_2 z^2 & \cdots \\
\psi_1 z^{-1} & \psi_1\psi_0 z^{-1} & \psi_1^2 & \psi_1\psi_2 z & \cdots \\
\psi_2 z^{-2} & \psi_2\psi_0 z^{-2} & \psi_2\psi_1 z^{-1} & \psi_2^2 & \cdots \\
\vdots & \vdots & \vdots & \vdots &
\end{array} \tag{29}
$$

The autocovariances are obtained by summing along the NW–SE diagonals:

$$
\begin{aligned}
\gamma_0 &= \sigma^2\{\psi_0^2 + \psi_1^2 + \psi_2^2 + \psi_3^2 + \cdots\}, \\
\gamma_1 &= \sigma^2\{\psi_0\psi_1 + \psi_1\psi_2 + \psi_2\psi_3 + \cdots\}, \\
\gamma_2 &= \sigma^2\{\psi_0\psi_2 + \psi_1\psi_3 + \psi_2\psi_4 + \cdots\}, \\
&\;\;\vdots
\end{aligned} \tag{30}
$$

By setting $z = \exp\{-i\omega\}$ and dividing by 2π, we get the spectral density
function, or spectrum, of the process:

$$ f(\omega) = \frac{1}{2\pi}\Bigl\{\gamma_0 + 2\sum_{\tau=1}^{\infty}\gamma_\tau\cos(\omega\tau)\Bigr\}. \tag{31} $$

This entails the cosine Fourier transform of the sequence of autocovariances.

The spectral density function of an ARMA(2, 1) process, which incorporates the transfer function of Figures 1–3, is shown in Figure 5.
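The two routes to the spectrum can be compared numerically for the ARMA(2, 1) process of Figure 5. The Python sketch below evaluates the spectrum directly from the rational transfer function at z = exp{−iω}, and then again via the autocovariances of (30) and the cosine transform of (31). The truncation of the impulse response at 400 terms and the function names are assumptions made for the sake of the illustration.

```python
import cmath
import math

phi = [1.0, -1.2728, 0.81]    # autoregressive polynomial phi(z)
theta = [1.0, 0.75]           # moving-average polynomial theta(z)
sigma2 = 1.0                  # variance of the white-noise input


def poly(coeffs, z):
    """Value of the polynomial with the given coefficients at the point z."""
    return sum(c * z ** k for k, c in enumerate(coeffs))


def spectrum_direct(omega):
    """sigma^2 |psi(e^{-i omega})|^2 / (2 pi), with psi(z) = theta(z)/phi(z)."""
    z = cmath.exp(-1j * omega)
    return sigma2 * abs(poly(theta, z) / poly(phi, z)) ** 2 / (2 * math.pi)


# Truncated impulse response from the recursion phi(z) psi(z) = theta(z)
n = 400
psi = []
for t in range(n):
    acc = theta[t] if t < len(theta) else 0.0
    acc -= sum(phi[j] * psi[t - j] for j in range(1, min(t, len(phi) - 1) + 1))
    psi.append(acc / phi[0])

# Autocovariances gamma_tau = sigma^2 sum_j psi_j psi_{j+tau}, as in (30)
gamma = [sigma2 * sum(psi[j] * psi[j + tau] for j in range(n - tau)) for tau in range(n)]


def spectrum_from_autocovariances(omega):
    """Equation (31), with the infinite sum truncated after n - 1 terms."""
    tail = sum(gamma[tau] * math.cos(omega * tau) for tau in range(1, n))
    return (gamma[0] + 2.0 * tail) / (2 * math.pi)


peak = spectrum_direct(math.pi / 4)   # ordinate near the argument of the poles
```

The two functions agree closely at every frequency, since the neglected coefficients of the impulse response decline at the rate of 0.9 per period.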
Wiener–Kolmogorov Filtering of Stationary Sequences
The classical theory of linear ﬁltering was formulated independently by
Norbert Wiener (1941) and Andrei Nikolaevich Kolmogorov (1941) during the
Second World War. They were both considering the problem of how to target
radar-assisted anti-aircraft guns on incoming enemy aircraft.
The purpose of a Wiener–Kolmogorov (W–K) filter is to extract an estimate of a signal sequence $\xi(t)$ from an observable data sequence

$$ y(t) = \xi(t) + \eta(t), \tag{32} $$

which is afflicted by the noise $\eta(t)$. According to the classical assumptions,
which we shall later amend in order to accommodate short nonstationary sequences, the signal and the noise are generated by zero-mean stationary stochastic processes that are mutually independent. Also, the assumption is made that
the data constitute a doubly-infinite sequence. It follows that the autocovariance generating function of the data is the sum of the autocovariance generating
functions of its two components. Thus,

$$ \gamma^{yy}(z) = \gamma^{\xi\xi}(z) + \gamma^{\eta\eta}(z) \quad\text{and}\quad \gamma^{\xi\xi}(z) = \gamma^{y\xi}(z). \tag{33} $$

These functions are amenable to the so-called Cramér–Wold factorisation, and
they may be written as

$$ \gamma^{yy}(z) = \phi(z^{-1})\phi(z), \quad \gamma^{\xi\xi}(z) = \theta(z^{-1})\theta(z), \quad \gamma^{\eta\eta}(z) = \theta_\eta(z^{-1})\theta_\eta(z). \tag{34} $$

The estimate $x_t$ of the signal element $\xi_t$, generated by a linear time-invariant filter, is a linear combination of the elements of the data sequence:

$$ x_t = \sum_j \psi_j y_{t-j}. \tag{35} $$

The principle of minimum mean-squared error estimation indicates that the
estimation errors must be statistically uncorrelated with the elements of the
information set. Thus, the following condition applies for all $k$:

$$
\begin{aligned}
0 &= E\{y_{t-k}(\xi_t - x_t)\} \\
&= E(y_{t-k}\xi_t) - \sum_j \psi_j E(y_{t-k}y_{t-j}) \\
&= \gamma_k^{y\xi} - \sum_j \psi_j\gamma_{k-j}^{yy}.
\end{aligned} \tag{36}
$$

The equation may be expressed, in terms of the z-transforms, as

$$ \gamma^{y\xi}(z) = \psi(z)\gamma^{yy}(z). \tag{37} $$

It follows that

$$ \psi(z) = \frac{\gamma^{y\xi}(z)}{\gamma^{yy}(z)} = \frac{\gamma^{\xi\xi}(z)}{\gamma^{\xi\xi}(z) + \gamma^{\eta\eta}(z)} = \frac{\theta(z^{-1})\theta(z)}{\phi(z^{-1})\phi(z)}. \tag{38} $$
Now, by setting $z = \exp\{-i\omega\}$, one can derive the frequency-response
function of the filter that is used in estimating the signal $\xi(t)$. The effect of
the filter is to multiply each of the frequency elements of $y(t)$ by the fraction
of its variance that is attributable to the signal. The same principle applies to
the estimation of the residual or noise component. This is obtained using the
complementary filter

$$ \psi^c(z) = 1 - \psi(z) = \frac{\gamma^{\eta\eta}(z)}{\gamma^{\xi\xi}(z) + \gamma^{\eta\eta}(z)}. \tag{39} $$

The estimated signal component may be obtained by filtering the data in
two passes according to the following equations:

$$ \phi(z)q(z) = \theta(z)y(z), \qquad \phi(z^{-1})x(z^{-1}) = \theta(z^{-1})q(z^{-1}). \tag{40} $$

The first equation relates to a process that runs forwards in time to generate
the elements of an intermediate sequence, represented by the coefficients of
$q(z)$. The second equation represents a process that runs backwards to deliver
the estimates of the signal, represented by the coefficients of $x(z)$.
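The frequency response of the W–K filter is a ratio of spectral ordinates, which can be illustrated with a simple case. The Python sketch below assumes, purely for illustration, a first-order autoregressive signal with a coefficient of 0.8 that is contaminated by white noise. Neither this model nor the parameter values come from the text. The common factor of 1/2π is omitted, since it cancels from the ratio.

```python
import cmath
import math

a, var_nu, var_eta = 0.8, 1.0, 2.0    # hypothetical signal and noise parameters


def acgf_signal(omega):
    """gamma^{xi xi}(z) at z = exp(-i omega) for the signal xi_t = a xi_{t-1} + nu_t."""
    z = cmath.exp(-1j * omega)
    return var_nu / abs(1.0 - a * z) ** 2


def acgf_noise(omega):
    """gamma^{eta eta}(z) for white noise, which is constant over all frequencies."""
    return var_eta


def wk_gain(omega):
    """Frequency response of the signal-extraction filter of (38)."""
    s, n = acgf_signal(omega), acgf_noise(omega)
    return s / (s + n)


def wk_complement(omega):
    """Frequency response of the complementary filter of (39)."""
    s, n = acgf_signal(omega), acgf_noise(omega)
    return n / (s + n)


grid = [k * math.pi / 8 for k in range(9)]
gains = [wk_gain(w) for w in grid]
```

Under these assumptions, the gain declines steadily from the zero frequency to the frequency of π, since the power of the signal is concentrated at the low frequencies, and the two responses sum to unity at every frequency.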
The Hodrick–Prescott (Leser) Filter and the Butterworth Filter
The Wiener–Kolmogorov methodology can be applied to nonstationary
data with minor adaptations. A model of the processes underlying the data
can be adopted that has the form of

$$
\begin{aligned}
\nabla^d(z)y(z) &= \nabla^d(z)\{\xi(z) + \eta(z)\} = \delta(z) + \kappa(z) \\
&= (1 + z)^n\zeta(z) + (1 - z)^m\varepsilon(z),
\end{aligned} \tag{41}
$$

where $\zeta(z)$ and $\varepsilon(z)$ are the z-transforms of two independent white-noise sequences $\zeta(t)$ and $\varepsilon(t)$ and where $\nabla(z) = 1 - z$ is the z-transform of the difference
operator.

The model of $y(t) = \xi(t) + \eta(t)$ entails a pair of statistically independent
stochastic processes, which are defined over the doubly-infinite sequence of
integers and of which the z-transforms are

$$ \xi(z) = \frac{(1 + z)^n}{\nabla^d(z)}\zeta(z) \quad\text{and}\quad \eta(z) = \frac{(1 - z)^m}{\nabla^d(z)}\varepsilon(z). \tag{42} $$

The condition $m \geq d$ is necessary to ensure the stationarity of $\eta(t)$, which is
obtained from $\varepsilon(t)$ by differencing $m - d$ times.

It must be conceded that a nonstationary process such as $\xi(t)$ is a mathematical construct of doubtful reality, since its values will be unbounded, almost
certainly. Nevertheless, to deal in these terms is to avoid the complexities of
the finite-sample approach, which will be the subject of the next section.
The filter that is applied to $y(t)$ to estimate $\xi(t)$, which is the d-fold integral
of $\delta(t)$, takes the form of

$$ \psi(z) = \frac{\sigma_\zeta^2(1 + z^{-1})^n(1 + z)^n}{\sigma_\zeta^2(1 + z^{-1})^n(1 + z)^n + \sigma_\varepsilon^2(1 - z^{-1})^m(1 - z)^m}, \tag{43} $$

regardless of the degree $d$ of differencing that would be necessary to reduce $y(t)$
to stationarity.

Two special cases are of interest. By setting $d = m = 2$ and $n = 0$ in (41),
a model is obtained of a second-order random walk $\xi(t)$ affected by white-noise
errors of observation $\eta(t) = \varepsilon(t)$. The resulting lowpass W–K filter, in the form
of

$$ \psi(z) = \frac{1}{1 + \lambda(1 - z^{-1})^2(1 - z)^2} \quad\text{with}\quad \lambda = \frac{\sigma_\eta^2}{\sigma_\delta^2}, \tag{44} $$

is the Hodrick–Prescott (H–P) filter. The complementary highpass filter, which
generates the residue, is

$$ \psi^c(z) = \frac{(1 - z^{-1})^2(1 - z)^2}{\lambda^{-1} + (1 - z^{-1})^2(1 - z)^2}. \tag{45} $$

Here, $\lambda$, which is described as the smoothing parameter, is the single adjustable
parameter of the filter.
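The frequency responses of the H–P filter and of its complement follow from (44) and (45) on setting z = exp{−iω}, whereupon (1 − z⁻¹)(1 − z) becomes 2 − 2cos(ω). The Python sketch below evaluates them and locates the frequency at which the gain of the lowpass filter falls to one half. The function names are illustrative assumptions, and the three values of the smoothing parameter are taken as examples.

```python
import math


def hp_gain(omega, lam):
    """Frequency response of the H-P lowpass filter (44) at z = exp(-i omega)."""
    d = (2.0 - 2.0 * math.cos(omega)) ** 2
    return 1.0 / (1.0 + lam * d)


def hp_complement(omega, lam):
    """Frequency response of the complementary highpass filter (45)."""
    d = (2.0 - 2.0 * math.cos(omega)) ** 2
    return d / (1.0 / lam + d)


def half_gain_frequency(lam):
    """Frequency at which hp_gain equals 1/2, from lam * (2 - 2 cos w)^2 = 1."""
    return math.acos(1.0 - 0.5 / math.sqrt(lam))


for lam in (100, 1600, 14400):
    w = half_gain_frequency(lam)
    print(lam, round(w, 4), round(hp_gain(w, lam), 4))
```

For λ = 1,600, the half-gain point lies at roughly 0.158 radians per sampling interval. A larger value of λ places it at a lower frequency and thereby narrows the pass band.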
By setting $m = n$, a filter for estimating $\xi(t)$ is obtained that takes the
form of

$$
\begin{aligned}
\psi(z) &= \frac{\sigma_\zeta^2(1 + z^{-1})^n(1 + z)^n}{\sigma_\zeta^2(1 + z^{-1})^n(1 + z)^n + \sigma_\varepsilon^2(1 - z^{-1})^n(1 - z)^n} \\
&= \frac{1}{1 + \lambda\left(i\,\dfrac{1 - z}{1 + z}\right)^{2n}} \quad\text{with}\quad \lambda = \frac{\sigma_\varepsilon^2}{\sigma_\zeta^2}.
\end{aligned} \tag{46}
$$

This is the formula for the Butterworth lowpass digital filter. The filter has
two adjustable parameters, and, therefore, it is a more flexible device than the
H–P filter. First, there is the parameter $\lambda$. This can be expressed as
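$$ \lambda = \{1/\tan(\omega_d/2)\}^{2n}, \tag{47} $$

where $\omega_d$ is the nominal cutoff point of the filter, which is the mid point in
the transition of the filter's frequency response from its pass band to its stop
band. At $z = \exp\{-i\omega\}$, the frequency response of (46) is $1/\{1 + \lambda\tan^{2n}(\omega/2)\}$, which takes the value of one half at $\omega = \omega_d$ under this setting of $\lambda$. The second of the adjustable parameters is $n$, which denotes the order
of the filter. As $n$ increases, the transition between the pass band and the stop
band becomes more abrupt.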
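The effect of the order n on the sharpness of the transition can be seen by evaluating the frequency response of (46) on either side of the cutoff. The Python sketch below assumes that λ is chosen so that the gain is exactly one half at the nominal cutoff, that is λ = {1/tan(ω_d/2)}^{2n}. This follows from setting z = exp{−iω} in (46), whereupon the response becomes 1/{1 + λ tan^{2n}(ω/2)}. The cutoff of π/4 and the function names are hypothetical.

```python
import math


def butterworth_lambda(cutoff, n):
    """Value of lambda that places the half-gain point of the filter at `cutoff`."""
    return (1.0 / math.tan(cutoff / 2.0)) ** (2 * n)


def butterworth_gain(omega, cutoff, n):
    """Frequency response 1 / (1 + lambda tan^{2n}(omega/2)) for 0 <= omega < pi."""
    lam = butterworth_lambda(cutoff, n)
    return 1.0 / (1.0 + lam * math.tan(omega / 2.0) ** (2 * n))


cutoff = math.pi / 4      # hypothetical nominal cutoff frequency
for n in (2, 6, 12):
    below = butterworth_gain(0.8 * cutoff, cutoff, n)   # inside the pass band
    above = butterworth_gain(1.2 * cutoff, cutoff, n)   # inside the stop band
    print(n, round(below, 4), round(above, 4))
```

As n rises, the gain just below the cutoff approaches unity and the gain just above it approaches zero, while the gain at the cutoff itself remains at one half.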
Figure 6. The gain of the Hodrick–Prescott lowpass filter with a smoothing parameter set to 100, 1,600 and 14,400.

These filters can be applied to the nonstationary data sequence y(t) in the
bidirectional manner indicated by equation (40), provided that the appropriate
initial conditions are supplied with which to start the recursions. However,
by concentrating on the estimation of the residual sequence η (t), which corresponds to a stationary process, it is possible to avoid the need for nonzero
initial conditions. Then, the estimate of η (t) can be subtracted from y (t) to
obtain the estimate of ξ (t).
The H–P ﬁlter has been used as a lowpass smoothing ﬁlter in numerous
macroeconomic investigations, where it has been customary to set the smoothing parameter to certain conventional values. Thus, for example, the econometric computer package Eviews 4.0 (2000) imposes the following default values:

$$
\lambda = \begin{cases}
100 & \text{for annual data,} \\
1{,}600 & \text{for quarterly data,} \\
14{,}400 & \text{for monthly data.}
\end{cases}
$$
Figure 6 shows the square gain of the ﬁlter corresponding to these values. The
innermost curve corresponds to λ = 14,400 and the outermost curve to λ = 100.
Whereas they have become conventional, these values are arbitrary. The
ﬁlter should be adapted to the purpose of isolating the component of interest;
and the appropriate ﬁlter parameters need to be determined in the light of the
spectral structure of the component, such as has been revealed in Figure 10, in
the case of the U.K. consumption data.
It will be observed that an H–P filter with λ = 1,600, which defines the
middle curve in Figure 6, will not be effective in isolating the low-frequency
component of the quarterly consumption data of Figure 9, which lies in the
interval [0, π/8]. The curve will cut through the low-frequency spectral structure that is represented in Figure 10; and the effect will be greatly to attenuate
some of the elements of the component that should be preserved intact.
Figure 7. The squared gain of the lowpass Butterworth filters of orders
n = 6 and n = 12 with a nominal cutoff point of 2π/3 radians.

Lowering the value of λ in order to admit a wider range of frequencies
will have the eﬀect of creating a frequency response with a gradual transition
from the pass band to the stop band. This will be equally inappropriate to the
purpose of isolating a component within a well-defined frequency band. For
that purpose, a different filter is required.
A filter that may be appropriate to the purpose of isolating the low-frequency fluctuations in consumption is the Butterworth filter. The squared
gain of the latter is illustrated in Figure 7. In this case, there is a well-defined
nominal cutoff frequency, which is at the mid point of the transition from
the pass band to the stop band. The transition becomes more rapid as the
filter order n increases. If a perfectly sharp transition is required, then the
frequency-domain filter that will be presented later should be employed.
The Hodrick–Prescott ﬁlter has many antecedents. Its invention cannot
reasonably be attributed to Hodrick and Prescott (1980, 1997), who cited Whittaker (1923) as one of their sources. Leser (1961) also provided a complete
derivation of the ﬁlter at an earlier date. The analogue Butterworth ﬁlter is a
commonplace of electrical engineering. The digital version has been described
by Pollock (2000).
Wiener–Kolmogorov Filters for Finite Sequences
The classical Wiener–Kolmogorov theory can be adapted to ﬁnite data
sequences generated by stationary stochastic processes.
Consider a data vector y = [y0 , y1 , . . . , yT −1 , ] that has a signal component
ξ and a noise component η :
(48) y = ξ + η. The two components are assumed to be independently normally distributed
16 D.S.G. POLLOCK: Filtering Macroeconomic Data
with zero means and with positivedeﬁnite dispersion matrices. Then,
E (ξ ) = 0,
E (η ) = 0, (49) D(ξ ) = Ωξ ,
D(η ) = Ωη , and C (ξ, η ) = 0.
The dispersion matrices $\Omega_\xi$ and $\Omega_\eta$ may be obtained from the autocovariance generating functions $\gamma_\xi(z)$ and $\gamma_\eta(z)$, respectively, by replacing $z$ by
the matrix argument $L_T = [e_1, e_2, \ldots, e_{T-1}, 0]$, which is the finite-sample version of the lag operator. This is obtained from the identity matrix $I_T = [e_0, e_1, e_2, \ldots, e_{T-1}]$ by deleting the leading column and by appending a zero
vector to the end of the array. Negative powers of $z$ are replaced by powers of
the forwards shift operator $F_T = L_T'$. A consequence of the independence of $\xi$
and $\eta$ is that $D(y) = \Omega_\xi + \Omega_\eta$.
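The replacement of $z$ by the matrix lag operator can be sketched numerically. The following is a minimal illustration, assuming NumPy; the function names `lag_matrix` and `dispersion_matrix` and the MA(1) parameter values are hypothetical choices made for the example and are not taken from the text.

```python
import numpy as np

def lag_matrix(T):
    """Finite-sample lag operator L_T = [e_1, ..., e_{T-1}, 0] (units on the first subdiagonal)."""
    return np.eye(T, k=-1)

def dispersion_matrix(gammas, T):
    """Replace z^k by L_T^k and z^{-k} by (L_T')^k in gamma(z).

    gammas[k] is the autocovariance at lag k >= 0; higher lags are taken to be zero.
    """
    L = lag_matrix(T)
    omega = gammas[0] * np.eye(T)
    for k in range(1, len(gammas)):
        Lk = np.linalg.matrix_power(L, k)
        omega += gammas[k] * (Lk + Lk.T)   # L_T' plays the role of the forwards shift
    return omega

# Illustrative MA(1) process with parameter theta and unit innovation variance (assumed values).
theta, sigma2 = 0.5, 1.0
gammas = [sigma2 * (1 + theta**2), sigma2 * theta]
omega = dispersion_matrix(gammas, T=5)
```

The result is a symmetric banded Toeplitz matrix with the variance on the diagonal and the first autocovariance on the adjacent diagonals.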
We may begin by considering the determination of the vector of the $T$
filter coefficients $\psi_{t.} = [\psi_{t,0}, \psi_{t,1}, \ldots, \psi_{t,T-1}]'$ that determine $x_t$, which is the $t$th
element of the filtered vector $x = [x_0, x_1, \ldots, x_{T-1}]'$ and which is the estimate
of $\xi_t$. This is derived from the data in $y = [y_0, y_1, \ldots, y_{T-1}]'$ via the equation

(50)
$$x_t = \sum_{j=-t}^{T-1-t} \psi_{t,t+j}\, y_{t-j}.$$

The principle of minimum mean-squared error estimation continues to indicate that the estimation errors must be statistically uncorrelated with the
elements of the information set. Thus

(51)
$$\begin{aligned}
0 &= E\{y_{t-k}(\xi_t - x_t)\} \\
  &= E(y_{t-k}\xi_t) - \sum_{j=-t}^{T-1-t} \psi_{t,t+j}\, E(y_{t-k}y_{t-j}) \\
  &= \gamma^{\xi\xi}_{-k} - \sum_{j=-t}^{T-1-t} \psi_{t,t+j}\, \gamma^{yy}_{j-k}.
\end{aligned}$$

Here, $E(y_{t-k}\xi_t) = \gamma^{y\xi}_{-k} = \gamma^{\xi\xi}_{-k}$ in accordance with (33). Equation (51) can be
rendered also in a matrix format. By running from $k = -t$ to $k = T-1-t$,
and observing that $\gamma^{\xi\xi}_{-k} = \gamma^{\xi\xi}_{k}$, we get the following system:

(52)
$$\begin{bmatrix} \gamma^{\xi\xi}_{t} \\ \gamma^{\xi\xi}_{t-1} \\ \vdots \\ \gamma^{\xi\xi}_{T-1-t} \end{bmatrix}
=
\begin{bmatrix}
\gamma^{yy}_{0} & \gamma^{yy}_{1} & \cdots & \gamma^{yy}_{T-1} \\
\gamma^{yy}_{1} & \gamma^{yy}_{0} & \cdots & \gamma^{yy}_{T-2} \\
\vdots & \vdots & \ddots & \vdots \\
\gamma^{yy}_{T-1} & \gamma^{yy}_{T-2} & \cdots & \gamma^{yy}_{0}
\end{bmatrix}
\begin{bmatrix} \psi_{t,0} \\ \psi_{t,1} \\ \vdots \\ \psi_{t,T-1} \end{bmatrix}.$$
The equation above can be written in summary notation as $\Omega_\xi e_t = \Omega_y \psi_{t.}$,
where $e_t$ is a vector of order $T$ containing a single unit preceded by $t$ zeros and
followed by $T-1-t$ zeros. The coefficient vector $\psi_{t.}$ is given by

(53)  $\psi_{t.}' = e_t'\Omega_\xi\Omega_y^{-1} = e_t'\Omega_\xi(\Omega_\xi + \Omega_\eta)^{-1},$

and the estimate of $\xi_t$ is $x_t = \psi_{t.}'y$. The estimate of the complete vector
$\xi = [\xi_0, \xi_1, \ldots, \xi_{T-1}]'$ of the signal elements is

(54)  $x = \Omega_\xi\Omega_y^{-1}y = \Omega_\xi(\Omega_\xi + \Omega_\eta)^{-1}y.$

The Estimates as Conditional Expectations
The linear estimates of (54) have the status of conditional expectations,
when the vectors $\xi$ and $y$ are normally distributed. As such, they are, unequivocally, the optimal minimum mean-squared error predictors of the signal and
the noise components:

(55)  $E(\xi|y) = E(\xi) + C(\xi, y)D^{-1}(y)\{y - E(y)\} = \Omega_\xi(\Omega_\xi + \Omega_\eta)^{-1}y = x,$

(56)  $E(\eta|y) = E(\eta) + C(\eta, y)D^{-1}(y)\{y - E(y)\} = \Omega_\eta(\Omega_\xi + \Omega_\eta)^{-1}y = h.$

The corresponding error dispersion matrices, from which confidence intervals for the estimated components may be derived, are

(57)  $D(\xi|y) = D(\xi) - C(\xi, y)D^{-1}(y)C(y, \xi) = \Omega_\xi - \Omega_\xi(\Omega_\xi + \Omega_\eta)^{-1}\Omega_\xi,$

(58)  $D(\eta|y) = D(\eta) - C(\eta, y)D^{-1}(y)C(y, \eta) = \Omega_\eta - \Omega_\eta(\Omega_\xi + \Omega_\eta)^{-1}\Omega_\eta.$

The Least-Squares Derivation of the Estimates
The estimates of $\xi$ and $\eta$, which have been denoted by $x$ and $h$ respectively,
can also be derived according to the following criterion:

(59)  Minimise $S(\xi, \eta) = \xi'\Omega_\xi^{-1}\xi + \eta'\Omega_\eta^{-1}\eta$ subject to $\xi + \eta = y.$

Since $S(\xi, \eta)$ is the exponent of the normal joint density function $N(\xi, \eta)$, the
resulting estimates may be described, alternatively, as the minimum chi-square
estimates or as the maximum-likelihood estimates.
Substituting for $\eta = y - \xi$ gives the concentrated criterion function $S(\xi) = \xi'\Omega_\xi^{-1}\xi + (y - \xi)'\Omega_\eta^{-1}(y - \xi)$. Differentiating this function in respect of $\xi$ and
setting the result to zero gives the following condition of minimisation: $0 = \Omega_\xi^{-1}x - \Omega_\eta^{-1}(y - x)$. From this, it follows that $y = (\Omega_\xi + \Omega_\eta)\Omega_\xi^{-1}x$. Therefore,
the solution for $x$ is

(60)  $x = \Omega_\xi(\Omega_\xi + \Omega_\eta)^{-1}y.$

Moreover, since the roles of $\xi$ and $\eta$ are interchangeable in this exercise, and
since $h + x = y$, there are also

(61)  $h = \Omega_\eta(\Omega_\xi + \Omega_\eta)^{-1}y$ and $x = y - \Omega_\eta(\Omega_\xi + \Omega_\eta)^{-1}y.$

The filter matrices $\Psi_\xi = \Omega_\xi(\Omega_\xi + \Omega_\eta)^{-1}$ and $\Psi_\eta = \Omega_\eta(\Omega_\xi + \Omega_\eta)^{-1}$ of (60) and
(61) are the matrix analogues of the $z$ transforms displayed in equations (38)
and (39).

Figure 8. The squared gain of the difference operator, which has a zero at zero frequency, and the squared gain of the summation operator, which is unbounded at zero frequency.
A simple procedure for calculating the estimates $x$ and $h$ begins by solving
the equation

(62)  $(\Omega_\xi + \Omega_\eta)b = y$

for the value of $b$. Thereafter, one can generate

(63)  $x = \Omega_\xi b$ and $h = \Omega_\eta b.$

If $\Omega_\xi$ and $\Omega_\eta$ correspond to the narrow-band dispersion matrices of moving-average processes, then the solution to equation (62) may be found via a
Cholesky factorisation that sets $\Omega_\xi + \Omega_\eta = GG'$, where $G$ is a lower-triangular
matrix with a limited number of nonzero bands. The system $GG'b = y$ may be
cast in the form of $Gp = y$ and solved for $p$. Then, $G'b = p$ can be solved for $b$.
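The two-stage solution of (62) and (63) can be illustrated as follows. This is a sketch under stated assumptions: it uses NumPy, the helper names `ma_dispersion` and `wk_estimates` are hypothetical, the autocovariances are arbitrary illustrative values, and a general dense solver stands in for the banded triangular solver that an efficient implementation would employ.

```python
import numpy as np

def ma_dispersion(gammas, T):
    """Symmetric banded Toeplitz matrix with gammas[k] on the k-th off-diagonals."""
    omega = np.zeros((T, T))
    for k, g in enumerate(gammas):
        omega += g * np.eye(T, k=k)
        if k > 0:
            omega += g * np.eye(T, k=-k)
    return omega

def wk_estimates(y, omega_xi, omega_eta):
    """Return (x, h) of equation (63) after solving (62) via a Cholesky factor G."""
    G = np.linalg.cholesky(omega_xi + omega_eta)   # lower triangular, G G' = sum
    p = np.linalg.solve(G, y)                      # solve G p = y
    b = np.linalg.solve(G.T, p)                    # solve G' b = p
    return omega_xi @ b, omega_eta @ b

T = 8
omega_xi = ma_dispersion([1.25, 0.5], T)    # MA(1) signal (illustrative values)
omega_eta = ma_dispersion([0.5], T)         # white-noise component (illustrative)
rng = np.random.default_rng(0)
y = rng.standard_normal(T)
x, h = wk_estimates(y, omega_xi, omega_eta)
```

Since $\Omega_\xi b + \Omega_\eta b = (\Omega_\xi + \Omega_\eta)b = y$, the two estimates sum to the data, which provides a simple check on the computation.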
The Diﬀerence and Summation Operators
A simple expedient for eliminating the trend from the data sequence $y(t) = \{y_t; t = 0, \pm 1, \pm 2, \ldots\}$ is to replace the sequence by its differenced version
$y(t) - y(t-1)$ or by its twice-differenced version $y(t) - 2y(t-1) + y(t-2)$. Differences of higher orders are rare. The $z$ transform of the difference is
$(1 - z)y(z) = y(z) - zy(z)$. On defining the operator $\nabla(z) = 1 - z$, the second
differences can be expressed as $\nabla^2(z)y(z) = (1 - 2z + z^2)y(z)$.
The inverse of the difference operator is the summation operator

(64)  $\Sigma(z) = (1 - z)^{-1} = \{1 + z + z^2 + \cdots\}.$

The $z$ transform of the $d$-fold summation operator is as follows:

(65)  $\Sigma^d(z) = \dfrac{1}{(1 - z)^d} = 1 + dz + \dfrac{d(d+1)}{2!}z^2 + \dfrac{d(d+1)(d+2)}{3!}z^3 + \cdots.$

The difference operator has a powerful effect upon the data. It nullifies
the trend and it severely attenuates the elements of the data that are adjacent
in frequency to the zero frequency of the trend. It also amplifies the high-frequency
elements of the data. The effect is apparent in Figure 8, which
shows the squared gain of the difference operator. The figure also shows the
squared gain of the summation operator, which gives unbounded power to the
elements that have frequencies in the vicinity of zero.
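The squared gains described here can be evaluated directly by setting $z = e^{-i\omega}$. The following sketch, which assumes NumPy and uses variable names chosen only for illustration, confirms that the squared gain of $1 - z$ is $2 - 2\cos(\omega)$, which is zero at $\omega = 0$ and rises to 4 at $\omega = \pi$, and that the squared gain of the summation operator is its reciprocal.

```python
import numpy as np

omega = np.linspace(0.0, np.pi, 9)       # frequencies 0, pi/8, ..., pi
z = np.exp(-1j * omega)                  # z evaluated on the unit circle
gain_diff = np.abs(1 - z) ** 2           # squared gain of the difference operator
closed_form = 2 - 2 * np.cos(omega)      # the same quantity in closed form

with np.errstate(divide="ignore"):
    gain_sum = 1.0 / closed_form         # summation operator: infinite at omega = 0
```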
In dealing with a finite sequence, it is appropriate to consider a matrix
version of the difference operator. In the case of a sample of $T$ elements comprised by the vector $y = [y_0, y_1, \ldots, y_{T-1}]'$, it is appropriate to use the matrix
difference operator $\nabla(L_T) = I_T - L_T$, which is obtained by replacing $z$ within
$\nabla(z) = 1 - z$ by the matrix argument $L_T = [e_1, e_2, \ldots, e_{T-1}, 0]$, which is
obtained from the identity matrix $I_T = [e_0, e_1, e_2, \ldots, e_{T-1}]$ by deleting the
leading column and by appending a zero vector to the end of the array.
Examples of the first-order and second-order matrix difference operators
are as follows:

(66)
$$\nabla_4 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 0 & -1 & 1 \end{bmatrix}, \qquad
\nabla^2_4 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ -2 & 1 & 0 & 0 \\ 1 & -2 & 1 & 0 \\ 0 & 1 & -2 & 1 \end{bmatrix}.$$

The corresponding inverse matrices are

(67)
$$\Sigma_4 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 1 \end{bmatrix}, \qquad
\Sigma^2_4 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 2 & 1 & 0 & 0 \\ 3 & 2 & 1 & 0 \\ 4 & 3 & 2 & 1 \end{bmatrix}.$$

It will be seen that the elements of the leading vectors of these matrices are
the coefficients associated with the expansion of $\Sigma^d(z)$ of (65) for the cases of
$d = 1$ and $d = 2$. The same will be true for higher orders of $d$.
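The matrices of (66) and (67) can be generated mechanically from the lag operator. The following is a minimal sketch, assuming NumPy; the function name `difference_matrix` is a hypothetical label for the purpose of the example.

```python
import numpy as np

def difference_matrix(T, p=1):
    """p-th power of the matrix difference operator I_T - L_T."""
    nabla = np.eye(T) - np.eye(T, k=-1)
    return np.linalg.matrix_power(nabla, p)

nabla1 = difference_matrix(4, 1)
nabla2 = difference_matrix(4, 2)
sigma1 = np.linalg.inv(nabla1)     # summation matrix Sigma_4
sigma2 = np.linalg.inv(nabla2)     # twofold summation matrix Sigma_4^2
```

The leading column of `sigma2` contains 1, 2, 3, 4, which are the leading coefficients of the expansion of $(1 - z)^{-2}$.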
Polynomial Interpolation
The first $p$ columns of the matrix $\Sigma^p_T$ provide a basis of the set of polynomials
of degree $p - 1$ defined on the set of integers $t = 0, 1, 2, \ldots, T-1$. An
example is provided by the first three columns of the matrix $\Sigma^3_4$, which may be
transformed as follows:

(68)
$$\begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 6 & 3 & 1 \\ 10 & 6 & 3 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 1 \\ -2 & -1 & 1 \\ 1 & 0 & 0 \end{bmatrix}
= \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \end{bmatrix}.$$

The first column of the matrix on the LHS contains the ordinates of the
quadratic function $(t^2 + t)/2$. The columns of the transformed matrix are
recognisably the ordinates of the powers $t^0$, $t^1$ and $t^2$ corresponding to the integers $t = 1, 2, 3, 4$. The natural extension of the matrix to $T$ rows provides a
basis for the quadratic functions $q(t) = at^2 + bt + c$ defined on $T$ consecutive
integers.
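The transformation of (68) can be confirmed numerically. The sketch below assumes NumPy; the variable names are illustrative only.

```python
import numpy as np

T, p = 4, 3
nabla_p = np.linalg.matrix_power(np.eye(T) - np.eye(T, k=-1), p)
sigma_p = np.linalg.inv(nabla_p)          # threefold summation matrix Sigma_4^3
basis = sigma_p[:, :p]                    # its first three columns

transform = np.array([[1, 1, 1],
                      [-2, -1, 1],
                      [1, 0, 0]])
powers = basis @ transform                # should hold t^0, t^1, t^2 for t = 1, ..., 4
t = np.arange(1, T + 1)
```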
The matrix of the powers of the integers is notoriously illconditioned.
In calculating polynomial regressions of any degree in excess of the cubic, it
is advisable to employ a basis of orthogonal polynomials, for which purpose
some specialised numerical procedures are available. (See Pollock 1999.) In
the present context, which concerns econometric data sequences, the degrees of
diﬀerencing and summation rarely exceed two. Nevertheless, it is appropriate
to consider the algebra of the general case.
Consider, therefore, the matrix that takes the $p$th difference of a vector
of order $T$, which is

(69)  $\nabla^p_T = (I - L_T)^p.$

This matrix can be partitioned so that $\nabla^p_T = [Q_*, Q]'$, where $Q_*'$ has $p$ rows. If
$y$ is a vector of $T$ elements, then

(70)
$$\nabla^p_T y = \begin{bmatrix} Q_*' \\ Q' \end{bmatrix} y = \begin{bmatrix} g_* \\ g \end{bmatrix};$$

and $g_*$ is liable to be discarded, whereas $g$ will be regarded as the vector of the
$p$th differences of the data.
The inverse matrix may be partitioned conformably to give $\nabla^{-p}_T = [S_*, S]$.
It follows that

(71)
$$[\,S_* \;\; S\,]\begin{bmatrix} Q_*' \\ Q' \end{bmatrix} = S_*Q_*' + SQ' = I_T,$$

and that

(72)
$$\begin{bmatrix} Q_*' \\ Q' \end{bmatrix}[\,S_* \;\; S\,] = \begin{bmatrix} Q_*'S_* & Q_*'S \\ Q'S_* & Q'S \end{bmatrix} = \begin{bmatrix} I_p & 0 \\ 0 & I_{T-p} \end{bmatrix}.$$

If $g_*$ is available, then $y$ can be recovered from $g$ via

(73)  $y = S_*g_* + Sg.$

Since the submatrix $S_*$ provides a basis for all polynomials of degree
$p - 1$ that are defined on the integer points $t = 0, 1, \ldots, T-1$, it follows that
$S_*g_* = S_*Q_*'y$ contains the ordinates of a polynomial of degree $p - 1$, which is
interpolated through the first $p$ elements of $y$, indexed by $t = 0, 1, \ldots, p-1$,
and which is extrapolated over the remaining integers $t = p, p+1, \ldots, T-1$.
A polynomial that is designed to fit the data should take account of all
of the observations in $y$. Imagine, therefore, that $y = \phi + \eta$, where $\phi$ contains
the ordinates of a polynomial of degree $p - 1$ and $\eta$ is a disturbance term
with $E(\eta) = 0$ and $D(\eta) = \sigma^2_\eta I_T$. Then, in forming an estimate $x = S_*r_*$ of
$\phi$, we should minimise the sum of squares $\eta'\eta$. Since the polynomial is fully
determined by the elements of a starting-value vector $r_*$, this is a matter of
minimising

(74)  $(y - x)'(y - x) = (y - S_*r_*)'(y - S_*r_*)$

with respect to $r_*$. The resulting values are

(75)  $r_* = (S_*'S_*)^{-1}S_*'y$ and $x = S_*(S_*'S_*)^{-1}S_*'y.$

An alternative representation of the estimated polynomial is available.
This is provided by the identity

(76)  $S_*(S_*'S_*)^{-1}S_*' = I - Q(Q'Q)^{-1}Q'.$

To prove this identity, consider the fact that $Z = [Q, S_*]$ is a square matrix
of full rank and that $Q$ and $S_*$ are mutually orthogonal such that $Q'S_* = 0$.
Then

(77)
$$Z(Z'Z)^{-1}Z' = [\,Q \;\; S_*\,]\begin{bmatrix} (Q'Q)^{-1} & 0 \\ 0 & (S_*'S_*)^{-1} \end{bmatrix}\begin{bmatrix} Q' \\ S_*' \end{bmatrix} = Q(Q'Q)^{-1}Q' + S_*(S_*'S_*)^{-1}S_*'.$$

The result of (76) follows from the fact that $Z(Z'Z)^{-1}Z' = Z(Z^{-1}Z'^{-1})Z' = I$.
It follows from (76) that the vector of the ordinates of the polynomial regression
is also given by

(78)  $x = y - Q(Q'Q)^{-1}Q'y.$

Polynomial Regression and Trend Extraction
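The identity (76), and the equivalence of the two expressions (75) and (78) for the fitted polynomial, can be checked numerically for the case of a linear trend ($p = 2$). The sketch assumes NumPy; the helper `projector` and the data values are hypothetical and serve only as an illustration.

```python
import numpy as np

T, p = 7, 2
nabla_p = np.linalg.matrix_power(np.eye(T) - np.eye(T, k=-1), p)
Q = nabla_p[p:, :].T                       # T x (T-p); Q' holds the last T-p rows
S_star = np.linalg.inv(nabla_p)[:, :p]     # T x p; first p columns of the inverse

def projector(A):
    """Orthogonal projector A (A'A)^{-1} A' onto the column space of A."""
    return A @ np.linalg.solve(A.T @ A, A.T)

P_S = projector(S_star)
P_Q = projector(Q)

y = np.array([1.0, 2.5, 2.0, 4.0, 5.5, 5.0, 7.0])   # arbitrary illustrative data
x_direct = P_S @ y            # fitted line via equation (75)
x_via_Q = y - P_Q @ y         # the same line via equation (78)
```

Both expressions reproduce the ordinary least-squares line through the data, since the columns of $S_*$ span the constant and the linear function of $t$.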
The use of polynomial regression in a preliminary detrending of the data
is an essential part of a strategy for determining an appropriate representation
of the underlying trajectory of an econometric data sequence. Once the trend
has been eliminated from the data, one can proceed to assess their spectral
structure by examining the periodogram of the residual sequence.
Often the periodogram will reveal the existence of a cut-off frequency that
bounds a low-frequency trend/cycle component and separates it from the remaining elements of the spectrum.
An example is given in Figures 9 and 10. Figure 9 represents the logarithms of the quarterly data on aggregate consumption in the United Kingdom
for the years 1955 to 1994. Through these data, a linear trend has been interpolated by least-squares regression. This line establishes a benchmark of constant
exponential growth, against which the fluctuations of consumption can be measured. The periodogram of the residual sequence is plotted in Figure 10. This
shows that the low-frequency structure is bounded by a frequency value of
$\pi/8$. This value can be used in specifying the appropriate filter for extracting the
low-frequency trajectory of the data.

Figure 9. The quarterly series of the logarithms of consumption in the U.K., for the years 1955 to 1994, together with a linear trend interpolated by least-squares regression.

Figure 10. The periodogram of the residual sequence obtained from the linear detrending of the logarithmic consumption data.
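The procedure of linear detrending followed by an inspection of the periodogram can be sketched as follows. The consumption data are not reproduced here, so the example uses a synthetic series; the function names, the synthetic parameters and the scaling $2/T$ of the periodogram ordinates are assumptions made for the illustration, and NumPy is assumed.

```python
import numpy as np

def detrend_linear(y):
    """Residuals from a straight line fitted by ordinary least squares."""
    t = np.arange(len(y))
    slope, intercept = np.polyfit(t, y, 1)
    return y - (intercept + slope * t)

def periodogram(e):
    """Periodogram ordinates at the Fourier frequencies 2*pi*j/T in [0, pi]."""
    T = len(e)
    coeffs = np.fft.rfft(e)
    freqs = 2 * np.pi * np.arange(len(coeffs)) / T
    return freqs, (2.0 / T) * np.abs(coeffs) ** 2

# Synthetic quarterly series: a linear trend plus a cycle completing 5 periods in 160 points.
T = 160
t = np.arange(T)
y = 10.0 + 0.01 * t + 0.05 * np.cos(2 * np.pi * 5 * t / T)
e = detrend_linear(y)
freqs, pgram = periodogram(e)
```

In this artificial case, the periodogram has its peak at the frequency $2\pi \cdot 5/160 = \pi/16$, which lies within the interval $[0, \pi/8]$.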
Filters for Short Trended Sequences
One way of eliminating the trend is to take differences of the data. Usually,
twofold differencing is appropriate. The matrix analogue of the second-order
backwards difference operator in the case of $T = 5$ is given by

(79)
$$\nabla^2_5 = \begin{bmatrix} Q_*' \\ Q' \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -2 & 1 & 0 & 0 & 0 \\ 1 & -2 & 1 & 0 & 0 \\ 0 & 1 & -2 & 1 & 0 \\ 0 & 0 & 1 & -2 & 1 \end{bmatrix}.$$

The first two rows, which do not produce true differences, are liable to be
discarded. In general, the $p$-fold differences of a data vector of $T$ elements will
be obtained by premultiplying it by a matrix $Q'$ of order $(T - p) \times T$. Applying
$Q'$ to the equation $y = \xi + \eta$, representing the trended data, gives

(80)  $Q'y = Q'\xi + Q'\eta = \delta + \kappa = g.$

The vectors of the expectations and the dispersion matrices of the differenced
vectors are

(81)  $E(\delta) = 0, \quad D(\delta) = \Omega_\delta = Q'D(\xi)Q, \quad E(\kappa) = 0, \quad D(\kappa) = \Omega_\kappa = Q'D(\eta)Q.$

The difficulty of estimating the trended vector $\xi = y - \eta$ directly is that
some starting values or initial conditions are required in order to define the
value at time $t = 0$. However, since $\eta$ is from a stationary mean-zero process,
it requires only zero-valued initial conditions. Therefore, the starting-value
problem can be circumvented by concentrating on the estimation of $\eta$. The
conditional expectation of $\eta$, given the differenced data $g = Q'y$, is provided
by the formula

(82)  $h = E(\eta|g) = E(\eta) + C(\eta, g)D^{-1}(g)\{g - E(g)\} = C(\eta, g)D^{-1}(g)g,$

where the second equality follows in view of the zero-valued expectations.
Within this expression, there are

(83)  $D(g) = \Omega_\delta + Q'\Omega_\eta Q$ and $C(\eta, g) = \Omega_\eta Q.$
Putting these details into (82) gives the following estimate of $\eta$:

(84)  $h = \Omega_\eta Q(\Omega_\delta + Q'\Omega_\eta Q)^{-1}Q'y.$

Putting this into the equation

(85)  $x = E(\xi|g) = y - E(\eta|g) = y - h$

gives

(86)  $x = y - \Omega_\eta Q(\Omega_\delta + Q'\Omega_\eta Q)^{-1}Q'y.$

The Least-Squares Derivation of the Filter
As in the case of the extraction of a signal from a stationary process, the
estimate of the trended vector $\xi$ can also be derived according to a least-squares
criterion. The criterion is

(87)  Minimise $(y - \xi)'\Omega_\eta^{-1}(y - \xi) + \xi'Q\Omega_\delta^{-1}Q'\xi.$

The first term in this expression penalises the departures of the resulting curve
from the data, whereas the second term imposes a penalty for a lack of smoothness. Differentiating the function with respect to $\xi$ and setting the result to
zero gives

(88)  $\Omega_\eta^{-1}(y - x) = Q\Omega_\delta^{-1}Q'x = Q\Omega_\delta^{-1}d,$

where $x$ stands for the estimated value of $\xi$ and where $d = Q'x$. Premultiplying
by $Q'\Omega_\eta$ gives

(89)  $Q'(y - x) = Q'y - d = Q'\Omega_\eta Q\Omega_\delta^{-1}d,$

whence

(90)  $Q'y = d + Q'\Omega_\eta Q\Omega_\delta^{-1}d = (\Omega_\delta + Q'\Omega_\eta Q)\Omega_\delta^{-1}d,$

which gives

(91)  $\Omega_\delta^{-1}d = (\Omega_\delta + Q'\Omega_\eta Q)^{-1}Q'y.$

Putting this into

(92)  $x = y - \Omega_\eta Q\Omega_\delta^{-1}d,$

which comes from premultiplying (88) by $\Omega_\eta$, gives

(93)  $x = y - \Omega_\eta Q(\Omega_\delta + Q'\Omega_\eta Q)^{-1}Q'y,$

which is equation (86) again.
One should observe that

(94)  $\Omega_\eta Q(\Omega_\delta + Q'\Omega_\eta Q)^{-1}Q'y = \Omega_\eta Q(\Omega_\delta + Q'\Omega_\eta Q)^{-1}Q'e,$

where $e = Q(Q'Q)^{-1}Q'y$ is the vector of residuals obtained by interpolating a
straight line through the data by a least-squares regression. That is to say, it
makes no difference to the estimate of the component that is complementary
to the trend whether the filter is applied to the data vector $y$ or the residual
vector $e$. If the trend-estimation filter is applied to $e$ instead of to $y$, then the
resulting vector can be added to the ordinates of the interpolated line to create
the estimate of the trend.
The Leser (H–P) Filter and the Butterworth Filter
The specific cases that have been considered in the context of the classical
form of the Wiener–Kolmogorov filter can now be adapted to the circumstances
of short trended sequences. First, there is the Leser or H–P filter. This is
derived by setting

(95)  $D(\eta) = \Omega_\eta = \sigma^2_\eta I, \quad D(\delta) = \Omega_\delta = \sigma^2_\delta I \quad \text{and} \quad \lambda = \dfrac{\sigma^2_\eta}{\sigma^2_\delta}$

within (93) to give

(96)  $x = y - Q(\lambda^{-1}I + Q'Q)^{-1}Q'y.$

Here, $\lambda$ is the so-called smoothing parameter. It will be observed that, as
$\lambda \to \infty$, the vector $x$ tends to that of a linear function interpolated into the
data by least-squares regression, which is represented by equation (78). The
matrix expression $\Psi = I - Q(\lambda^{-1}I + Q'Q)^{-1}Q'$ for the filter can be compared to
the polynomial expression $\psi^c(z) = 1 - \psi(z)$ of the classical formulation, which
entails the $z$ transform from (45).
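A direct, dense implementation of the formula (96) can be sketched as follows. It assumes NumPy, it uses the hypothetical function names `second_difference_matrix` and `hp_trend`, and it operates on synthetic data; an efficient implementation would exploit the banded structure of $Q'Q$, which is not attempted here.

```python
import numpy as np

def second_difference_matrix(T):
    """Return Q (T x (T-2)), whose transpose Q' takes second differences."""
    nabla2 = np.linalg.matrix_power(np.eye(T) - np.eye(T, k=-1), 2)
    return nabla2[2:, :].T

def hp_trend(y, lam):
    """Trend estimate x = y - Q (lam^{-1} I + Q'Q)^{-1} Q'y of equation (96)."""
    T = len(y)
    Q = second_difference_matrix(T)
    A = np.eye(T - 2) / lam + Q.T @ Q
    return y - Q @ np.linalg.solve(A, Q.T @ y)

rng = np.random.default_rng(1)
t = np.arange(40)
y = 0.5 + 0.02 * t + 0.1 * rng.standard_normal(40)   # synthetic data (assumed)
trend = hp_trend(y, lam=1600.0)
near_linear = hp_trend(y, lam=1e12)                   # a very large smoothing parameter
ols_line = np.polyval(np.polyfit(t, y, 1), t)         # least-squares straight line
```

With a very large value of $\lambda$, the output is numerically indistinguishable from the least-squares line, in accordance with the limiting result cited above. A straight line is left unchanged by the filter for any $\lambda$, since its second differences are zero.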
The Butterworth filter that is appropriate to short trended sequences can
be represented by the equation

(97)  $x = y - \lambda\Sigma Q(M + \lambda Q'\Sigma Q)^{-1}Q'y.$

Here, the matrices

(98)  $\Sigma = \{2I_T - (L_T + L_T')\}^{n-2}$ and $M = \{2I_T + (L_T + L_T')\}^n$

are obtained from the RHS of the equations $\{(1 - z)(1 - z^{-1})\}^{n-2} = \{2 - (z + z^{-1})\}^{n-2}$ and $\{(1 + z)(1 + z^{-1})\}^n = \{2 + (z + z^{-1})\}^n$, respectively, by replacing
$z$ by $L_T$ and $z^{-1}$ by $L_T'$. Observe that the equalities no longer hold after the
replacements. However, it can be verified that

(99)  $Q'\Sigma Q = \{2I_T - (L_T + L_T')\}^n.$

Figure 11. The residual sequence from fitting a linear trend to the logarithmic consumption data with an interpolated line representing the business cycle, obtained by the frequency-domain method.

Filtering in the Frequency Domain
The method of Wiener–Kolmogorov filtering can also be implemented using the circulant dispersion matrices that are given by

(100)  $\Omega^\circ_\xi = \bar{U}\gamma_\xi(D)U, \quad \Omega^\circ_\eta = \bar{U}\gamma_\eta(D)U \quad \text{and} \quad \Omega^\circ = \Omega^\circ_\xi + \Omega^\circ_\eta = \bar{U}\{\gamma_\xi(D) + \gamma_\eta(D)\}U,$

wherein the diagonal matrices $\gamma_\xi(D)$ and $\gamma_\eta(D)$ contain the ordinates of the
spectral density functions of the component processes. Accounts of the algebra
of circulant matrices have been provided by Pollock (1999 and 2002). See, also,
Gray (2002).
Here, $U = T^{-1/2}[W^{jt}]$, wherein $t, j = 0, \ldots, T-1$, is the matrix of the
Fourier transform, of which the generic element in the $j$th row and $t$th column
is $W^{jt} = \exp(-i2\pi tj/T)$, and $\bar{U} = T^{-1/2}[W^{-jt}]$ is its conjugate transpose.
Also, $D = \mathrm{diag}\{1, W, W^2, \ldots, W^{T-1}\}$, which replaces $z$ within each of the
autocovariance generating functions, is a diagonal matrix whose elements are
the $T$ roots of unity, which are found on the circumference of the unit circle in
the complex plane.
By replacing the dispersion matrices within (55) and (56) by their circulant
counterparts, we derive the following formulae:

(101)  $x = \bar{U}\gamma_\xi(D)\{\gamma_\xi(D) + \gamma_\eta(D)\}^{-1}Uy = P_\xi y,$

(102)  $h = \bar{U}\gamma_\eta(D)\{\gamma_\xi(D) + \gamma_\eta(D)\}^{-1}Uy = P_\eta y.$

Similar replacements within the formulae (57) and (58) provide the expressions
for the error dispersion matrices that are appropriate to the circular filters.
The filtering formulae may be implemented in the following way. First, a
Fourier transform is applied to the data vector $y$ to give $Uy$, which resides in the
frequency domain. Then, the elements of the transformed vector are multiplied
by those of the diagonal weighting matrices $J_\xi = \gamma_\xi(D)\{\gamma_\xi(D) + \gamma_\eta(D)\}^{-1}$ and
$J_\eta = \gamma_\eta(D)\{\gamma_\xi(D) + \gamma_\eta(D)\}^{-1}$. Finally, the products are carried back into
the time domain by the inverse Fourier transform, which is represented by the
matrix $\bar{U}$.
An example of the method of frequency filtering is provided by Figure 11,
which shows the effect of applying a filter with a sharp cut-off at the frequency
value of $\pi/8$ radians per period to the residual sequence obtained from a linear
detrending of the quarterly logarithmic consumption data of the U.K.
This cut-off frequency has been chosen in reference to the periodogram
of the residual sequence, which is in Figure 10. This shows that the low-frequency structure of the data falls in the interval $[0, \pi/8]$. Apart from the
prominent spike at the seasonal frequency of $\pi/2$ and the smaller seasonal spike
at the frequency of $\pi$, the remainder of the periodogram is characterised by
wide spectral dead spaces.
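The three steps of transformation, weighting and inverse transformation can be sketched for the case of a filter with a sharp cut-off, for which the diagonal weights are units within the pass band and zeros elsewhere. The sketch assumes NumPy, whose FFT routines differ from $U$ and $\bar{U}$ only by scale factors that cancel; the function name `ideal_lowpass` and the synthetic data are hypothetical.

```python
import numpy as np

def ideal_lowpass(e, cutoff):
    """Zero every Fourier ordinate whose frequency exceeds `cutoff` (radians per period)."""
    T = len(e)
    coeffs = np.fft.fft(e)                              # carry the data to the frequency domain
    omega = 2 * np.pi * np.fft.fftfreq(T)               # Fourier frequencies in [-pi, pi)
    weights = (np.abs(omega) <= cutoff).astype(float)   # 1 in the pass band, 0 elsewhere
    return np.fft.ifft(weights * coeffs).real           # return to the time domain

T = 160
t = np.arange(T)
slow = np.cos(2 * np.pi * 4 * t / T)          # frequency pi/20, inside [0, pi/8]
fast = 0.5 * np.cos(2 * np.pi * 40 * t / T)   # frequency pi/2, outside the pass band
smooth = ideal_lowpass(slow + fast, cutoff=np.pi / 8)
```

The weights are symmetric about zero frequency, so the filtered sequence is real-valued. In this artificial example, the slow component is recovered exactly and the component at $\pi/2$ is removed.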
The ﬁlters described above are appropriate only to stationary processes.
However, they can be adapted in several alternative ways to cater to nonstationary processes. One way is to reduce the data to stationarity by twofold
diﬀerencing before ﬁltering it. After ﬁltering, the data may be reinﬂated by a
process of summation.
As before, let the original data be denoted by $y = \xi + \eta$ and let the
differenced data be $g = Q'y = \delta + \kappa$. If the estimates of $\delta = Q'\xi$ and $\kappa = Q'\eta$
are denoted by $d$ and $k$ respectively, then the estimates of $\xi$ and $\eta$ will be

(103)  $x = S_*d_* + Sd$ where $d_* = (S_*'S_*)^{-1}S_*'(y - Sd)$

and

(104)  $h = S_*k_* + Sk$ where $k_* = -(S_*'S_*)^{-1}S_*'Sk.$

Here, $d_*$ and $k_*$ are the initial conditions that are obtained via the minimisation
of the function

(105)  $(y - x)'(y - x) = (y - S_*d_* - Sd)'(y - S_*d_* - Sd) = (S_*k_* + Sk)'(S_*k_* + Sk) = h'h.$

The minimisation ensures that the estimated trend $x$ adheres as closely as
possible to the data $y$.
In the case where the data are differenced twice, there is

(106)
$$S_*' = \begin{bmatrix} 1 & 2 & \cdots & T-1 & T \\ 0 & 1 & \cdots & T-2 & T-1 \end{bmatrix}.$$

The elements of the matrix $S_*'S_*$ can be found via the formulae

(107)  $\displaystyle\sum_{t=1}^{T} t^2 = \frac{1}{6}T(T+1)(2T+1)$ and $\displaystyle\sum_{t=1}^{T} t(t-1) = \frac{1}{6}T(T+1)(2T+1) - \frac{1}{2}T(T+1).$

A compendium of such results has been provided by Jolly (1961), and proofs
of the present results were given by Hall and Knight (1899).
A fuller account of the implementation of the frequency ﬁlter has been
provided by Pollock (2009).
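The reinflation of (103), with the initial conditions $d_*$ determined by least squares, can be sketched as follows for the case of twofold differencing. The sketch assumes NumPy; the function names `summation_blocks` and `reinflate` and the test series are hypothetical. Two limiting cases serve as checks: if nothing is removed from the differenced data ($d = g$), then the original data are recovered; if everything is removed ($d = 0$), then the estimate is the least-squares straight line.

```python
import numpy as np

def summation_blocks(T, p=2):
    """Partition the inverse of the p-fold difference matrix as [S_*, S]."""
    nabla_p = np.linalg.matrix_power(np.eye(T) - np.eye(T, k=-1), p)
    sigma_p = np.linalg.inv(nabla_p)
    return sigma_p[:, :p], sigma_p[:, p:]

def reinflate(y, d, p=2):
    """Estimate x = S_* d_* + S d of (103), with d_* = (S_*'S_*)^{-1} S_*'(y - S d)."""
    S_star, S = summation_blocks(len(y), p)
    d_star = np.linalg.solve(S_star.T @ S_star, S_star.T @ (y - S @ d))
    return S_star @ d_star + S @ d

T = 12
y = np.cumsum(np.cumsum(np.sin(np.arange(T))))      # arbitrary trended data (assumed)
Q_t = np.linalg.matrix_power(np.eye(T) - np.eye(T, k=-1), 2)[2:, :]   # the matrix Q'
g = Q_t @ y                                         # twice-differenced data
x_all = reinflate(y, g)                             # d = g: nothing filtered out
x_zero = reinflate(y, np.zeros(T - 2))              # d = 0: everything filtered out
```

For $T = 12$, the formulae of (107) give 650 and 572 for the first two distinct elements of $S_*'S_*$, which agree with the matrix computed by `summation_blocks`.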
Example. Before applying a frequency-domain filter, it is necessary to ensure
that the data are free of trend. If a trend is detected, then it may be removed
from the data by subtracting an interpolated polynomial trend function. A test
for the presence of a trend is required that differs from the tests that are used
to detect the presence of unit roots in the processes generating the data. This
is provided by the significance test associated with the ordinary least-squares
estimate of a linear trend.
There is a simple means of calculating the adjusted sum of squares of the
temporal index $t = 0, 1, \ldots, T-1$, which is entailed in the calculation of the
slope coefficient

(108)
$$b = \frac{\sum t y_t - (\sum t)(\sum y_t)/T}{\sum t^2 - (\sum t)^2/T}.$$

The formulae

(109)  $\displaystyle\sum_{t=0}^{T-1} t^2 = \frac{1}{6}(T-1)T(2T-1)$ and $\displaystyle\sum_{t=0}^{T-1} t = \frac{T(T-1)}{2}$

are combined to provide a convenient means of calculating the denominator of
the formula of (108):

(110)
$$\sum_{t=0}^{T-1} t^2 - \frac{1}{T}\Bigl(\sum_{t=0}^{T-1} t\Bigr)^2 = \frac{(T-1)T(T+1)}{12}.$$

Another means of calculating the low-frequency trajectory of the data via
the frequency domain mimics the method of equation (93) by concentrating
on the estimation of the high-frequency component. This can be subtracted from
the data to create an estimate of the complementary low-frequency trend component. However, whereas, in the case of equation (93), the differencing of
the data and the reinflation of the estimated high-frequency component are
deemed to take place in the time domain, now the reinflation occurs in the frequency domain before the resulting vector of Fourier coefficients is transformed
to the time domain.
The reduction of a trended data sequence to stationarity continues to be
effected by the matrix $Q'$ but, in this case, the matrix can be seen in the context
of a centralised difference operator. This is

(111)  $N(z) = z^{-1} - 2 + z = z^{-1}(1 - z)^2 = z^{-1}\nabla^2(z).$

The matrix version of the operator is obtained by setting $z = L_T$ and $z^{-1} = L_T'$,
which gives

(112)  $N(L_T) = N_T = L_T - 2I_T + L_T'.$

The first and the final rows of this matrix do not deliver true differences. Therefore, they are liable to be deleted, with the effect that the two end points are
lost from the twice-differenced data. Deleting the rows $e_0'N_T$ and $e_{T-1}'N_T$ from
$N_T$ gives the matrix $Q'$, which can also be obtained from $\nabla^2_T = (I_T - L_T)^2$ by
deleting the matrix $Q_*'$, which comprises the first two rows $e_0'\nabla^2_T$ and $e_1'\nabla^2_T$. In
the case of $T = 5$ there is

(113)
$$N_5 = \begin{bmatrix} Q_{-1}' \\ Q' \\ Q_{+1}' \end{bmatrix} = \begin{bmatrix} -2 & 1 & 0 & 0 & 0 \\ 1 & -2 & 1 & 0 & 0 \\ 0 & 1 & -2 & 1 & 0 \\ 0 & 0 & 1 & -2 & 1 \\ 0 & 0 & 0 & 1 & -2 \end{bmatrix}.$$

On deleting the first and last elements of the vector $N_Ty$, which are $Q_{-1}'y = e_1'\nabla^2_Ty$ and $Q_{+1}'y$, respectively, we get $Q'y = [q_1, \ldots, q_{T-2}]'$.
The loss of the two elements from either end of the (centrally) twice-differenced data can be overcome by supplementing the original data vector $y$
with two extrapolated end points $y_{-1}$ and $y_T$. Alternatively, the differenced
data may be supplemented by attributing appropriate values to $q_0$ and $q_{T-1}$.
These could be zeros or some combination of the adjacent values. In either
case, we will obtain a vector of order $T$ denoted by $q = [q_0, q_1, \ldots, q_{T-1}]'$.
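The construction of $N_T$ and the two routes to $Q'$ can be illustrated for $T = 5$. The sketch assumes NumPy; the data vector and the choice of padding the ends of $q$ with zeros are illustrative assumptions, the latter being one of the options mentioned above.

```python
import numpy as np

T = 5
L = np.eye(T, k=-1)                          # lag operator L_T
N = L - 2 * np.eye(T) + L.T                  # N_T of equation (112)
nabla2 = np.linalg.matrix_power(np.eye(T) - L, 2)

Q_t = N[1:-1, :]                             # Q': delete the first and last rows of N_T
y = np.array([1.0, 4.0, 9.0, 16.0, 25.0])    # illustrative data: the squares of 1, ..., 5
q = np.concatenate(([0.0], Q_t @ y, [0.0]))  # q_0 and q_{T-1} set to zero
```

The second differences of a quadratic sequence are constant, so the interior elements of `q` are all equal to 2 in this example.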
In describing the method for implementing a highpass filter, let $\Lambda$ be the
matrix that selects the appropriate ordinates of the Fourier transform $\gamma = Uq$
of the twice-differenced data. These ordinates must be reinflated to compensate
for the differencing operation, which has the frequency response

(114)  $f(\omega) = 2 - 2\cos(\omega).$

The response of the anti-differencing operation is $1/f(\omega)$; and $\gamma$ is reinflated
by premultiplying by the diagonal matrix

(115)  $V = \mathrm{diag}\{v_0, v_1, \ldots, v_{T-1}\},$

comprising the values $v_j = 1/f(\omega_j)$; $j = 0, \ldots, T-1$, where $\omega_j = 2\pi j/T$.
Let $H = V\Lambda$ be the matrix that is applied to $\gamma = Uq$ to generate the
Fourier ordinates of the filtered vector. The resulting vector is transformed to
the time domain to give

(116)  $h = \bar{U}H\gamma = \bar{U}HUq.$

It will be seen that $f(\omega)$ is zero-valued when $\omega = 0$ and that $1/f(\omega)$ is
unbounded in the neighbourhood of $\omega = 0$. Therefore, a frequency-domain
reinflation is available only when there are no nonzero Fourier ordinates in this
neighbourhood. That is to say, it can work only in conjunction with highpass or
bandpass filtering. However, it is straightforward to construct a lowpass filter
that complements the highpass filter. The low-frequency trend component that
is complementary to $h$ is

(117)  $x = y - h = y - \bar{U}HUq.$

Business Cycles and Spurious Cycles
Econometricians continue to debate the question of how macroeconomic
data sequences should be decomposed into their constituent components. These
components are usually described as the trend, the cyclical component or the
business cycle, the seasonal component and the irregular component.
For the original data, the decomposition is usually a multiplicative one
and, for the logarithmic data, the corresponding decomposition is an additive
one. The ﬁlters are usually applied to the logarithmic data, in which case, the
sum of the estimated components should equal the logarithmic data.
In the case of the Wiener–Kolmogorov ﬁlters, and of the frequencydomain
ﬁlters as well, the ﬁlter gain never exceeds unity. Therefore, every lowpass ﬁlter
ψ (z ) is accompanied by a complementary highpass ﬁlter ψ c (z ) = 1 − ψ (z ). The
two sequences resulting from these ﬁlters can be recombined to create the data
sequence from which they have originated.
Such filters can be applied sequentially to create an additive decomposition
of the data. First, the trend is extracted. Then, the cyclical component is
extracted from the detrended data. Finally, the residue can be decomposed
into the seasonal and the irregular components.
Within this context, the manner in which any component is deﬁned and
how it is extracted are liable to aﬀect the deﬁnitions of all of the other components. In particular, variations in the deﬁnition of the trend will have substantial eﬀects upon the representation of the business cycle.
It has been the contention of several authors, including Harvey and Jaeger
(1993) and Cogley and Nason (1995), that the eﬀect of using the Hodrick–
Prescott ﬁlter to extract a trend from the data is to create or induce spurious
cycles in the complementary component, which includes the cyclical component.
Others have declared that such an outcome is impossible. They point to
the fact that, since their gains never exceed unity, the filters cannot introduce
anything into the data, nor can they amplify anything that is already present.
On this basis, it can be fairly asserted that, at least, the verbs to create and
to induce have been misapplied, and that the use of the adjective spurious is
doubtful.

Figure 12. The pseudo-spectrum of a random walk, labelled A, together with the squared gain of the highpass Hodrick–Prescott filter with a smoothing parameter of $\lambda = 100$, labelled B. The curve labelled C represents the spectrum of the filtered process.
The analyses of Harvey and Jaeger and of Cogley and Nason have both
depicted the eﬀects of applying the Hodrick–Prescott ﬁlter to a theoretical
random walk that is supported on a doublyinﬁnite set of integers. They show
that the spectral density function of the ﬁltered process possesses a peak in the
lowfrequency region that is based on a broad range of frequencies. This seems
to suggest that there is cyclicality in the processed data, whereas the original
random walk has no central tendency.
This analysis is illustrated in Figure 12. The curve labelled A is the pseudo
spectrum of a ﬁrstorder random walk. The curve labelled B is the squared
modulus of the frequency response of the highpass, detrending, ﬁlter with a
smoothing parameter of 100. The curve labelled C is the spectral density
function of a detrended sequence which, in theory, would be derived by applying
the ﬁlter to the random walk.
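The three curves can be reproduced, up to a common scale factor, by a short calculation. The sketch below assumes NumPy and assumes the standard frequency response of the highpass Hodrick–Prescott filter, which is $\lambda f^2/(1 + \lambda f^2)$ with $f(\omega) = 2 - 2\cos(\omega)$; this form is not restated in the present section and the variance scaling of the spectrum is omitted, so the ordinates need not match the vertical scale of Figure 12.

```python
import numpy as np

lam = 100.0
omega = np.linspace(0.01, np.pi, 2000)     # omega = 0 is omitted, where curve A is unbounded
f = 2 - 2 * np.cos(omega)                  # squared gain of the difference operator

pseudo_spectrum = 1.0 / f                                   # curve A, up to a scale factor
highpass_gain_sq = (lam * f**2 / (1 + lam * f**2)) ** 2     # curve B (assumed standard form)
filtered_spectrum = highpass_gain_sq * pseudo_spectrum      # curve C, up to a scale factor

peak_freq = omega[np.argmax(filtered_spectrum)]
```

Under these assumptions, the product is $\lambda^2 f^3/(1 + \lambda f^2)^2$, which has its maximum where $f^2 = 3/\lambda$. With $\lambda = 100$, this places the peak at roughly 0.42 radians, which is within the low-frequency region.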
The fault of the Hodrick–Prescott ﬁlter may be that it allows elements of
the data at certain frequencies to be transmitted when, ideally, they should be
blocked. However, it seems that an analysis based on a doublyinﬁnite random
walk is of doubtful validity.
The eﬀects that are depicted in Figure 12 are due largely to the unbounded
nature of the pseudo spectrum labelled A, and, as we have already declared,
there is a zero probability that, at any given time, the value generated by the
random walk will fall within a ﬁnite distance of the horizontal axis.
An alternative analysis of the filter can be achieved by examining the
effects of its finite-sample version upon a finite and bounded sequence that has
been detrended by the interpolation of a linear regression function, according
to the ordinary least-squares criterion.

Figure 13. The residual sequence obtained by extracting a linear trend from the logarithmic consumption data, together with a low-frequency trajectory that has been obtained via the lowpass Hodrick–Prescott filter.

Figure 14. The quarterly logarithmic consumption data together with a trend interpolated by the lowpass Hodrick–Prescott filter with the smoothing parameter set to $\lambda = 1{,}600$.

Figure 15. The residual sequence obtained by using the Hodrick–Prescott filter to extract the trend, together with a fluctuating component obtained by subjecting the sequence to a lowpass frequency-domain filter with a cut-off point at $\pi/8$ radians.
If y is the vector of the data and if PQ = Q(Q Q)−1 Q , where Q is the
secondorder diﬀerence operator, then the vector of the ordinates of the linear
regression is (I − PQ )y , and the detrended vector is the residual vector e = PQ y .
The highpass Hodrick–Prescott ﬁlter ΨH = Q(λ−1 I + Q Q)−1 Q will generate
the same output from the linearly detrended data as from the original data.
Thus, it follows from (94) that ΨH y = ΨH e.
In characterising the effects of the filter, it is reasonable to compare the
linearly detrended data $e = P_Q y$ with the output $\Psi_H y$ of the filter. In the case
of the logarithmic consumption data, these sequences are represented by the
jagged lines that are, respectively, the backdrops to Figures 13 and 15.
Superimposed upon the residual sequence $e = P_Q y$ of Figure 13 is the
low-frequency trajectory $(I - \Psi_H)P_Q y = (I - \Psi_H)e$ that has been obtained
by subjecting $e$ to the lowpass Hodrick–Prescott filter with a smoothing parameter of 1,600.
Figure 14 shows the quarterly logarithmic consumption data together with
a trend $x = (I - \Psi_H)y$ interpolated by the lowpass Hodrick–Prescott filter. This
trend can be obtained by adding the smooth trajectory $(I - \Psi_H)e$ of Figure
13 to the linear trend $(I - P_Q)y$. That is to say,
$$
\begin{aligned}
(I - \Psi_H)y &= (I - \Psi_H)\{P_Q y + (I - P_Q)y\} \\
&= (I - \Psi_H)e + (I - P_Q)y,
\end{aligned}
\tag{118}
$$
which follows since $(I - \Psi_H)(I - P_Q) = (I - P_Q)$. (An implication of this
identity is that a linear trend will be preserved by the lowpass H–P filter.)
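The matrix identities of this passage can be checked numerically. The following is a minimal sketch in Python with NumPy, which is not part of the original chapter. The function names, the sample size and the artificial data are all assumptions made for illustration, and the matrix `Qt` plays the role of $Q'$.

```python
import numpy as np

def second_difference_matrix(T):
    """Return Q' of shape (T-2, T); each row holds the weights 1, -2, 1."""
    Qt = np.zeros((T - 2, T))
    for i in range(T - 2):
        Qt[i, i:i + 3] = [1.0, -2.0, 1.0]
    return Qt

def hp_highpass(T, lam):
    """Highpass H-P filter matrix Psi_H = Q (I/lam + Q'Q)^{-1} Q'."""
    Qt = second_difference_matrix(T)
    return Qt.T @ np.linalg.solve(np.eye(T - 2) / lam + Qt @ Qt.T, Qt)

# Artificial data (an assumption): a linear trend plus a scaled random walk.
T, lam = 60, 1600.0
rng = np.random.default_rng(0)
t = np.arange(T)
y = 10.0 + 0.01 * t + 0.02 * rng.standard_normal(T).cumsum()

Qt = second_difference_matrix(T)
P_Q = Qt.T @ np.linalg.solve(Qt @ Qt.T, Qt)  # P_Q = Q (Q'Q)^{-1} Q'
Psi_H = hp_highpass(T, lam)
I = np.eye(T)

e = P_Q @ y           # linearly detrended data
line = (I - P_Q) @ y  # ordinates of the least-squares regression line

same_output = np.allclose(Psi_H @ y, Psi_H @ e)                    # Psi_H y = Psi_H e
trend_identity = np.allclose((I - Psi_H) @ y, (I - Psi_H) @ e + line)  # equation (118)
line_preserved = np.allclose((I - Psi_H) @ line, line)             # linear trend passes unchanged
```

All three flags should evaluate to `True`, since $Q'(I - P_Q) = 0$ implies that $\Psi_H$ annihilates the regression line.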
Superimposed upon the jagged sequence $\Psi_H e$ of Figure 15 is the smoothed
sequence $\Psi^{\circ}_{\xi}\Psi_H e$, where $\Psi^{\circ}_{\xi}$ is the lowpass frequency-domain filter with a cutoff
at π/8 radians, which is the value that has been determined from the inspection
of the periodogram of Figure 10.
Now, a comparison can be made of the smooth trajectory $\Psi^{\circ}_{\xi}e = \Psi^{\circ}_{\xi}P_Q y$
of Figure 11, which has been determined via linear detrending, and which
has been regarded as an appropriate representation of the business cycle, with
the trajectory $x^{\circ} = \Psi^{\circ}_{\xi}\Psi_H y$ of Figure 15, which has been determined via the
Hodrick–Prescott filter.
Whereas the same essential fluctuations are present in both trajectories,
it is apparent that the more flexible detrending of the Hodrick–Prescott filter
has served to reduce and to regularise their amplitudes. Thus, some proportion
of the fluctuations, which ought to be present in the trajectory of the business
cycle, has been transferred into the trend.
Thus, although it cannot be said that the Hodrick–Prescott filter induces
spurious fluctuations in the filtered sequence, it is true that it enhances
the regularity of some of the fluctuations that are present in the data. However,
the same can be said, without exception, of any frequency-selective filter.
A prescription for estimating the trend is that it should be maximally
stiff, unless it is required to accommodate a structural break. The trend is to
be regarded as a benchmark with which to measure the cyclical fluctuations.
In times of normal economic activity, a log-linear trend, which represents a
trajectory of constant exponential growth, may be appropriate. At other times,
the trend should be allowed to adapt to reflect untoward events.

Figure 16. The logarithms of annual U.K. real GDP from 1873 to 2001 with an
interpolated trend. The trend is estimated via a filter with a variable smoothing
parameter.
A device that achieves this is available in the form of a version of the H–P
ﬁlter that has a smoothing parameter that is variable over the sample. When
the trajectory of the trend is required to accommodate a structural break, the
smoothing parameter λ can be set to a value close to zero within the appropriate
locality. Elsewhere, it can be given a high value to ensure that a stiﬀ curve
is created. Such a ﬁlter is available in the IDEOLOG computer program, of
which the web address will be given at the end of the chapter.
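One way in which a locally variable smoothing parameter might be realised is sketched below in Python with NumPy. This is an illustrative assumption and not a description of the IDEOLOG code: the trend is taken to minimise the sum of squared deviations plus a penalty on the squared second differences in which each difference has its own weight $\lambda_t$, which gives $x = (I + Q\Lambda Q')^{-1}y$. The function name `hp_trend_variable` and the artificial series with a level shift are hypothetical.

```python
import numpy as np

def hp_trend_variable(y, lam):
    """Trend minimising sum (y_t - x_t)^2 + sum lam_t * (second difference of x)_t^2.

    lam may be a scalar (the ordinary H-P filter) or an array of length T-2,
    one weight for each second difference.
    """
    y = np.asarray(y, dtype=float)
    T = y.size
    Qt = np.zeros((T - 2, T))  # Q', the second-difference matrix
    for i in range(T - 2):
        Qt[i, i:i + 3] = [1.0, -2.0, 1.0]
    lam = np.broadcast_to(np.asarray(lam, dtype=float), (T - 2,))
    A = np.eye(T) + Qt.T @ (lam[:, None] * Qt)  # I + Q Lambda Q'
    return np.linalg.solve(A, y)

# Artificial example (an assumption): steady growth with a level shift at t = 50.
T = 100
t = np.arange(T)
y = 0.02 * t - 0.5 * (t >= 50)

stiff = hp_trend_variable(y, 1e6)   # a stiff curve throughout
lam = np.full(T - 2, 1e6)
lam[47:51] = 1e-8                   # relax the penalty in the locality of the break
flexible = hp_trend_variable(y, lam)
```

In this artificial case, the stiff trend smooths over the break, whereas the trend with the locally reduced parameter absorbs it almost exactly.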
Figure 16 shows an example of the use of this filter. There were brief
disruptions to the steady upwards progress of GDP in the U.K. after the two
world wars. These breaks have been absorbed into the trend by reducing the
value of the smoothing parameter in their localities, which are highlighted in
the ﬁgure. By contrast, the break that is evident in the data following the year
1929 has not been accommodated in the trend.
Seasonal Adjustment in the Time Domain
The seasonal adjustment of economic data is performed preponderantly by
central statistical agencies. The prevalent methods continue to be those that
were developed by the U.S. Bureau of the Census and which are encapsulated in
the X-11 computer program and its derivatives X-11-ARIMA and X-12. The
X-11 program was the culmination of the pioneering work of Julius Shiskin in
the 1960s. (See Shiskin et al. 1967.)
The X-11 program, which is difficult to describe concisely, depends on the
successive application of the time-honoured Henderson moving-average filters
that have proved to be very effective in practice but which lack a firm foundation
in the modern theory of filtering. An extensive description of the program has
been provided by Ladiray and Quenneville (2001).
Recently, some alternative methods of seasonal adjustment have been making
headway amongst central statistical agencies. Foremost amongst these is
the ARIMA-model-based method of the TRAMO–SEATS package. Within this
program, the TRAMO (Time Series Regression with ARIMA Noise, Missing
Observations and Outliers) module estimates a model of the composite process.
Thereafter, the estimated parameters are taken to be the true parameters of the
process, and they are passed to the SEATS (Signal Extraction in ARIMA Time
Series) module, which extracts the components of the data.
The program employs the airline passenger model of Box and Jenkins
(1976) as its default model. This is represented by the equation
$$
y(z) = \frac{N(z)}{P(z)}\,\varepsilon(z)
     = \frac{(1 - \rho z)(1 - \theta z^{s})}{(1 - z)(1 - z^{s})}\,\varepsilon(z),
\tag{119}
$$
where $N(z)$ and $P(z)$ are polynomial operators and $y(z)$ and $\varepsilon(z)$ are,
respectively, the z-transforms of the output sequence $y(t) = \{y_t;\ t = 0, \pm 1, \pm 2, \ldots\}$
and of the input sequence $\varepsilon(t) = \{\varepsilon_t;\ t = 0, \pm 1, \pm 2, \ldots\}$ of unobservable
white-noise disturbances. The integer $s$ stands for the number of periods in the year,
which is $s = 4$ for quarterly data and $s = 12$ for monthly data. Without loss
of generality as far as the derivation of the filters is concerned, the variance of
the input sequence can be set to unity.
Given the identity $1 - z^{s} = (1 - z)\Sigma(z)$, where $\Sigma(z) = 1 + z + \cdots + z^{s-1}$
is the seasonal summation operator, it follows that
$$
P(z) = (1 - z)(1 - z^{s}) = \nabla^{2}(z)\Sigma(z),
\tag{120}
$$
where $\nabla(z) = 1 - z$ is the backward difference operator. The polynomial $\Sigma(z)$
has zeros at the points $\exp\{i(2\pi/s)j\}$; $j = 1, 2, \ldots, s - 1$, which are located on
the circumference of the unit circle in the complex plane at angles from the
horizontal that correspond to the fundamental seasonal frequency $\omega_s = 2\pi/s$
and its harmonics.
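The factorisation and the location of the zeros can be confirmed numerically. The following sketch in Python with NumPy is an illustration that is not part of the original text. Polynomials are stored as coefficient arrays in ascending powers of $z$, and the variable names are assumptions.

```python
import numpy as np

s = 4  # quarterly data

nabla = np.array([1.0, -1.0])                          # 1 - z
sigma = np.ones(s)                                     # 1 + z + ... + z^(s-1)
seasonal_diff = np.r_[1.0, np.zeros(s - 1), -1.0]      # 1 - z^s

# The identity 1 - z^s = (1 - z) Sigma(z); polynomial products are convolutions.
identity_holds = np.allclose(np.convolve(nabla, sigma), seasonal_diff)

# Equation (120): (1 - z)(1 - z^s) = nabla^2(z) Sigma(z).
P = np.convolve(nabla, seasonal_diff)
P_alt = np.convolve(np.convolve(nabla, nabla), sigma)

# Zeros of Sigma(z); np.roots expects coefficients in descending powers.
zeros = np.roots(sigma[::-1])
angles = np.sort(np.mod(np.angle(zeros), 2.0 * np.pi))
expected_angles = 2.0 * np.pi * np.arange(1, s) / s    # seasonal frequency and harmonics
```

For $s = 4$, the zeros lie at the angles $\pi/2$, $\pi$ and $3\pi/2$, each with a modulus of unity.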
The TRAMO–SEATS program effects a decomposition of the data into a
seasonal component and a nonseasonal component that are described by
statistically independent processes driven by separate white-noise forcing functions.
It espouses the principle of canonical decompositions that has been expounded
by Hillmer and Tiao (1982).
The first step in this decomposition entails the following partial-fraction
decomposition of the generating function of the autocovariances of $y(t)$:
$$
\frac{N(z^{-1})N(z)}{P(z^{-1})P(z)}
  = \frac{U^{*}(z^{-1})U^{*}(z)}{\nabla^{2}(z^{-1})\nabla^{2}(z)}
  + \frac{V^{*}(z^{-1})V^{*}(z)}{\Sigma(z^{-1})\Sigma(z)}
  + \rho\theta.
\tag{121}
$$
Here, $\rho\theta$ is the quotient of the division of $N(z^{-1})N(z)$ by $P(z^{-1})P(z)$, which
must occur before the remainder, which will be a proper fraction, can be decomposed.
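That the quotient of the division is $\rho\theta$ can be confirmed by comparing the leading coefficients of the numerator and the denominator, which have the same degree. The following Python sketch, which is an illustration and not part of the original text, uses the parameter values quoted in the chapter for the consumption data. The helper `acgf` and the variable names are assumptions.

```python
import numpy as np

s, rho, theta = 4, 0.1698, 0.6248

# Coefficients in ascending powers of z.
N = np.convolve([1.0, -rho], np.r_[1.0, np.zeros(s - 1), -theta])  # (1 - rho z)(1 - theta z^s)
P = np.convolve([1.0, -1.0], np.r_[1.0, np.zeros(s - 1), -1.0])    # (1 - z)(1 - z^s)

def acgf(poly):
    """Coefficients of poly(1/z) * poly(z), ordered from power -deg to +deg."""
    return np.convolve(poly, poly[::-1])

NN, PP = acgf(N), acgf(P)

# Both products have terms from z^-(s+1) to z^(s+1), so the quotient is the
# ratio of their leading coefficients.
quotient = NN[-1] / PP[-1]

# The remainder has no terms in z^-(s+1) or z^(s+1), so it is of lower degree
# than the divisor and the remaining fraction is proper.
remainder = NN - quotient * PP
```

The extreme coefficients of `remainder` vanish, which is what allows the remaining fraction to be decomposed into the trend and seasonal terms.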
In the preliminary decomposition of (121), the ﬁrst term on the RHS corresponds to the trend component, the second term corresponds to the seasonal
component and the third term corresponds to the irregular component. Hillmer
and Tiao have provided expressions for the numerators of the RHS, which are
somewhat complicated, albeit that the numerators can also be found by numerical means.
When $z = e^{i\omega}$, equation (121) provides the spectral ordinates of the process
and of its components at the frequency value of $\omega$. The corresponding spectral
density functions are obtained by letting $\omega$ run from 0 to $\pi$. The quotient $\rho\theta$
corresponds to the spectrum of a white-noise process, which is constant over the
frequency range.
The principle of canonical decomposition proposes that the estimates of the
trend and of the seasonal component should be devoid of any elements of white
noise. Therefore, their spectra must be zero-valued at some point in the interval
$[0, \pi]$. Let $q_T$ and $q_S$ be the minima of the spectral density functions associated
with the trend and the seasonal components respectively. By subtracting these
values from their respective components, a revised decomposition is obtained
that fulfils the canonical principle. This is
$$
\frac{N(z^{-1})N(z)}{P(z^{-1})P(z)}
  = \frac{U(z^{-1})U(z)}{\nabla^{2}(z^{-1})\nabla^{2}(z)}
  + \frac{V(z^{-1})V(z)}{\Sigma(z^{-1})\Sigma(z)}
  + q,
\tag{122}
$$
where $q = \rho\theta + q_T + q_S$.
The Wiener–Kolmogorov principle of signal extraction indicates that the
filter that serves to extract the trend from the data sequence $y(t)$ should take
the form of
$$
\beta_T(z)
  = \frac{U(z^{-1})U(z)}{\nabla^{2}(z^{-1})\nabla^{2}(z)} \times \frac{P(z^{-1})P(z)}{N(z^{-1})N(z)}
  = \frac{U(z^{-1})U(z)}{N(z^{-1})N(z)} \times \Sigma(z^{-1})\Sigma(z).
\tag{123}
$$
This is the ratio of the autocovariance generating function of the trend
component to that of the process as a whole. This filter nullifies the seasonal
component in the process of extracting a trend that is relatively free of
high-frequency elements. The nullification of the seasonal component is due to the
factor $\Sigma(z)$.
The squared gain of the filter that serves to extract the trend from the
quarterly logarithmic consumption data of Figure 9 is shown in Figure 17.
This filter is derived from a model of the data based on equation (119), where
$s = 4$ and where $\rho = 0.1698$ and $\theta = 0.6248$ are estimated parameters that
determine the polynomial $N(z)$. The estimated trend is shown in Figure 18.
The filter that serves to extract the seasonal component from the data is
constructed on the same principle as the trend-extraction filter. It takes the
form of
$$
\beta_S(z) = \frac{V(z^{-1})V(z)}{N(z^{-1})N(z)} \times \nabla^{2}(z^{-1})\nabla^{2}(z).
\tag{124}
$$
Figure 17. The squared gain of the filter for extracting the trend from the
logarithmic consumption data.

The filter that serves the purposes of seasonal adjustment, and which nullifies
the seasonal component without further attenuating the high-frequency
elements of the data, is
$$
\beta_A(z) = 1 - \beta_S(z).
\tag{125}
$$
The squared gain of the seasonal adjustment filter that is derived from the
model of the logarithmic consumption data is shown in Figure 19 and the
seasonal component that is extracted from the data is shown in Figure 20.
Various procedures are available for effecting the canonical decomposition
of the data. The method that is followed by the SEATS program is one that was
expounded in a paper of Burman (1980), which depends on a partial-fraction
decomposition of the filter itself. The decomposition of the generic filter takes
the form of
$$
\beta(z) = \frac{C(z)}{N(z)N(z^{-1})} = \frac{D(z)}{N(z)} + \frac{D(z^{-1})}{N(z^{-1})}.
\tag{126}
$$
Compared with the previous approaches associated with the time-domain filters,
this is a matter of implementing the filter via components that are joined in
parallel rather than in series.
The estimate of the seasonal component obtained by Burman’s method is
therefore
$$
x(z) = f(z) + b(z) = \frac{D(z)}{N(z)}\,y(z) + \frac{D(z^{-1})}{N(z^{-1})}\,y(z).
\tag{127}
$$
Thus, a component $f(t)$ is obtained by running forwards through the data, and
a component $b(t)$ is obtained by running backwards through the data.
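The parallel arrangement of (126) and (127) can be illustrated by a small numerical sketch in Python with NumPy, which is not part of the original text. The function names are hypothetical, the polynomial $N(z) = 1 - 0.5z$ with $C(z) = 1$ is an artificial example, and the recursions are started with zero values, which is a simplifying assumption that differs from the treatment of the end-of-sample values in the SEATS program.

```python
import numpy as np

def burman_split(c, n):
    """Solve D(z)N(1/z) + D(1/z)N(z) = C(z) for the coefficients d_0..d_p of D.

    n : coefficients n_0..n_p of N(z), in ascending powers of z.
    c : c_0..c_p, where C(z) = c_0 + sum_k c_k (z^k + z^-k).
    """
    n = np.asarray(n, dtype=float)
    p = n.size - 1
    A = np.zeros((p + 1, p + 1))
    for k in range(p + 1):          # match the coefficient of z^k
        for j in range(p + 1):
            if j - k >= 0:
                A[k, j] += n[j - k]  # contribution of D(z)N(1/z)
            if j + k <= p:
                A[k, j] += n[j + k]  # contribution of D(1/z)N(z)
    return np.linalg.solve(A, np.asarray(c, dtype=float))

def recursive_filter(d, n, x):
    """Run N(L) f(t) = D(L) x(t) forwards, with zero starting values."""
    f = np.zeros(len(x))
    for t in range(len(x)):
        acc = sum(d[k] * x[t - k] for k in range(len(d)) if t - k >= 0)
        acc -= sum(n[k] * f[t - k] for k in range(1, len(n)) if t - k >= 0)
        f[t] = acc / n[0]
    return f

def two_pass(d, n, y):
    """x = f + b: one pass forwards and one pass backwards through the data."""
    f = recursive_filter(d, n, y)
    b = recursive_filter(d, n, y[::-1])[::-1]
    return f + b

phi = 0.5
n = np.array([1.0, -phi])           # N(z) = 1 - phi z
d = burman_split([1.0, 0.0], n)     # C(z) = 1

impulse = np.zeros(201)
impulse[100] = 1.0
x = two_pass(d, n, impulse)

# The two-sided filter 1/{N(z)N(1/z)} has the weights phi^|k| / (1 - phi^2).
k = np.arange(-5, 6)
expected = phi ** np.abs(k) / (1.0 - phi ** 2)
```

In this example, the sum of the forward and backward passes reproduces the symmetric two-sided weights of the filter around the impulse.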
Figure 18. The logarithmic consumption data overlaid by the estimated trend-cycle
component. The plot of the seasonally-adjusted data, which should adhere closely to
the trend-cycle trajectory, has been displaced downwards.

Figure 19. The squared gain of the seasonal adjustment filter derived from a model
of the logarithmic consumption data.

Figure 20. The component that is removed by the seasonal adjustment filter.
In order to compute either of these components, one needs some initial
conditions. Consider the recursion running backwards through the data, which
is associated with the equation
$$
N(z^{-1})b(z) = D(z^{-1})y(z).
\tag{128}
$$
This requires some starting values for both $b(t)$ and $y(t)$. The SEATS program
obtains these values by stepping outside the sample.
The post-sample values of $y(t)$ are generated in the usual way using a
recursion based upon the equation of the ARIMA model, which is
$$
\psi(L)y(t) = N(L)\varepsilon(t).
\tag{129}
$$
Here, the requisite post-sample elements of $\varepsilon(t)$ are represented by their
zero-valued expectations. The post-sample values of $b(t)$ are calculated by a clever
algorithm which was proposed to Burman by Granville Tunnicliffe–Wilson.
(Tunnicliffe–Wilson was responsible for writing the programs that accompanied
the original edition of the book of Box and Jenkins (1976); and he has
played a major role in the development of the computational algorithms of
modern time-series analysis.) The Burman–Wilson algorithm is expounded in
the appendix to Burman’s paper.
To initiate the recursion which generates the sequence $f(t)$, some pre-sample
values are found by a method analogous to the one that finds the post-sample
values.
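The generation of the post-sample values of $y(t)$ can be sketched as follows in Python with NumPy. This is an illustration under stated assumptions and not the SEATS code: the autoregressive operator is taken to be $(1 - L)(1 - L^{s})$ of the airline model, the pre-sample disturbances are set to zero in computing the in-sample disturbances, and the function name `airline_forecast` is hypothetical. The Burman–Wilson algorithm for the post-sample values of $b(t)$ is not attempted here.

```python
import numpy as np

def airline_forecast(y, s, rho, theta, h):
    """Extend y by h post-sample values using the airline model
    (1 - L)(1 - L^s) y(t) = (1 - rho L)(1 - theta L^s) eps(t),
    with the post-sample disturbances replaced by zeros."""
    y = np.asarray(y, dtype=float)
    T = y.size
    P = np.convolve([1.0, -1.0], np.r_[1.0, np.zeros(s - 1), -1.0])
    N = np.convolve([1.0, -rho], np.r_[1.0, np.zeros(s - 1), -theta])
    m = s + 1  # common degree of P and N

    # In-sample disturbances, with the pre-sample disturbances taken to be
    # zero (a simplifying assumption).
    eps = np.zeros(T + h)
    for t in range(m, T):
        w = sum(P[k] * y[t - k] for k in range(m + 1))
        eps[t] = w - sum(N[k] * eps[t - k] for k in range(1, m + 1))

    # Post-sample recursion: eps[t] stays at zero for t >= T.
    ext = np.concatenate([y, np.zeros(h)])
    for t in range(T, T + h):
        ma = sum(N[k] * eps[t - k] for k in range(1, m + 1))
        ar = sum(P[k] * ext[t - k] for k in range(1, m + 1))
        ext[t] = ma - ar
    return ext[T:]

# Artificial check (an assumption): a linear trend plus a fixed quarterly
# pattern is annihilated by (1 - L)(1 - L^4), so it is extrapolated exactly.
pattern = np.array([0.3, -0.1, -0.4, 0.2])
t = np.arange(48)
series = 1.0 + 0.5 * t + pattern[t % 4]
forecast = airline_forecast(series[:40], s=4, rho=0.1698, theta=0.6248, h=8)
```

The pre-sample values needed for the forward recursion could be obtained in the same manner by applying the function to the time-reversed data.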
Seasonal Adjustment in the Frequency Domain
The TRAMO–SEATS program generates an abundance of diagrams relating to the spectra or pseudospectra of the component models and to the
frequency responses of the associated ﬁlters. These diagrams are amongst the
end products of the analysis. However, there is no frequency analysis of the
data to guide the speciﬁcation of the ﬁlters. Instead, they are determined by
the component models that are derived from the aggregate ARIMA model that
describes the data.
In this section, we shall pursue a method of seasonal adjustment that
begins by looking at the periodogram of the detrended data. The detrending
is by means of a polynomial regression. The residual sequence from the linear
detrending of the logarithmic consumption data is shown in Figure 11 and the
corresponding periodogram is shown in Figures 10 and 22.
Figure 22 shows that the significant elements of the data fall within three
highlighted bands. The first band, which covers the frequency interval [0, π/8],
comprises the elements that constitute the low-frequency business cycle that
is represented by the heavy line in Figure 11. When the cycle is added to
the linear trend that is represented in Figure 9, the result is the trend–cycle
component that is shown in Figure 21.
The second highlighted band, which covers the interval [π/2 − 4π/T, π/2 +
4π/T], comprises five elements, which include two on either side of the seasonal
frequency of π/2. The third band, which covers the interval [π − 6π/T, π],
contains the harmonic of the seasonal frequency and three elements at adjacent
frequencies. The seasonal component, which is synthesised from the elements
in the second and third bands, is represented in Figure 23.

Figure 21. The trend-cycle component derived by adding the interpolated
polynomial to the low-frequency components of the residual sequence.

Figure 22. The periodogram of the residual sequence obtained from the linear
detrending of the logarithmic consumption data. The shaded bands in the vicinities
of π/2 and π contain the elements of the seasonal component.

Figure 23. The seasonal component, synthesised from Fourier ordinates in the
vicinities of the seasonal frequency and its harmonic.
In addition to showing the logarithmic data sequence and the interpolated
trend–cycle component, Figure 21 also shows a version of the seasonally-adjusted
data. This is represented by the line that has been displaced downwards. It has
been derived by subtracting the seasonal component from the data.
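The synthesis of a component from the Fourier ordinates that fall within chosen bands can be sketched as follows in Python with NumPy. This is an illustration and not the IDEOLOG implementation: the function name `synthesise_bands` is hypothetical, and the artificial sequence stands in for the detrended consumption data, which are not reproduced here.

```python
import numpy as np

def synthesise_bands(x, bands):
    """Rebuild the part of x carried by Fourier frequencies within the given bands.

    bands : list of (low, high) intervals in radians, within [0, pi].
    """
    x = np.asarray(x, dtype=float)
    T = x.size
    coeffs = np.fft.rfft(x)
    omega = 2.0 * np.pi * np.arange(coeffs.size) / T  # Fourier frequencies in [0, pi]
    keep = np.zeros(coeffs.size, dtype=bool)
    tol = 1e-9                                        # guards the band edges against rounding
    for low, high in bands:
        keep |= (omega >= low - tol) & (omega <= high + tol)
    return np.fft.irfft(np.where(keep, coeffs, 0.0), n=T)

# Artificial detrended sequence (an assumption): a slow cycle plus seasonal
# elements at the frequencies pi/2 and pi.
T = 160
t = np.arange(T)
cycle = 0.1 * np.sin(2.0 * np.pi * 3 * t / T)
seasonal_true = 0.05 * np.cos(np.pi * t / 2.0 + 0.3) + 0.02 * np.cos(np.pi * t)
resid = cycle + seasonal_true

seasonal_bands = [(np.pi / 2 - 4 * np.pi / T, np.pi / 2 + 4 * np.pi / T),
                  (np.pi - 6 * np.pi / T, np.pi)]
seasonal = synthesise_bands(resid, seasonal_bands)
adjusted = resid - seasonal                           # seasonally adjusted residual
business_cycle = synthesise_bands(resid, [(0.0, np.pi / 8)])
```

Because the artificial components sit exactly on Fourier frequencies, the separation here is exact. With real data, the elements adjacent to the seasonal frequencies are included in the bands in order to capture a seasonal pattern that evolves over the sample.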
A comparison of Figures 17–20, which relate to the ARIMA-model-based
filters, with Figures 21–23, which relate to the frequency-domain filters, shows
that, notwithstanding the marked differences in the alternative methodologies
of filtering, the results are virtually indistinguishable. This is a fortuitous
circumstance that is largely attributable to the nature of the data, which is
revealed by the periodogram.
On the strength of what is revealed by Figure 22, it can be asserted that an
ARIMA model misrepresents the data. The components of the detrended data
are conﬁned to bands that are separated by wide dead spaces in which there are
no elements of any signiﬁcant amplitudes. In contrast, the data generated by an
ARIMA process is bound to extend, without breaks, over the entire frequency
interval [0, π ], and there will be no dead spaces.
The nature of an ARIMA process is reflected in the gain of the
trend-extraction filter of the TRAMO–SEATS program, which is represented by
Figure 17. The filter allows the estimated trend to contain elements at all
frequencies, albeit that those at the highest frequencies are strongly attenuated. This
accords with the model of the trend, which is a random walk.
Disregarding the seasonal component, there are no elements in the data
beyond the frequency limit of π/8. Therefore, there is
no consequence in allowing such elements to pass through the filter; and its
effects are virtually the same as those of the corresponding frequency-domain
filter. If there were anything in the data beyond the limit that had not been
removed by the seasonal adjustment, then the effect of the filter would be to
produce a trend–cycle component with a profile roughened by the inclusion
of high-frequency noise. It would resemble a slightly smoother version of the
seasonally-adjusted data sequence.
The Programs
The programs that have been described in this chapter are freely available
from various sources. The H–P (Leser) filter and the Butterworth filter have
been implemented in the program IDEOLOG, as have the frequency-domain
filters. The program is available at the address
http://www.le.ac.uk/users/dsgp1/
The H–P and Butterworth filters are also available in the gretl (Gnu Regression,
Econometrics and Time-series Library) program, which can be downloaded
from the address
http://gretl.sourceforge.net/
The TRAMO–SEATS program, which implements the ARIMA-model-based filters,
is available from the Bank of Spain at the address
http://www.bde.es/webbde/en/secciones/servicio/software/programas.html
The program, which is freestanding, can also be hosted by gretl.
References
Box, G.E.P., and G.M. Jenkins, (1976), Time Series Analysis: Forecasting and
Control, Revised Edition, Holden Day, San Francisco.
Burman, J.P., (1980), Seasonal Adjustment by Signal Extraction, Journal of
the Royal Statistical Society, Series A, 143, 321–337.
Caporello, G., and A. Maravall, (2004), Program TSW, Revised Reference Manual, Servicio de Estudios, Banco de España.
Cogley, T., and J.M. Nason, (1995), Eﬀects of the Hodrick–Prescott Filter on
Trend and Diﬀerence Stationary Time Series, Implications for Business Cycle
Research, Journal of Economic Dynamics and Control, 19, 253–278.
Gray, R.M., (2002), Toeplitz and Circulant Matrices: A Review, Information
Systems Laboratory, Department of Electrical Engineering, Stanford University, California, http://ee.stanford.edu/gray/~toeplitz.pdf.
Hall, H.S., and S.R. Knight, (1899), Higher Algebra, Macmillan and Co., London.
Harvey, A.C., and A. Jaeger, (1993), Detrending, Stylised Facts and the Business Cycle, Journal of Applied Econometrics, 8, 231–247.
Hodrick, R.J., and E.C. Prescott, (1980), Postwar U.S. Business Cycles: An
Empirical Investigation, Working Paper, Carnegie–Mellon University, Pittsburgh, Pennsylvania.
Hillmer, S.C., and G.C. Tiao, (1982), An ARIMA-Model-Based Approach to
Seasonal Adjustment, Journal of the American Statistical Association, 77, 63–70.
Hodrick R.J., and E.C. Prescott, (1997), Postwar U.S. Business Cycles: An
Empirical Investigation, Journal of Money, Credit and Banking, 29, 1–16.
Jolly, L.B.W., (1961), Summation of Series: Second Revised Edition, Dover
Publications: New York.
Jury, E.I., (1964), Theory and Applications of the z-Transform Method, John
Wiley and Sons, New York.
Kolmogorov, A.N., (1941), Interpolation and Extrapolation, Bulletin de
l’Academie des Sciences de U.S.S.R., Ser. Math., 5, 3–14.
Ladiray, D., and B. Quenneville, (2001), Seasonal Adjustment with the X-11
Method, Springer Lecture Notes in Statistics 158, Springer Verlag, Berlin.
Leser, C.E.V., (1961), A Simple Method of Trend Construction, Journal of the
Royal Statistical Society, Series B, 23, 91–107.
Nyquist, H., (1928), Certain Topics in Telegraph Transmission Theory, AIEE
Transactions, Series B, 617–644.
Pollock, D.S.G., (1999), A Handbook of Time-Series Analysis, Signal Processing
and Dynamics, Academic Press, London.
Pollock, D.S.G., (2000), Trend Estimation and Detrending via Rational Square
Wave Filters, Journal of Econometrics, 99, 317–334.
Pollock, D.S.G., (2002), Circulant Matrices and Time-Series Analysis, The International Journal of Mathematical Education in Science and Technology, 33,
213–230.
Pollock, D.S.G., (2009), Realisations of Finite-sample Frequency-selective Filters, Journal of Statistical Planning and Inference, 139, 1541–1558.
Shannon, C.E., (1949a), Communication in the Presence of Noise, Proceedings
of the Institute of Radio Engineers, 37, 10–21. Reprinted in 1998, Proceedings
of the IEEE, 86, 447–457.
Shannon, C.E., (1949b), (reprinted 1998), The Mathematical Theory of Communication, University of Illinois Press, Urbana, Illinois.
Shiskin, J., A.H. Young, and J.C. Musgrave, (1967), The X-11 Variant of the
Census Method II Seasonal Adjustment, Technical Paper No. 15, Bureau of the
Census, U.S. Department of Commerce.
Whittaker, E.T., (1923), On a New Method of Graduations, Proceedings of the
Edinburgh Mathematical Society, 41, 63–75.
Wiener, N., (1941), Extrapolation, Interpolation and Smoothing of Stationary
Time Series. Report on the Services Research Project DIC6037. Published in
book form in 1949 by MIT Technology Press and John Wiley and Sons, New
York.