ELGARTEXT - FILTERING MACROECONOMIC DATA By D.S.G. Pollock...


FILTERING MACROECONOMIC DATA

By D.S.G. Pollock
University of Leicester
Email: stephen pollock@sigmapi.u-net.com

This chapter sets forth the theory of linear filtering together with an accompanying frequency-domain analysis. It employs the classical Wiener–Kolmogorov theory in describing some of the filters that are used by econometricians. This theory, which was developed originally in reference to stationary stochastic processes defined on a doubly-infinite index set, is adapted to cater to short nonstationary sequences. An alternative methodology of filtering is also described. This operates in the frequency domain, by altering the amplitudes of the trigonometrical functions that are the elements of the Fourier decomposition of the detrended data.

1. Introduction

The purpose of a filter is to remove unwanted components from a stream of data so as to enhance the clarity of the components of interest. In many engineering applications and in some econometric applications, there is a single component of interest, described as the signal, to which a component has been added that can be described as the noise.

A complete separation of the signal and the noise is possible only if they reside in separate frequency bands. If they reside in overlapping frequency bands, then their separation is bound to be tentative. The signal typically comprises elements of low frequency and the noise comprises elements of higher frequencies. Filters are, therefore, designed by engineers with reference to their frequency-selective properties.

In econometric applications, some additional components must be taken into account. The foremost of these is the trend, which may be defined as an underlying trajectory of the data that cannot be synthesised from trigonometrical functions alone. It is difficult to give a more specific definition, which may account for the wide variety of procedures that have been proposed for extracting trends from economic data.
A business cycle component might also be extracted from the data; but this is often found in combination with the trend. Another component that is commonly present, if it has not been removed already by the providers of the economic data, is a pattern of seasonal fluctuations. In this case, given that the fluctuations reside in limited frequency bands, it is easier to provide a specific definition of the seasonal component, albeit that there is still scope for alternative definitions.

Notwithstanding the ill-defined nature of these components, econometricians have tended to adopt particular models for the trend and for the seasonal fluctuations. The trend is commonly modelled by a first-order random walk with drift, which is an accumulation of a white-noise sequence of independently and identically distributed random variables. The drift occurs when the variables have a nonzero mean, with a positive mean giving rise to an upward drift. Second-order processes involving a double accumulation of white noise are also used to model the trend.

Econometricians commonly model the seasonal fluctuations by autoregressive moving-average processes in which the autoregressive operator contains complex roots with moduli of unity and with arguments that correspond to the fundamental seasonal frequency and to the harmonically related frequencies. The moving-average operator is usually instrumental in confining the effects of these roots to the vicinities of the seasonal frequencies.

Given a complete statistical specification of the processes generating the data, it is possible to derive the filters that provide the minimum mean-squared error estimates of the various components. This approach has been followed in the TRAMO–SEATS program of Caporello and Maravall (2005) for decomposing an econometric data sequence into its components. An account of their methods will be given in the penultimate section of this chapter.
The structural time series methodology that has been incorporated in the STAMP computer package of Koopman et al. (2000) follows a similar approach. The STAMP program employs the Kalman filter, which is discussed elsewhere in this Handbook in the chapter by Tommaso Proietti. This powerful and all-encompassing method is capable of dealing with nonstationary data processes, provided that there are models to describe them.

Whereas the model-based approach to filtering has led to some refined computer programs that can often be used automatically to process the data, there are circumstances in which a significant mismatch occurs between the data and the models. Then, some alternative methods must be pursued which can be adapted more readily to reflect the properties of the data. An aim of this chapter is to describe some methods that meet this requirement.

In this chapter, we shall also employ some statistical models of the processes underlying the data. However, these will be heuristic models rather than models that purport to be realistic. Their purpose is to enable us to derive filters that are endowed with whatever are the appropriate frequency-selective capabilities. Thus, the specifications of the resulting filters are to be determined flexibly in the light of the properties of the data.

In deriving these filters, we use an extension of the time-honoured Wiener–Kolmogorov principle, which is intended to provide minimum mean-squared error estimates of the components whenever these are truly described by the models. The original Wiener–Kolmogorov theory was based on the assumption that the data are generated by stationary stochastic processes. Therefore, we have to adapt the theory to cater to nonstationary processes.

An alternative methodology will also be described that approaches the matter of frequency selection in a direct manner that does not depend on any models of the data. The resulting procedures, which employ what may be described as frequency-domain filters, perform the essential operations upon the trigonometrical functions that are the elements of the Fourier decomposition of the detrended data.

An advantage of these filters is that they enable one to separate elements that are at adjacent frequencies. Such sharp divisions of the frequency contents of the data cannot be achieved by the time-domain filters, which operate directly on the data, without incurring severe problems of numerical instability.

Some mathematical results must be provided in order to support the analysis of filtering. Some of these results will be presented at the outset in the sections that follow this introduction. Other results will be dispersed throughout the text. We shall begin with some basic definitions.

Linear Time Invariant Filters

Whenever we form a linear combination of successive elements of a discrete-time signal $x(t) = \{x_t;\ t = 0, \pm 1, \pm 2, \ldots\}$, we are performing an operation that is described as linear filtering. In the case of a linear time-invariant filter, such an operation can be represented by the equation

$$ y(t) = \sum_j \psi_j x(t-j). \tag{1} $$

To assist in the algebraic manipulation of such equations, we may convert the data sequences $x(t)$ and $y(t)$ and the sequence of filter coefficients $\{\psi_j\}$ into power series or polynomials. By associating $z^t$ with each element $y_t$ and by summing the sequence, we get

$$ \sum_t y_t z^t = \sum_t \sum_j \psi_j x_{t-j} z^t \quad\text{or}\quad y(z) = \psi(z)x(z), \tag{2} $$

where

$$ x(z) = \sum_t x_t z^t, \quad y(z) = \sum_t y_t z^t \quad\text{and}\quad \psi(z) = \sum_j \psi_j z^j. \tag{3} $$

The convolution operation of equation (1) becomes an operation of polynomial multiplication in equation (2). We are liable to describe the $z$-transform $\psi(z)$ of the filter coefficients as the transfer function of the filter. For a treatise on the $z$-transform, see Jury (1964).

The Impulse Response

The sequence $\{\psi_j\}$ of the filter's coefficients constitutes its response, on the output side, to an input in the form of a unit impulse.
If the sequence is finite, then $\psi(z)$ is described as a moving-average filter or as a finite impulse-response (FIR) filter. When the filter produces an impulse response of an indefinite duration, it is called an infinite impulse-response (IIR) filter. The filter is said to be causal or backward-looking if none of its coefficients is associated with a negative power of $z$. In that case, the filter is available for real-time signal processing.

Causal Filters

A practical filter, which is constructed from a limited number of components of hardware or software, must be capable of being expressed in terms of a finite number of parameters. Therefore, linear IIR filters which are causal will invariably entail recursive equations of the form

$$ \sum_{j=0}^{p} \phi_j y_{t-j} = \sum_{j=0}^{q} \theta_j x_{t-j}, \quad\text{with}\quad \phi_0 = 1, \tag{4} $$

of which the $z$-transform is

$$ \phi(z)y(z) = \theta(z)x(z), \tag{5} $$

wherein $\phi(z) = \phi_0 + \phi_1 z + \cdots + \phi_p z^p$ and $\theta(z) = \theta_0 + \theta_1 z + \cdots + \theta_q z^q$ are finite-degree polynomials. The leading coefficient of $\phi(z)$ may be set to unity without loss of generality; and thus the output sequence $y(t)$ in equation (4) becomes a function not only of past and present inputs but also of past outputs, which are described as feedback.

The recursive equation may be assimilated to the equation under (2) by writing it in rational form:

$$ y(z) = \frac{\theta(z)}{\phi(z)}\,x(z) = \psi(z)x(z). \tag{6} $$

On the condition that the filter is stable, the expression $\psi(z)$ stands for the series expansion of the ratio of the polynomials.

The stability of a rational transfer function $\theta(z)/\phi(z)$ can be investigated via its partial-fraction decomposition, which gives rise to a sum of simpler transfer functions that can be analysed readily. If the degree of the numerator $\theta(z)$ exceeds that of the denominator $\phi(z)$, then long division can be used to obtain a quotient polynomial and a remainder that is a proper rational function.
The quotient polynomial will correspond to a stable transfer function; and the remainder will be the subject of the decomposition.

Assume that $\theta(z)/\phi(z)$ is a proper rational function in which the denominator is factorised as

$$ \phi(z) = \prod_{j=1}^{r} (1 - z/\lambda_j)^{n_j}, \tag{7} $$

where $n_j$ is the multiplicity of the root $\lambda_j$, and where $\sum_j n_j = p$ is the degree of the polynomial. Then, the so-called Heaviside partial-fraction decomposition is

$$ \frac{\theta(z)}{\phi(z)} = \sum_{j=1}^{r} \sum_{k=1}^{n_j} \frac{c_{jk}}{(1 - z/\lambda_j)^k}; \tag{8} $$

and the task is to find the series expansions of the partial fractions. The stability of the transfer function depends upon the convergence of these expansions. For this, the necessary and sufficient condition is that $|\lambda_j| > 1$ for all $j$, which is to say that all of the roots of the denominator polynomial must lie outside the unit circle in the complex plane.

Figure 1. The impulse response of the transfer function $\theta(z)/\phi(z)$ with $\phi(z) = 1.0 - 1.2728z + 0.81z^2$ and $\theta(z) = 1.0 + 0.75z$.

The expansions of a pair of partial fractions with conjugate complex roots will combine to produce a sinusoidal sequence. The expansion of a partial fraction containing a root of multiplicity $n$ will be equivalent to the $n$-fold auto-convolution of the expansion of a simple fraction containing the root.

It is helpful to represent the roots of the denominator polynomial, which are described as the poles of the transfer function, together with the roots of the numerator polynomial, which are described as the zeros, by showing their locations graphically within the complex plane. It is more convenient to represent the poles and zeros of $\theta(z^{-1})/\phi(z^{-1})$, which are the reciprocals of those of $\theta(z)/\phi(z)$. For a stable and invertible transfer function, these must lie within the unit circle. This recourse has been adopted for Figure 2, which shows the pole–zero diagram for the transfer function that gives rise to Figure 1.
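The stability condition can be checked numerically for the transfer function of Figure 1. The following is a minimal sketch that uses only the Python standard library; the helper names `quadratic_roots` and `impulse_response` are illustrative and are not taken from the chapter. The impulse response is generated by the usual recursion that results from equating the coefficients of the powers of $z$ in $\phi(z)\psi(z) = \theta(z)$.

```python
import cmath
import math

# Coefficients of the Figure 1 transfer function, in ascending powers of z.
phi = [1.0, -1.2728, 0.81]   # phi(z)   = 1 - 1.2728 z + 0.81 z^2
theta = [1.0, 0.75]          # theta(z) = 1 + 0.75 z


def quadratic_roots(c0, c1, c2):
    """Roots of c0 + c1*z + c2*z^2, assuming c2 is nonzero."""
    disc = cmath.sqrt(c1 * c1 - 4.0 * c2 * c0)
    return ((-c1 + disc) / (2.0 * c2), (-c1 - disc) / (2.0 * c2))


def impulse_response(theta, phi, n):
    """First n coefficients of the series expansion of theta(z)/phi(z)."""
    psi = []
    for k in range(n):
        acc = theta[k] if k < len(theta) else 0.0
        # Subtract the feedback terms phi_j * psi_{k-j} for j = 1, ..., min(k, p).
        for j in range(1, min(k, len(phi) - 1) + 1):
            acc -= phi[j] * psi[k - j]
        psi.append(acc / phi[0])
    return psi


roots = quadratic_roots(*phi)               # roots of phi(z)
poles = [1.0 / r for r in roots]            # poles of theta(1/z)/phi(1/z)
stable = all(abs(r) > 1.0 for r in roots)   # roots outside the unit circle
psi = impulse_response(theta, phi, 60)

print("stable:", stable)
print("pole moduli:", [round(abs(p), 4) for p in poles])
print("pole arguments / pi:", [round(cmath.phase(p) / math.pi, 4) for p in poles])
print("first coefficients:", [round(c, 4) for c in psi[:6]])
```

Because the roots of $\phi(z)$ lie outside the unit circle, the coefficients decay towards zero as the lag increases, which is what one expects of a damped sinusoidal impulse response.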
The Series Expansion of a Rational Transfer Function

The method of finding the coefficients of the series expansion can be illustrated by the second-order case:

$$ \frac{\theta_0 + \theta_1 z}{\phi_0 + \phi_1 z + \phi_2 z^2} = \psi_0 + \psi_1 z + \psi_2 z^2 + \cdots. \tag{9} $$

We rewrite this equation as

$$ \theta_0 + \theta_1 z = \left(\phi_0 + \phi_1 z + \phi_2 z^2\right)\left(\psi_0 + \psi_1 z + \psi_2 z^2 + \cdots\right). \tag{10} $$

Figure 2. The pole–zero diagram corresponding to the transfer function of Figure 1. The poles are conjugate complex numbers with arguments of $\pm\pi/4$ and with a modulus of 0.9. The single real-valued zero has the value of $-0.75$.

The following table assists us in multiplying together the two polynomials:

$$ \begin{array}{c|cccc}
 & \psi_0 & \psi_1 z & \psi_2 z^2 & \cdots \\ \hline
\phi_0 & \phi_0\psi_0 & \phi_0\psi_1 z & \phi_0\psi_2 z^2 & \cdots \\
\phi_1 z & \phi_1\psi_0 z & \phi_1\psi_1 z^2 & \phi_1\psi_2 z^3 & \cdots \\
\phi_2 z^2 & \phi_2\psi_0 z^2 & \phi_2\psi_1 z^3 & \phi_2\psi_2 z^4 & \cdots
\end{array} \tag{11} $$

By performing the multiplication on the RHS of equation (10), and by equating the coefficients of the same powers of $z$ on the two sides, we find that

$$ \begin{aligned}
\theta_0 &= \phi_0\psi_0, & \psi_0 &= \theta_0/\phi_0, \\
\theta_1 &= \phi_0\psi_1 + \phi_1\psi_0, & \psi_1 &= (\theta_1 - \phi_1\psi_0)/\phi_0, \\
0 &= \phi_0\psi_2 + \phi_1\psi_1 + \phi_2\psi_0, & \psi_2 &= -(\phi_1\psi_1 + \phi_2\psi_0)/\phi_0, \\
 &\ \ \vdots & &\ \ \vdots \\
0 &= \phi_0\psi_n + \phi_1\psi_{n-1} + \phi_2\psi_{n-2}, & \psi_n &= -(\phi_1\psi_{n-1} + \phi_2\psi_{n-2})/\phi_0.
\end{aligned} \tag{12} $$

Bi-directional (Noncausal) Filters

A two-sided symmetric filter in the form of

$$ \psi(z) = \theta(z^{-1})\theta(z) = \psi_0 + \psi_1(z^{-1} + z) + \cdots + \psi_m(z^{-m} + z^m) \tag{13} $$

is often employed in smoothing the data or in eliminating its seasonal components. The advantage of such a filter is the absence of a phase effect. That is to say, no delay is imposed on any of the components of the signal.

The so-called Cramér–Wold factorisation, which sets $\psi(z) = \theta(z^{-1})\theta(z)$, and which must be available for any properly-designed filter, provides a straightforward way of explaining the absence of a phase effect. The factorisation gives rise to two equations (i) $q(z) = \theta(z)y(z)$ and (ii) $x(z) = \theta(z^{-1})q(z)$.
Thus, the transformation of (1) can be broken down into two operations:

$$ \text{(i)}\quad q_t = \sum_j \theta_j y_{t-j} \quad\text{and}\quad \text{(ii)}\quad x_t = \sum_j \theta_j q_{t+j}. \tag{14} $$

The first operation, which runs in real time, imposes a time delay on every component of $y(t)$. The second operation, which works in reversed time, imposes an equivalent reverse-time delay on each component. The reverse-time delays, which are advances in other words, serve to eliminate the corresponding real-time delays.

If $\psi(z)$ corresponds to an FIR filter, then the processed sequence $x(t)$ may be generated via a single application of the two-sided filter $\psi(z)$ to the signal $y(t)$, or it may be generated in two operations via the successive applications of $\theta(z)$ to $y(z)$ and $\theta(z^{-1})$ to $q(z) = \theta(z)y(z)$. The question of which of these techniques has been used to generate $x(t)$ in a particular instance should be a matter of indifference.

The final species of linear filter that may be used in the processing of economic time series is a symmetric two-sided rational filter of the form

$$ \psi(z) = \frac{\theta(z^{-1})\theta(z)}{\phi(z^{-1})\phi(z)}. \tag{15} $$

Such a filter must, of necessity, be applied in two separate passes running forwards and backwards in time and described, respectively, by the equations

$$ \text{(i)}\quad \phi(z)q(z) = \theta(z)y(z) \quad\text{and}\quad \text{(ii)}\quad \phi(z^{-1})x(z) = \theta(z^{-1})q(z). \tag{16} $$

Such filters represent a most effective way of processing economic data in pursuance of a wide range of objectives.

The Response to a Sinusoidal Input

One must also consider the response of the transfer function to a simple sinusoidal signal. Any finite data sequence can be expressed as a sum of discretely sampled sine and cosine functions with frequencies that are integer multiples of a fundamental frequency that produces one cycle in the period spanned by the sequence. The finite sequence may be regarded as a single cycle within an infinite sequence, which is the periodic extension of the data.

Figure 3.
The Argand diagram showing a complex number $\lambda = \alpha + i\beta$ and its conjugate $\lambda^* = \alpha - i\beta$.

Consider, therefore, the consequences of mapping the perpetual signal sequence $\{x_t = \cos(\omega t)\}$ through the transfer function with the coefficients $\{\psi_0, \psi_1, \ldots\}$. The output is

$$ y(t) = \sum_j \psi_j \cos\bigl(\omega[t-j]\bigr). \tag{17} $$

By virtue of the trigonometrical identity $\cos(A - B) = \cos A\cos B + \sin A\sin B$, this becomes

$$ \begin{aligned}
y(t) &= \Bigl\{\sum_j \psi_j\cos(\omega j)\Bigr\}\cos(\omega t) + \Bigl\{\sum_j \psi_j\sin(\omega j)\Bigr\}\sin(\omega t) \\
&= \alpha\cos(\omega t) + \beta\sin(\omega t) = \rho\cos(\omega t - \theta),
\end{aligned} \tag{18} $$

in which $\alpha$ and $\beta$ denote the two sums in braces. Observe that using the trigonometrical identity to expand the final expression of (18) gives $\alpha = \rho\cos(\theta)$ and $\beta = \rho\sin(\theta)$. Therefore,

$$ \rho^2 = \alpha^2 + \beta^2 \quad\text{and}\quad \theta = \tan^{-1}\left(\frac{\beta}{\alpha}\right). \tag{19} $$

Also, if $\lambda = \alpha + i\beta$ and $\lambda^* = \alpha - i\beta$ are conjugate complex numbers, then $\rho$ would be their modulus. This is illustrated in Figure 3.

It can be seen, from (18), that the transfer function has a twofold effect upon the signal. First, there is a gain effect, whereby the amplitude of the sinusoid is increased or diminished by the factor $\rho$. Then, there is a phase effect, whereby the peak of the sinusoid is displaced by a time delay of $\theta/\omega$ periods. The frequency of the output is the same as the frequency of the input, which is a fundamental feature of all linear dynamic systems.

Figure 4. The values of the function $\cos\{(11/8)\pi t\}$ coincide with those of its alias $\cos\{(5/8)\pi t\}$ at the integer points $\{t = 0, \pm 1, \pm 2, \ldots\}$.

Observe that the response of the transfer function to a sinusoid of a particular frequency is akin to the response of a bell to a tuning fork. It gives very limited information regarding the characteristics of the system. To obtain full information, it is necessary to excite the system over a full range of frequencies.

Aliasing and the Shannon–Nyquist Sampling Theorem

In a discrete-time system, there is a problem of aliasing whereby signal frequencies (i.e.
angular velocities) in excess of $\pi$ radians per sampling interval are confounded with frequencies within the interval $[0, \pi]$. To understand this, consider a cosine wave of unit amplitude and zero phase with a frequency $\omega$ in the interval $\pi < \omega < 2\pi$ that is sampled at unit intervals. Let $\omega^* = 2\pi - \omega$. Then, at the integer values of $t$,

$$ \begin{aligned}
\cos(\omega t) &= \cos\bigl\{(2\pi - \omega^*)t\bigr\} \\
&= \cos(2\pi t)\cos(\omega^* t) + \sin(2\pi t)\sin(\omega^* t) \\
&= \cos(\omega^* t);
\end{aligned} \tag{20} $$

which indicates that $\omega$ and $\omega^*$ are observationally indistinguishable. Here, $\omega^* \in [0, \pi]$ is described as the alias of $\omega > \pi$.

The maximum frequency in discrete data is $\pi$ radians per sampling interval and, as the Shannon–Nyquist sampling theorem indicates, aliasing is avoided only if there are at least two observations in the time that it takes the signal element of highest frequency to complete a cycle. In that case, the discrete representation will contain all of the available information on the system.

The consequences of sampling at an insufficient rate are illustrated in Figure 4. Here, a rapidly alternating cosine function is mistaken for one of less than half the true frequency.

The sampling theorem is attributable to several people, but it is most commonly attributed to Shannon (1949, 1989), albeit that Nyquist (1928) discovered the essential results at an earlier date.

Figure 5. The spectral density function of the ARMA(2, 1) process $y(t) = 1.2728y(t-1) - 0.81y(t-2) + \varepsilon(t) + 0.75\varepsilon(t-1)$ with $V\{\varepsilon(t)\} = 1$.

The Frequency Response of a Linear Filter

The frequency response of a linear filter $\psi(z)$ is its response to the set of sinusoidal inputs of all frequencies $\omega$ that fall within the Nyquist interval $[0, \pi]$. This entails the squared gain of the filter, defined by

$$ \rho^2(\omega) = \psi_\alpha^2(\omega) + \psi_\beta^2(\omega), \tag{21} $$

where

$$ \psi_\alpha(\omega) = \sum_j \psi_j\cos(\omega j) \quad\text{and}\quad \psi_\beta(\omega) = \sum_j \psi_j\sin(\omega j), \tag{22} $$

and the phase displacement, defined by

$$ \theta(\omega) = \mathrm{Arg}\{\psi(\omega)\} = \tan^{-1}\{\psi_\beta(\omega)/\psi_\alpha(\omega)\}. \tag{23} $$
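The gain and the phase displacement of equations (21)–(23) can be evaluated directly for a filter with finitely many coefficients. The sketch below, which uses only the Python standard library, does this for a two-point moving average; that filter and the function names are hypothetical choices made for illustration. It also confirms that passing $\cos(\omega t)$ through the filter, as in equation (17), reproduces $\rho\cos(\omega t - \theta)$.

```python
import math


def gain_and_phase(psi, omega):
    """Return (rho, theta) of equations (21)-(23) for the causal filter psi."""
    alpha = sum(p * math.cos(omega * j) for j, p in enumerate(psi))
    beta = sum(p * math.sin(omega * j) for j, p in enumerate(psi))
    # atan2 evaluates tan^{-1}(beta/alpha) while selecting the correct quadrant.
    return math.hypot(alpha, beta), math.atan2(beta, alpha)


def filter_cosine(psi, omega, t):
    """y_t = sum_j psi_j cos(omega (t - j)), as in equation (17)."""
    return sum(p * math.cos(omega * (t - j)) for j, p in enumerate(psi))


psi = [0.5, 0.5]          # hypothetical two-point moving average
omega = math.pi / 3       # frequency of the input sinusoid

rho, theta = gain_and_phase(psi, omega)

# Largest discrepancy between the filtered cosine and rho*cos(omega*t - theta).
max_err = max(
    abs(filter_cosine(psi, omega, t) - rho * math.cos(omega * t - theta))
    for t in range(-20, 21)
)

print("gain:", rho)
print("phase:", theta, "implied delay in periods:", theta / omega)
print("max discrepancy:", max_err)
```

For this filter, the implied delay $\theta/\omega$ is half a period at every frequency below $\pi$, which is the delay that one would expect of an average of the current and the preceding observation.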
It is convenient to replace the trigonometrical functions of (22) by the complex exponential functions

$$ e^{i\omega j} = \cos(\omega j) + i\sin(\omega j) \quad\text{and}\quad e^{-i\omega j} = \cos(\omega j) - i\sin(\omega j), \tag{24} $$

which enable the trigonometrical functions to be expressed as

$$ \cos(\omega j) = \frac{1}{2}\{e^{i\omega j} + e^{-i\omega j}\} \quad\text{and}\quad \sin(\omega j) = \frac{i}{2}\{e^{-i\omega j} - e^{i\omega j}\}. \tag{25} $$

Setting $z = \exp\{-i\omega\}$ in $\psi(z)$ gives

$$ \psi(e^{-i\omega}) = \psi_\alpha(\omega) - i\psi_\beta(\omega), \tag{26} $$

which we shall write hereafter as $\psi(\omega) = \psi(e^{-i\omega})$. The squared gain of the filter, previously denoted by $\rho^2(\omega)$, is the square of the complex modulus:

$$ |\psi(\omega)|^2 = \psi_\alpha^2(\omega) + \psi_\beta^2(\omega), \tag{27} $$

which is obtained by setting $z = \exp\{-i\omega\}$ in $\psi(z^{-1})\psi(z)$.

The Spectrum of a Stationary Stochastic Process

Consider a stationary stochastic process $y(t) = \{y_t;\ t = 0, \pm 1, \pm 2, \ldots\}$ defined on a doubly-infinite index set. The generic element of the process can be expressed as $y_t = \sum_j \psi_j\varepsilon_{t-j}$, where $\varepsilon_t$ is an element of a sequence $\varepsilon(t)$ of independently and identically distributed random variables with $E(\varepsilon_t) = 0$ and $V(\varepsilon_t) = \sigma^2$ for all $t$.

The autocovariance generating function of the process is

$$ \sigma^2\psi(z^{-1})\psi(z) = \gamma(z) = \{\gamma_0 + \gamma_1(z^{-1} + z) + \gamma_2(z^{-2} + z^2) + \cdots\}. \tag{28} $$

The following table assists us in forming the product $\gamma(z) = \sigma^2\psi(z^{-1})\psi(z)$:

$$ \begin{array}{c|cccc}
 & \psi_0 & \psi_1 z & \psi_2 z^2 & \cdots \\ \hline
\psi_0 & \psi_0^2 & \psi_0\psi_1 z & \psi_0\psi_2 z^2 & \cdots \\
\psi_1 z^{-1} & \psi_1\psi_0 z^{-1} & \psi_1^2 & \psi_1\psi_2 z & \cdots \\
\psi_2 z^{-2} & \psi_2\psi_0 z^{-2} & \psi_2\psi_1 z^{-1} & \psi_2^2 & \cdots \\
\vdots & \vdots & \vdots & \vdots &
\end{array} \tag{29} $$

The autocovariances are obtained by summing along the NW–SE diagonals:

$$ \begin{aligned}
\gamma_0 &= \sigma^2\{\psi_0^2 + \psi_1^2 + \psi_2^2 + \psi_3^2 + \cdots\}, \\
\gamma_1 &= \sigma^2\{\psi_0\psi_1 + \psi_1\psi_2 + \psi_2\psi_3 + \cdots\}, \\
\gamma_2 &= \sigma^2\{\psi_0\psi_2 + \psi_1\psi_3 + \psi_2\psi_4 + \cdots\}, \\
&\ \ \vdots
\end{aligned} \tag{30} $$

By setting $z = \exp\{-i\omega\}$ and dividing by $2\pi$, we get the spectral density function, or spectrum, of the process:

$$ f(\omega) = \frac{1}{2\pi}\Bigl\{\gamma_0 + 2\sum_{\tau=1}^{\infty}\gamma_\tau\cos(\omega\tau)\Bigr\}. \tag{31} $$

This entails the cosine Fourier transform of the sequence of autocovariances.
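For a process with a rational transfer function, the spectrum can be computed without forming the autocovariances, since (28) and (31) together imply that $f(\omega) = \sigma^2|\psi(\omega)|^2/2\pi$. The following minimal sketch, which uses only the Python standard library, evaluates this expression on a grid for the transfer function of Figure 1 with a unit innovation variance; the grid size and the function names are arbitrary choices made for illustration.

```python
import cmath
import math

phi = [1.0, -1.2728, 0.81]   # phi(z)   = 1 - 1.2728 z + 0.81 z^2
theta = [1.0, 0.75]          # theta(z) = 1 + 0.75 z
sigma2 = 1.0                 # variance of the white-noise innovations


def poly_at(coeffs, z):
    """Evaluate a polynomial with coefficients in ascending powers of z."""
    return sum(c * z ** j for j, c in enumerate(coeffs))


def spectrum(omega):
    """f(omega) = sigma^2 |theta(z)/phi(z)|^2 / (2 pi) at z = exp(-i omega)."""
    z = cmath.exp(-1j * omega)
    psi = poly_at(theta, z) / poly_at(phi, z)
    return sigma2 * abs(psi) ** 2 / (2.0 * math.pi)


# Evaluate the spectrum on an evenly spaced grid over the interval [0, pi].
grid = [math.pi * k / 1000 for k in range(1001)]
values = [spectrum(w) for w in grid]
peak_omega = grid[values.index(max(values))]

print("peak frequency / pi:", peak_omega / math.pi)
print("peak value:", max(values))
```

The peak of the computed spectrum lies close to $\pi/4$, which is the argument of the poles of the transfer function.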
The spectral density function of an ARMA(2, 1) process, which incorporates the transfer function of Figures 1–3, is shown in Figure 5.

Wiener–Kolmogorov Filtering of Stationary Sequences

The classical theory of linear filtering was formulated independently by Norbert Wiener (1941) and Andrei Nikolaevich Kolmogorov (1941) during the Second World War. They were both considering the problem of how to target radar-assisted anti-aircraft guns on incoming enemy aircraft.

The purpose of a Wiener–Kolmogorov (W–K) filter is to extract an estimate of a signal sequence $\xi(t)$ from an observable data sequence

$$ y(t) = \xi(t) + \eta(t), \tag{32} $$

which is afflicted by the noise $\eta(t)$. According to the classical assumptions, which we shall later amend in order to accommodate short nonstationary sequences, the signal and the noise are generated by zero-mean stationary stochastic processes that are mutually independent. Also, the assumption is made that the data constitute a doubly-infinite sequence. It follows that the autocovariance generating function of the data is the sum of the autocovariance generating functions of its two components. Thus,

$$ \gamma^{yy}(z) = \gamma^{\xi\xi}(z) + \gamma^{\eta\eta}(z) \quad\text{and}\quad \gamma^{\xi\xi}(z) = \gamma^{y\xi}(z). \tag{33} $$

These functions are amenable to the so-called Cramér–Wold factorisation, and they may be written as

$$ \gamma^{yy}(z) = \phi(z^{-1})\phi(z), \quad \gamma^{\xi\xi}(z) = \theta(z^{-1})\theta(z), \quad \gamma^{\eta\eta}(z) = \theta_\eta(z^{-1})\theta_\eta(z). \tag{34} $$

The estimate $x_t$ of the signal element $\xi_t$, generated by a linear time-invariant filter, is a linear combination of the elements of the data sequence:

$$ x_t = \sum_j \psi_j y_{t-j}. \tag{35} $$

The principle of minimum mean-squared error estimation indicates that the estimation errors must be statistically uncorrelated with the elements of the information set. Thus, the following condition applies for all $k$:

$$ \begin{aligned}
0 &= E\bigl\{y_{t-k}(\xi_t - x_t)\bigr\} \\
&= E(y_{t-k}\xi_t) - \sum_j \psi_j E(y_{t-k}y_{t-j}) \\
&= \gamma_k^{y\xi} - \sum_j \psi_j\gamma_{k-j}^{yy}.
\end{aligned} \tag{36} $$

The equation may be expressed, in terms of the $z$-transforms, as

$$ \gamma^{y\xi}(z) = \psi(z)\gamma^{yy}(z). \tag{37} $$

It follows that

$$ \psi(z) = \frac{\gamma^{y\xi}(z)}{\gamma^{yy}(z)} = \frac{\gamma^{\xi\xi}(z)}{\gamma^{\xi\xi}(z) + \gamma^{\eta\eta}(z)} = \frac{\theta(z^{-1})\theta(z)}{\phi(z^{-1})\phi(z)}. \tag{38} $$

Now, by setting $z = \exp\{-i\omega\}$, one can derive the frequency-response function of the filter that is used in estimating the signal $\xi(t)$. The effect of the filter is to multiply each of the frequency elements of $y(t)$ by the fraction of its variance that is attributable to the signal. The same principle applies to the estimation of the residual or noise component. This is obtained using the complementary filter

$$ \psi^c(z) = 1 - \psi(z) = \frac{\gamma^{\eta\eta}(z)}{\gamma^{\xi\xi}(z) + \gamma^{\eta\eta}(z)}. \tag{39} $$

The estimated signal component may be obtained by filtering the data in two passes according to the following equations:

$$ \phi(z)q(z) = \theta(z)y(z), \qquad \phi(z^{-1})x(z) = \theta(z^{-1})q(z). \tag{40} $$

The first equation relates to a process that runs forwards in time to generate the elements of an intermediate sequence, represented by the coefficients of $q(z)$. The second equation represents a process that runs backwards to deliver the estimates of the signal, represented by the coefficients of $x(z)$.

The Hodrick–Prescott (Leser) Filter and the Butterworth Filter

The Wiener–Kolmogorov methodology can be applied to nonstationary data with minor adaptations. A model of the processes underlying the data can be adopted that has the form of

$$ \begin{aligned}
\nabla^d(z)y(z) &= \nabla^d(z)\{\xi(z) + \eta(z)\} = \delta(z) + \kappa(z) \\
&= (1+z)^n\zeta(z) + (1-z)^m\varepsilon(z),
\end{aligned} \tag{41} $$

where $\zeta(z)$ and $\varepsilon(z)$ are the $z$-transforms of two independent white-noise sequences $\zeta(t)$ and $\varepsilon(t)$ and where $\nabla(z) = 1 - z$ is the $z$-transform of the difference operator.
The model of $y(t) = \xi(t) + \eta(t)$ entails a pair of statistically independent stochastic processes, which are defined over the doubly-infinite sequence of integers and of which the $z$-transforms are

$$ \xi(z) = \frac{(1+z)^n}{\nabla^d(z)}\zeta(z) \quad\text{and}\quad \eta(z) = \frac{(1-z)^m}{\nabla^d(z)}\varepsilon(z). \tag{42} $$

The condition $m \ge d$ is necessary to ensure the stationarity of $\eta(t)$, which is obtained from $\varepsilon(t)$ by differencing $m - d$ times.

It must be conceded that a nonstationary process such as $\xi(t)$ is a mathematical construct of doubtful reality, since its values will be unbounded, almost certainly. Nevertheless, to deal in these terms is to avoid the complexities of the finite-sample approach, which will be the subject of the next section.

The filter that is applied to $y(t)$ to estimate $\xi(t)$, which is the $d$-fold integral of $\delta(t)$, takes the form of

$$ \psi(z) = \frac{\sigma_\zeta^2(1+z^{-1})^n(1+z)^n}{\sigma_\zeta^2(1+z^{-1})^n(1+z)^n + \sigma_\varepsilon^2(1-z^{-1})^m(1-z)^m}, \tag{43} $$

regardless of the degree $d$ of differencing that would be necessary to reduce $y(t)$ to stationarity.

Two special cases are of interest. By setting $d = m = 2$ and $n = 0$ in (41), a model is obtained of a second-order random walk $\xi(t)$ affected by white-noise errors of observation $\eta(t) = \varepsilon(t)$. The resulting lowpass W–K filter, in the form of

$$ \psi(z) = \frac{1}{1 + \lambda(1-z^{-1})^2(1-z)^2} \quad\text{with}\quad \lambda = \frac{\sigma_\eta^2}{\sigma_\delta^2}, \tag{44} $$

is the Hodrick–Prescott (H–P) filter. The complementary highpass filter, which generates the residue, is

$$ \psi^c(z) = \frac{(1-z^{-1})^2(1-z)^2}{\lambda^{-1} + (1-z^{-1})^2(1-z)^2}. \tag{45} $$

Here, $\lambda$, which is described as the smoothing parameter, is the single adjustable parameter of the filter.

By setting $m = n$, a filter for estimating $\xi(t)$ is obtained that takes the form of

$$ \begin{aligned}
\psi(z) &= \frac{\sigma_\zeta^2(1+z^{-1})^n(1+z)^n}{\sigma_\zeta^2(1+z^{-1})^n(1+z)^n + \sigma_\varepsilon^2(1-z^{-1})^n(1-z)^n} \\
&= \frac{1}{1 + \lambda\left(i\,\dfrac{1-z}{1+z}\right)^{2n}} \quad\text{with}\quad \lambda = \frac{\sigma_\varepsilon^2}{\sigma_\zeta^2}.
\end{aligned} \tag{46} $$

This is the formula for the Butterworth lowpass digital filter.
The filter has two adjustable parameters, and, therefore, it is a more flexible device than the H–P filter. First, there is the parameter $\lambda$. This can be expressed as

$$ \lambda = \{1/\tan(\omega_d/2)\}^{2n}, \tag{47} $$

where $\omega_d$ is the nominal cut-off point of the filter, which is the mid point in the transition of the filter's frequency response from its pass band to its stop band. The second of the adjustable parameters is $n$, which denotes the order of the filter. As $n$ increases, the transition between the pass band and the stop band becomes more abrupt.

Figure 6. The gain of the Hodrick–Prescott lowpass filter with a smoothing parameter set to 100, 1,600 and 14,400.

These filters can be applied to the nonstationary data sequence $y(t)$ in the bidirectional manner indicated by equation (40), provided that the appropriate initial conditions are supplied with which to start the recursions. However, by concentrating on the estimation of the residual sequence $\eta(t)$, which corresponds to a stationary process, it is possible to avoid the need for nonzero initial conditions. Then, the estimate of $\eta(t)$ can be subtracted from $y(t)$ to obtain the estimate of $\xi(t)$.

The H–P filter has been used as a lowpass smoothing filter in numerous macroeconomic investigations, where it has been customary to set the smoothing parameter to certain conventional values. Thus, for example, the econometric computer package Eviews 4.0 (2000) imposes the following default values:

$$ \lambda = \begin{cases} 100 & \text{for annual data,} \\ 1{,}600 & \text{for quarterly data,} \\ 14{,}400 & \text{for monthly data.} \end{cases} $$

Figure 6 shows the squared gain of the filter corresponding to these values. The innermost curve corresponds to $\lambda = 14{,}400$ and the outermost curve to $\lambda = 100$. Whereas they have become conventional, these values are arbitrary.
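The effect of the smoothing parameter can be examined by evaluating the frequency response of the lowpass filter of (44) on the unit circle, where $(1-z^{-1})(1-z) = 2 - 2\cos\omega$ when $z = \exp\{-i\omega\}$. The following is a minimal sketch that uses only the Python standard library; the function names are illustrative, and the half-gain frequency is offered merely as one convenient summary of where the transition of the filter occurs.

```python
import math


def hp_lowpass_gain(omega, lam):
    """Frequency response of the lowpass filter (44) at z = exp(-i*omega)."""
    d = 2.0 - 2.0 * math.cos(omega)   # value of (1 - 1/z)(1 - z) on the unit circle
    return 1.0 / (1.0 + lam * d * d)


def hp_half_gain_frequency(lam):
    """Frequency at which the lowpass gain equals 1/2 (requires lam >= 1/16)."""
    # Setting lam * d^2 = 1 gives d = lam^(-1/2), whence cos(omega) = 1 - d/2.
    d = lam ** -0.5
    return math.acos(1.0 - d / 2.0)


for lam in (100, 1600, 14400):
    w_half = hp_half_gain_frequency(lam)
    print(
        "lambda =", lam,
        "| half-gain frequency / pi =", round(w_half / math.pi, 4),
        "| gain at pi/8 =", round(hp_lowpass_gain(math.pi / 8, lam), 4),
    )
```

The response is real-valued and non-negative, so the filter imposes no phase displacement; and a larger value of $\lambda$ moves the transition towards zero frequency.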
The filter should be adapted to the purpose of isolating the component of interest; and the appropriate filter parameters need to be determined in the light of the spectral structure of the component, such as has been revealed in Figure 10, in the case of the U.K. consumption data.

It will be observed that an H–P filter with $\lambda = 1{,}600$, which defines the middle curve in Figure 6, will not be effective in isolating the low-frequency component of the quarterly consumption data of Figure 9, which lies in the interval $[0, \pi/8]$. The curve will cut through the low-frequency spectral structure that is represented in Figure 10; and the effect will be greatly to attenuate some of the elements of the component that should be preserved intact.

Figure 7. The squared gain of the lowpass Butterworth filters of orders $n = 6$ and $n = 12$ with a nominal cut-off point of $2\pi/3$ radians.

Lowering the value of $\lambda$ in order to admit a wider range of frequencies will have the effect of creating a frequency response with a gradual transition from the pass band to the stop band. This will be equally inappropriate to the purpose of isolating a component within a well-defined frequency band. For that purpose, a different filter is required.

A filter that may be appropriate to the purpose of isolating the low-frequency fluctuations in consumption is the Butterworth filter. The squared gain of the latter is illustrated in Figure 7. In this case, there is a well-defined nominal cut-off frequency, which is at the mid point of the transition from the pass band to the stop band. The transition becomes more rapid as the filter order $n$ increases. If a perfectly sharp transition is required, then the frequency-domain filter that will be presented later should be employed.

The Hodrick–Prescott filter has many antecedents.
Its invention cannot reasonably be attributed to Hodrick and Prescott (1980, 1997), who cited Whittaker (1923) as one of their sources. Leser (1961) also provided a complete derivation of the filter at an earlier date. The analogue Butterworth filter is a commonplace of electrical engineering. The digital version has been described by Pollock (2000).

Wiener–Kolmogorov Filters for Finite Sequences

The classical Wiener–Kolmogorov theory can be adapted to finite data sequences generated by stationary stochastic processes. Consider a data vector $y = [y_0, y_1, \ldots, y_{T-1}]'$ that has a signal component $\xi$ and a noise component $\eta$:

$$y = \xi + \eta. \tag{48}$$

The two components are assumed to be independently normally distributed with zero means and with positive-definite dispersion matrices. Then,

$$E(\xi) = 0, \quad D(\xi) = \Omega_\xi, \qquad E(\eta) = 0, \quad D(\eta) = \Omega_\eta, \qquad \text{and} \quad C(\xi, \eta) = 0. \tag{49}$$

The dispersion matrices $\Omega_\xi$ and $\Omega_\eta$ may be obtained from the autocovariance generating functions $\gamma_\xi(z)$ and $\gamma_\eta(z)$, respectively, by replacing $z$ by the matrix argument $L_T = [e_1, e_2, \ldots, e_{T-1}, 0]$, which is the finite-sample version of the lag operator. This is obtained from the identity matrix $I_T = [e_0, e_1, e_2, \ldots, e_{T-1}]$ by deleting the leading column and by appending a zero vector to the end of the array. Negative powers of $z$ are replaced by powers of the forwards shift operator $F_T = L_T'$. A consequence of the independence of $\xi$ and $\eta$ is that $D(y) = \Omega_\xi + \Omega_\eta$.

We may begin by considering the determination of the vector of the $T$ filter coefficients $\psi_{t\cdot}' = [\psi_{t,0}, \psi_{t,1}, \ldots, \psi_{t,T-1}]$ that determine $x_t$, which is the $t$th element of the filtered vector $x = [x_0, x_1, \ldots, x_{T-1}]'$ and which is the estimate of $\xi_t$. This is derived from the data in $y = [y_0, y_1, \ldots, y_{T-1}]'$ via the equation

$$x_t = \sum_{j=-t}^{T-1-t} \psi_{t,t+j}\, y_{t-j}. \tag{50}$$

The principle of minimum mean-squared error estimation continues to indicate that the estimation errors must be statistically uncorrelated with the elements of the information set. Thus

$$0 = E\{y_{t-k}(\xi_t - x_t)\} = E(y_{t-k}\xi_t) - \sum_{j=-t}^{T-1-t} \psi_{t,t+j}\, E(y_{t-k}y_{t-j}) = \gamma^{\xi\xi}_{-k} - \sum_{j=-t}^{T-1-t} \psi_{t,t+j}\, \gamma^{yy}_{j-k}. \tag{51}$$

Here, $E(y_{t-k}\xi_t) = \gamma^{y\xi}_{-k} = \gamma^{\xi\xi}_{-k}$ in accordance with (33). Equation (51) can be rendered also in a matrix format. By running from $k = -t$ to $k = T-1-t$, and observing that $\gamma^{\xi\xi}_{-k} = \gamma^{\xi\xi}_{k}$, we get the following system:

$$
\begin{bmatrix} \gamma^{\xi\xi}_{t} \\ \gamma^{\xi\xi}_{t-1} \\ \vdots \\ \gamma^{\xi\xi}_{T-1-t} \end{bmatrix}
=
\begin{bmatrix}
\gamma^{yy}_0 & \gamma^{yy}_1 & \cdots & \gamma^{yy}_{T-1} \\
\gamma^{yy}_1 & \gamma^{yy}_0 & \cdots & \gamma^{yy}_{T-2} \\
\vdots & \vdots & \ddots & \vdots \\
\gamma^{yy}_{T-1} & \gamma^{yy}_{T-2} & \cdots & \gamma^{yy}_0
\end{bmatrix}
\begin{bmatrix} \psi_{t,0} \\ \psi_{t,1} \\ \vdots \\ \psi_{t,T-1} \end{bmatrix}.
\tag{52}
$$

This equation can be written in summary notation as $\Omega_\xi e_t = \Omega_y \psi_{t\cdot}$, where $e_t$ is a vector of order $T$ containing a single unit preceded by $t$ zeros and followed by $T-1-t$ zeros. The coefficient vector $\psi_{t\cdot}$ is given by

$$\psi_{t\cdot}' = e_t' \Omega_\xi \Omega_y^{-1} = e_t' \Omega_\xi (\Omega_\xi + \Omega_\eta)^{-1}, \tag{53}$$

and the estimate of $\xi_t$ is $x_t = \psi_{t\cdot}' y$. The estimate of the complete vector $\xi = [\xi_0, \xi_1, \ldots, \xi_{T-1}]'$ of the signal elements is

$$x = \Omega_\xi \Omega_y^{-1} y = \Omega_\xi (\Omega_\xi + \Omega_\eta)^{-1} y. \tag{54}$$

The Estimates as Conditional Expectations

The linear estimates of (54) have the status of conditional expectations when the vectors $\xi$ and $y$ are normally distributed. As such, they are, unequivocally, the optimal minimum mean-squared error predictors of the signal and the noise components:

$$E(\xi|y) = E(\xi) + C(\xi, y)D^{-1}(y)\{y - E(y)\} = \Omega_\xi(\Omega_\xi + \Omega_\eta)^{-1} y = x, \tag{55}$$

$$E(\eta|y) = E(\eta) + C(\eta, y)D^{-1}(y)\{y - E(y)\} = \Omega_\eta(\Omega_\xi + \Omega_\eta)^{-1} y = h. \tag{56}$$

The corresponding error dispersion matrices, from which confidence intervals for the estimated components may be derived, are

$$D(\xi|y) = D(\xi) - C(\xi, y)D^{-1}(y)C(y, \xi) = \Omega_\xi - \Omega_\xi(\Omega_\xi + \Omega_\eta)^{-1}\Omega_\xi, \tag{57}$$

$$D(\eta|y) = D(\eta) - C(\eta, y)D^{-1}(y)C(y, \eta) = \Omega_\eta - \Omega_\eta(\Omega_\xi + \Omega_\eta)^{-1}\Omega_\eta. \tag{58}$$
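A worked scalar case may help to fix ideas. Assuming, purely for illustration, a sample of a single element ($T = 1$) with scalar variances $\sigma_\xi^2$ and $\sigma_\eta^2$ in place of the dispersion matrices, the estimate of (54) and the error dispersion of (57) reduce to the following expressions:

```latex
x = \frac{\sigma_\xi^2}{\sigma_\xi^2 + \sigma_\eta^2}\, y,
\qquad
D(\xi|y) = \sigma_\xi^2 - \frac{\sigma_\xi^4}{\sigma_\xi^2 + \sigma_\eta^2}
         = \frac{\sigma_\xi^2\,\sigma_\eta^2}{\sigma_\xi^2 + \sigma_\eta^2}.
```

The weight on $y$ lies between zero and one, and the conditional dispersion is smaller than either of the two variances, which is what one would expect of a minimum mean-squared error estimate.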
The Least-Squares Derivation of the Estimates

The estimates of $\xi$ and $\eta$, which have been denoted by $x$ and $h$ respectively, can also be derived according to the following criterion:

$$\text{Minimise} \quad S(\xi, \eta) = \xi'\Omega_\xi^{-1}\xi + \eta'\Omega_\eta^{-1}\eta \quad \text{subject to} \quad \xi + \eta = y. \tag{59}$$

Since $S(\xi, \eta)$ is the exponent of the normal joint density function $N(\xi, \eta)$, the resulting estimates may be described, alternatively, as the minimum chi-square estimates or as the maximum-likelihood estimates.

Substituting for $\eta = y - \xi$ gives the concentrated criterion function $S(\xi) = \xi'\Omega_\xi^{-1}\xi + (y - \xi)'\Omega_\eta^{-1}(y - \xi)$. Differentiating this function in respect of $\xi$ and setting the result to zero gives the following condition of minimisation: $0 = \Omega_\xi^{-1}x - \Omega_\eta^{-1}(y - x)$. From this, it follows that $y = (\Omega_\xi + \Omega_\eta)\Omega_\xi^{-1}x$. Therefore, the solution for $x$ is

$$x = \Omega_\xi(\Omega_\xi + \Omega_\eta)^{-1} y. \tag{60}$$

Moreover, since the roles of $\xi$ and $\eta$ are interchangeable in this exercise, and since $h + x = y$, there are also

$$h = \Omega_\eta(\Omega_\xi + \Omega_\eta)^{-1} y \quad \text{and} \quad x = y - \Omega_\eta(\Omega_\xi + \Omega_\eta)^{-1} y. \tag{61}$$

The filter matrices $\Psi_\xi = \Omega_\xi(\Omega_\xi + \Omega_\eta)^{-1}$ and $\Psi_\eta = \Omega_\eta(\Omega_\xi + \Omega_\eta)^{-1}$ of (60) and (61) are the matrix analogues of the $z$-transforms displayed in equations (38) and (39).

Figure 8. The squared gain of the difference operator, which has a zero at zero frequency, and the squared gain of the summation operator, which is unbounded at zero frequency.

A simple procedure for calculating the estimates $x$ and $h$ begins by solving the equation

$$(\Omega_\xi + \Omega_\eta)b = y \tag{62}$$

for the value of $b$. Thereafter, one can generate

$$x = \Omega_\xi b \quad \text{and} \quad h = \Omega_\eta b. \tag{63}$$

If $\Omega_\xi$ and $\Omega_\eta$ correspond to the narrow-band dispersion matrices of moving-average processes, then the solution to equation (62) may be found via a Cholesky factorisation that sets $\Omega_\xi + \Omega_\eta = GG'$, where $G$ is a lower-triangular matrix with a limited number of nonzero bands. The system $GG'b = y$ may be cast in the form of $Gp = y$ and solved for $p$. Then, $G'b = p$ can be solved for $b$.
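The procedure of (62) and (63) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the signal is given an arbitrary first-order moving-average dispersion matrix, the noise is white, and the data are made up. No attempt is made to exploit the band structure of G for speed.

```python
import math

def cholesky(a):
    """Lower-triangular G such that a = G G', for a symmetric positive-definite a."""
    n = len(a)
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(g[i][k] * g[j][k] for k in range(j))
            g[i][j] = math.sqrt(s) if i == j else s / g[j][j]
    return g

def forward_solve(g, y):
    """Solve G p = y by forward substitution."""
    p = []
    for i, row in enumerate(g):
        p.append((y[i] - sum(row[k] * p[k] for k in range(i))) / row[i])
    return p

def backward_solve(g, p):
    """Solve G' b = p by back substitution."""
    n = len(g)
    b = [0.0] * n
    for i in reversed(range(n)):
        b[i] = (p[i] - sum(g[k][i] * b[k] for k in range(i + 1, n))) / g[i][i]
    return b

def matvec(a, v):
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]

def toeplitz(gammas, n):
    """Symmetric Toeplitz matrix from autocovariances gamma_0, gamma_1, ... (zero beyond)."""
    return [[gammas[abs(i - j)] if abs(i - j) < len(gammas) else 0.0
             for j in range(n)] for i in range(n)]

def wk_decompose(omega_xi, omega_eta, y):
    """Return (x, h) of (63) after solving (62) via the Cholesky factor."""
    n = len(y)
    total = [[omega_xi[i][j] + omega_eta[i][j] for j in range(n)] for i in range(n)]
    g = cholesky(total)
    b = backward_solve(g, forward_solve(g, y))
    return matvec(omega_xi, b), matvec(omega_eta, b)

# Hypothetical inputs chosen only for illustration.
T = 6
omega_xi = toeplitz([2.0, 0.9], T)   # MA(1) signal: gamma_0 = 2.0, gamma_1 = 0.9
omega_eta = toeplitz([1.0], T)       # white noise with unit variance
y = [0.3, 1.1, 0.8, -0.4, -1.0, 0.2]
x, h = wk_decompose(omega_xi, omega_eta, y)
```

By construction, the two estimates add up to the data, since x + h = (Ωξ + Ωη)b = y. With a tridiagonal sum of dispersion matrices, the factor G has only two nonzero bands, which is the property that a production implementation would exploit.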
The Difference and Summation Operators

A simple expedient for eliminating the trend from the data sequence $y(t) = \{y_t;\ t = 0, \pm 1, \pm 2, \ldots\}$ is to replace the sequence by its differenced version $y(t) - y(t-1)$ or by its twice-differenced version $y(t) - 2y(t-1) + y(t-2)$. Differences of higher orders are rare. The $z$-transform of the difference is $(1 - z)y(z) = y(z) - zy(z)$. On defining the operator $\nabla(z) = 1 - z$, the second differences can be expressed as $\nabla^2(z)y(z) = (1 - 2z + z^2)y(z)$.

The inverse of the difference operator is the summation operator

$$\Sigma(z) = (1 - z)^{-1} = \{1 + z + z^2 + \cdots\}. \tag{64}$$

The $z$-transform of the $d$-fold summation operator is as follows:

$$\Sigma^d(z) = \frac{1}{(1 - z)^d} = 1 + dz + \frac{d(d+1)}{2!}z^2 + \frac{d(d+1)(d+2)}{3!}z^3 + \cdots. \tag{65}$$

The difference operator has a powerful effect upon the data. It nullifies the trend and it severely attenuates the elements of the data that are adjacent in frequency to the zero frequency of the trend. It also amplifies the high-frequency elements of the data. The effect is apparent in Figure 8, which shows the squared gain of the difference operator. The figure also shows the squared gain of the summation operator, which gives unbounded power to the elements that have frequencies in the vicinity of zero.

In dealing with a finite sequence, it is appropriate to consider a matrix version of the difference operator. In the case of a sample of $T$ elements comprised by the vector $y = [y_0, y_1, \ldots, y_{T-1}]'$, it is appropriate to use the matrix difference operator $\nabla(L_T) = I_T - L_T$, which is obtained by replacing $z$ within $\nabla(z) = 1 - z$ by the matrix argument $L_T = [e_1, e_2, \ldots, e_{T-1}, 0]$, which is obtained from the identity matrix $I_T = [e_0, e_1, e_2, \ldots, e_{T-1}]$ by deleting the leading column and by appending a zero vector to the end of the array.
Examples of the first-order and second-order matrix difference operators are as follows:

$$
\nabla_4 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 0 & -1 & 1 \end{bmatrix},
\qquad
\nabla_4^2 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ -2 & 1 & 0 & 0 \\ 1 & -2 & 1 & 0 \\ 0 & 1 & -2 & 1 \end{bmatrix}.
\tag{66}
$$

The corresponding inverse matrices are

$$
\Sigma_4 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 1 \end{bmatrix},
\qquad
\Sigma_4^2 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 2 & 1 & 0 & 0 \\ 3 & 2 & 1 & 0 \\ 4 & 3 & 2 & 1 \end{bmatrix}.
\tag{67}
$$

It will be seen that the elements of the leading vectors of these matrices are the coefficients associated with the expansion of $\Sigma^d(z)$ of (65) for the cases of $d = 1$ and $d = 2$. The same will be true for higher orders of $d$.

Polynomial Interpolation

The first $p$ columns of the matrix $\Sigma_T^p$ provide a basis of the set of polynomials of degree $p - 1$ defined on the set of integers $t = 0, 1, 2, \ldots, T-1$. An example is provided by the first three columns of the matrix $\Sigma_4^3$, which may be transformed as follows:

$$
\begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 6 & 3 & 1 \\ 10 & 6 & 3 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 1 \\ -2 & -1 & 1 \\ 1 & 0 & 0 \end{bmatrix}
=
\begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \end{bmatrix}.
\tag{68}
$$

The first column of the matrix on the LHS contains the ordinates of the quadratic function $(t^2 + t)/2$. The columns of the transformed matrix are recognisably the ordinates of the powers $t^0$, $t^1$ and $t^2$ corresponding to the integers $t = 1, 2, 3, 4$. The natural extension of the matrix to $T$ rows provides a basis for the quadratic functions $q(t) = at^2 + bt + c$ defined on $T$ consecutive integers.

The matrix of the powers of the integers is notoriously ill-conditioned. In calculating polynomial regressions of any degree in excess of the cubic, it is advisable to employ a basis of orthogonal polynomials, for which purpose some specialised numerical procedures are available. (See Pollock 1999.) In the present context, which concerns econometric data sequences, the degrees of differencing and summation rarely exceed two. Nevertheless, it is appropriate to consider the algebra of the general case.
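The matrices of (66) to (68) are small enough to be checked by direct computation. The following sketch, in which all of the function names are hypothetical, forms the difference matrix by repeated multiplication of I − L and forms the summation matrix directly from the series coefficients of (65), which are the binomial numbers C(d + k − 1, k).

```python
from math import comb

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def lag_matrix(n):
    # L_T = [e_1, ..., e_{T-1}, 0]: units on the first subdiagonal.
    return [[int(i == j + 1) for j in range(n)] for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def difference_matrix(n, d):
    # (I_T - L_T)^d, formed by repeated multiplication.
    base = [[u - v for u, v in zip(row_i, row_l)]
            for row_i, row_l in zip(identity(n), lag_matrix(n))]
    result = identity(n)
    for _ in range(d):
        result = matmul(result, base)
    return result

def summation_matrix(n, d):
    # Lower-triangular Toeplitz matrix whose leading column holds the
    # coefficients of (1 - z)^(-d) from (65).
    return [[comb(d + i - j - 1, i - j) if i >= j else 0 for j in range(n)]
            for i in range(n)]

# Equation (68): the first three columns of the third-order summation matrix,
# transformed into the ordinates of t^0, t^1 and t^2 for t = 1, 2, 3, 4.
basis = [row[:3] for row in summation_matrix(4, 3)]
transform = [[1, 1, 1], [-2, -1, 1], [1, 0, 0]]
powers = matmul(basis, transform)

for row in difference_matrix(4, 2):
    print(row)
for row in powers:
    print(row)
```

Because the arithmetic is in integers, the product of a difference matrix and the summation matrix of the same order can be compared with the identity matrix exactly.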
Consider, therefore, the matrix that takes the $p$-th difference of a vector of order $T$, which is

$$\nabla_T^p = (I - L_T)^p. \tag{69}$$

This matrix can be partitioned so that $\nabla_T^p = [Q_*, Q]'$, where $Q_*'$ has $p$ rows. If $y$ is a vector of $T$ elements, then

$$\nabla_T^p y = \begin{bmatrix} Q_*' \\ Q' \end{bmatrix} y = \begin{bmatrix} g_* \\ g \end{bmatrix}; \tag{70}$$

and $g_*$ is liable to be discarded, whereas $g$ will be regarded as the vector of the $p$-th differences of the data.

The inverse matrix may be partitioned conformably to give $\nabla_T^{-p} = [S_*, S]$. It follows that

$$\begin{bmatrix} S_* & S \end{bmatrix} \begin{bmatrix} Q_*' \\ Q' \end{bmatrix} = S_*Q_*' + SQ' = I_T, \tag{71}$$

and that

$$\begin{bmatrix} Q_*' \\ Q' \end{bmatrix} \begin{bmatrix} S_* & S \end{bmatrix} = \begin{bmatrix} Q_*'S_* & Q_*'S \\ Q'S_* & Q'S \end{bmatrix} = \begin{bmatrix} I_p & 0 \\ 0 & I_{T-p} \end{bmatrix}. \tag{72}$$

If $g_*$ is available, then $y$ can be recovered from $g$ via

$$y = S_*g_* + Sg. \tag{73}$$

Since the submatrix $S_*$ provides a basis for all polynomials of degree $p - 1$ that are defined on the integer points $t = 0, 1, \ldots, T-1$, it follows that $S_*g_* = S_*Q_*'y$ contains the ordinates of a polynomial of degree $p - 1$, which is interpolated through the first $p$ elements of $y$, indexed by $t = 0, 1, \ldots, p-1$, and which is extrapolated over the remaining integers $t = p, p+1, \ldots, T-1$.

A polynomial that is designed to fit the data should take account of all of the observations in $y$. Imagine, therefore, that $y = \phi + \eta$, where $\phi$ contains the ordinates of a polynomial of degree $p - 1$ and $\eta$ is a disturbance term with $E(\eta) = 0$ and $D(\eta) = \sigma_\eta^2 I_T$. Then, in forming an estimate $x = S_*r_*$ of $\phi$, we should minimise the sum of squares $\eta'\eta$. Since the polynomial is fully determined by the elements of a starting-value vector $r_*$, this is a matter of minimising

$$(y - x)'(y - x) = (y - S_*r_*)'(y - S_*r_*) \tag{74}$$

with respect to $r_*$. The resulting values are

$$r_* = (S_*'S_*)^{-1}S_*'y \quad \text{and} \quad x = S_*(S_*'S_*)^{-1}S_*'y. \tag{75}$$

An alternative representation of the estimated polynomial is available. This is provided by the identity

$$S_*(S_*'S_*)^{-1}S_*' = I - Q(Q'Q)^{-1}Q'. \tag{76}$$
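Before turning to the algebra, the identity (76) can be confirmed numerically for a particular case. The sketch below is an illustration only: the helper names are hypothetical, the case of T = 6 and p = 2 is an arbitrary choice, and exact rational arithmetic from the Python standard library is used so that the two sides can be compared without rounding error.

```python
from fractions import Fraction
from math import comb

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def inverse(a):
    """Gauss-Jordan inversion in exact rational arithmetic."""
    n = len(a)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def projector(a):
    """The matrix a (a'a)^(-1) a' that projects onto the column space of a."""
    at = transpose(a)
    return matmul(matmul(a, inverse(matmul(at, a))), at)

def partitions(T, p):
    """Q (T x (T-p)) from the p-th difference matrix and S_* (T x p) from its inverse."""
    nabla = [[(-1) ** (i - j) * comb(p, i - j) if 0 <= i - j <= p else 0
              for j in range(T)] for i in range(T)]
    sigma = [[comb(p + i - j - 1, i - j) if i >= j else 0
              for j in range(T)] for i in range(T)]
    q = transpose(nabla[p:])             # the rows of Q' are the last T - p rows
    s_star = [row[:p] for row in sigma]  # the first p columns of the inverse
    return q, s_star

T, p = 6, 2
Q, S_star = partitions(T, p)
P_S = projector(S_star)   # LHS of (76)
P_Q = projector(Q)        # Q (Q'Q)^(-1) Q'
total = [[P_S[i][j] + P_Q[i][j] for j in range(T)] for i in range(T)]
```

In this case, the sum of the two projectors is the identity matrix of order six, which is the content of (76). Applying P_S to the ordinates of any straight line returns those ordinates unchanged.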
To prove this identity, consider the fact that $Z = [Q, S_*]$ is a square matrix of full rank and that $Q$ and $S_*$ are mutually orthogonal such that $Q'S_* = 0$. Then

$$
Z(Z'Z)^{-1}Z' = \begin{bmatrix} Q & S_* \end{bmatrix}
\begin{bmatrix} (Q'Q)^{-1} & 0 \\ 0 & (S_*'S_*)^{-1} \end{bmatrix}
\begin{bmatrix} Q' \\ S_*' \end{bmatrix}
= Q(Q'Q)^{-1}Q' + S_*(S_*'S_*)^{-1}S_*'.
\tag{77}
$$

The result of (76) follows from the fact that $Z(Z'Z)^{-1}Z' = Z(Z^{-1}Z'^{-1})Z' = I$. It follows from (76) that the vector of the ordinates of the polynomial regression is also given by

$$x = y - Q(Q'Q)^{-1}Q'y. \tag{78}$$

Polynomial Regression and Trend Extraction

The use of polynomial regression in a preliminary detrending of the data is an essential part of a strategy for determining an appropriate representation of the underlying trajectory of an econometric data sequence. Once the trend has been eliminated from the data, one can proceed to assess their spectral structure by examining the periodogram of the residual sequence. Often the periodogram will reveal the existence of a cut-off frequency that bounds a low-frequency trend/cycle component and separates it from the remaining elements of the spectrum.

Figure 9. The quarterly series of the logarithms of consumption in the U.K., for the years 1955 to 1994, together with a linear trend interpolated by least-squares regression.

Figure 10. The periodogram of the residual sequence obtained from the linear detrending of the logarithmic consumption data.

An example is given in Figures 9 and 10. Figure 9 represents the logarithms of the quarterly data on aggregate consumption in the United Kingdom for the years 1955 to 1994. Through these data, a linear trend has been interpolated by least-squares regression. This line establishes a benchmark of constant exponential growth, against which the fluctuations of consumption can be measured. The periodogram of the residual sequence is plotted in Figure 10.
This shows that the low-frequency structure is bounded by a frequency value of π/8. This value can be used in specifying the appropriate filter for extracting the low-frequency trajectory of the data.

Filters for Short Trended Sequences

One way of eliminating the trend is to take differences of the data. Usually, twofold differencing is appropriate. The matrix analogue of the second-order backwards difference operator in the case of $T = 5$ is given by

$$
\nabla_5^2 = \begin{bmatrix} Q_*' \\ Q' \end{bmatrix} =
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 \\
-2 & 1 & 0 & 0 & 0 \\
1 & -2 & 1 & 0 & 0 \\
0 & 1 & -2 & 1 & 0 \\
0 & 0 & 1 & -2 & 1
\end{bmatrix}.
\tag{79}
$$

The first two rows, which do not produce true differences, are liable to be discarded. In general, the $p$-fold differences of a data vector of $T$ elements will be obtained by premultiplying it by a matrix $Q'$ of order $(T - p) \times T$. Applying $Q'$ to the equation $y = \xi + \eta$, representing the trended data, gives

$$Q'y = Q'\xi + Q'\eta = \delta + \kappa = g. \tag{80}$$

The vectors of the expectations and the dispersion matrices of the differenced vectors are

$$E(\delta) = 0, \quad D(\delta) = \Omega_\delta = Q'D(\xi)Q, \qquad E(\kappa) = 0, \quad D(\kappa) = \Omega_\kappa = Q'D(\eta)Q. \tag{81}$$

The difficulty of estimating the trended vector $\xi = y - \eta$ directly is that some starting values or initial conditions are required in order to define the value at time $t = 0$. However, since $\eta$ is from a stationary mean-zero process, it requires only zero-valued initial conditions. Therefore, the starting-value problem can be circumvented by concentrating on the estimation of $\eta$. The conditional expectation of $\eta$, given the differenced data $g = Q'y$, is provided by the formula

$$h = E(\eta|g) = E(\eta) + C(\eta, g)D^{-1}(g)\{g - E(g)\} = C(\eta, g)D^{-1}(g)g, \tag{82}$$

where the second equality follows in view of the zero-valued expectations. Within this expression, there are

$$D(g) = \Omega_\delta + Q'\Omega_\eta Q \quad \text{and} \quad C(\eta, g) = \Omega_\eta Q. \tag{83}$$

Putting these details into (82) gives the following estimate of $\eta$:

$$h = \Omega_\eta Q(\Omega_\delta + Q'\Omega_\eta Q)^{-1}Q'y. \tag{84}$$
Putting this into the equation

$$x = E(\xi|g) = y - E(\eta|g) = y - h \tag{85}$$

gives

$$x = y - \Omega_\eta Q(\Omega_\delta + Q'\Omega_\eta Q)^{-1}Q'y. \tag{86}$$

The Least-Squares Derivation of the Filter

As in the case of the extraction of a signal from a stationary process, the estimate of the trended vector $\xi$ can also be derived according to a least-squares criterion. The criterion is

$$\text{Minimise} \quad (y - \xi)'\Omega_\eta^{-1}(y - \xi) + \xi'Q\Omega_\delta^{-1}Q'\xi. \tag{87}$$

The first term in this expression penalises the departures of the resulting curve from the data, whereas the second term imposes a penalty for a lack of smoothness. Differentiating the function with respect to $\xi$ and setting the result to zero gives

$$\Omega_\eta^{-1}(y - x) = Q\Omega_\delta^{-1}Q'x = Q\Omega_\delta^{-1}d, \tag{88}$$

where $x$ stands for the estimated value of $\xi$ and where $d = Q'x$. Premultiplying by $Q'\Omega_\eta$ gives

$$Q'(y - x) = Q'y - d = Q'\Omega_\eta Q\Omega_\delta^{-1}d, \tag{89}$$

whence

$$Q'y = d + Q'\Omega_\eta Q\Omega_\delta^{-1}d = (\Omega_\delta + Q'\Omega_\eta Q)\Omega_\delta^{-1}d, \tag{90}$$

which gives

$$\Omega_\delta^{-1}d = (\Omega_\delta + Q'\Omega_\eta Q)^{-1}Q'y. \tag{91}$$

Putting this into

$$x = y - \Omega_\eta Q\Omega_\delta^{-1}d, \tag{92}$$

which comes from premultiplying (88) by $\Omega_\eta$, gives

$$x = y - \Omega_\eta Q(\Omega_\delta + Q'\Omega_\eta Q)^{-1}Q'y, \tag{93}$$

which is equation (86) again.

One should observe that

$$\Omega_\eta Q(\Omega_\delta + Q'\Omega_\eta Q)^{-1}Q'y = \Omega_\eta Q(\Omega_\delta + Q'\Omega_\eta Q)^{-1}Q'e, \tag{94}$$

where $e = Q(Q'Q)^{-1}Q'y$ is the vector of residuals obtained by interpolating a straight line through the data by a least-squares regression. That is to say, it makes no difference to the estimate of the component that is complementary to the trend whether the filter is applied to the data vector $y$ or to the residual vector $e$. If the trend-estimation filter is applied to $e$ instead of to $y$, then the resulting vector can be added to the ordinates of the interpolated line to create the estimate of the trend.

The Leser (H–P) Filter and the Butterworth Filter

The specific cases that have been considered in the context of the classical form of the Wiener–Kolmogorov filter can now be adapted to the circumstances of short trended sequences.
First, there is the Leser or H–P filter. This is derived by setting

$$D(\eta) = \Omega_\eta = \sigma_\eta^2 I, \qquad D(\delta) = \Omega_\delta = \sigma_\delta^2 I \qquad \text{and} \qquad \lambda = \frac{\sigma_\eta^2}{\sigma_\delta^2} \tag{95}$$

within (93) to give

$$x = y - Q(\lambda^{-1}I + Q'Q)^{-1}Q'y. \tag{96}$$

Here, $\lambda$ is the so-called smoothing parameter. It will be observed that, as $\lambda \to \infty$, the vector $x$ tends to that of a linear function interpolated into the data by least-squares regression, which is represented by equation (78). The matrix expression $\Psi = I - Q(\lambda^{-1}I + Q'Q)^{-1}Q'$ for the filter can be compared to the polynomial expression $\psi^c(z) = 1 - \psi(z)$ of the classical formulation, which entails the $z$-transform from (45).

The Butterworth filter that is appropriate to short trended sequences can be represented by the equation

$$x = y - \lambda\Sigma Q(M + \lambda Q'\Sigma Q)^{-1}Q'y. \tag{97}$$

Here, the matrices

$$\Sigma = \{2I_T - (L_T + L_T')\}^{n-2} \quad \text{and} \quad M = \{2I_T + (L_T + L_T')\}^n \tag{98}$$

are obtained from the RHS of the equations $\{(1 - z)(1 - z^{-1})\}^{n-2} = \{2 - (z + z^{-1})\}^{n-2}$ and $\{(1 + z)(1 + z^{-1})\}^n = \{2 + (z + z^{-1})\}^n$, respectively, by replacing $z$ by $L_T$ and $z^{-1}$ by $L_T'$. Observe that the equalities no longer hold after the replacements. However, it can be verified that

$$Q'\Sigma Q = \{2I_T - (L_T + L_T')\}^n. \tag{99}$$

Figure 11. The residual sequence from fitting a linear trend to the logarithmic consumption data with an interpolated line representing the business cycle, obtained by the frequency-domain method.

Filtering in the Frequency Domain

The method of Wiener–Kolmogorov filtering can also be implemented using the circulant dispersion matrices that are given by

$$\Omega_\xi^\circ = \bar{U}\gamma_\xi(D)U, \qquad \Omega_\eta^\circ = \bar{U}\gamma_\eta(D)U \qquad \text{and} \qquad \Omega^\circ = \Omega_\xi^\circ + \Omega_\eta^\circ = \bar{U}\{\gamma_\xi(D) + \gamma_\eta(D)\}U, \tag{100}$$

wherein the diagonal matrices $\gamma_\xi(D)$ and $\gamma_\eta(D)$ contain the ordinates of the spectral density functions of the component processes. Accounts of the algebra of circulant matrices have been provided by Pollock (1999 and 2002). See, also, Gray (2002).

Here, $U = T^{-1/2}[W^{jt}]$, wherein $t, j = 0, \ldots, T-1$, is the matrix of the Fourier transform, of which the generic element in the $j$th row and $t$th column is $W^{jt} = \exp(-i2\pi tj/T)$, and $\bar{U} = T^{-1/2}[W^{-jt}]$ is its conjugate transpose. Also, $D = \mathrm{diag}\{1, W, W^2, \ldots, W^{T-1}\}$, which replaces $z$ within each of the autocovariance generating functions, is a diagonal matrix whose elements are the $T$ roots of unity, which are found on the circumference of the unit circle in the complex plane.

By replacing the dispersion matrices within (55) and (56) by their circulant counterparts, we derive the following formulae:

$$x = \bar{U}\gamma_\xi(D)\{\gamma_\xi(D) + \gamma_\eta(D)\}^{-1}Uy = P_\xi y, \tag{101}$$

$$h = \bar{U}\gamma_\eta(D)\{\gamma_\xi(D) + \gamma_\eta(D)\}^{-1}Uy = P_\eta y. \tag{102}$$

Similar replacements within the formulae (57) and (58) provide the expressions for the error dispersion matrices that are appropriate to the circular filters.

The filtering formulae may be implemented in the following way. First, a Fourier transform is applied to the data vector $y$ to give $Uy$, which resides in the frequency domain. Then, the elements of the transformed vector are multiplied by those of the diagonal weighting matrices $J_\xi = \gamma_\xi(D)\{\gamma_\xi(D) + \gamma_\eta(D)\}^{-1}$ and $J_\eta = \gamma_\eta(D)\{\gamma_\xi(D) + \gamma_\eta(D)\}^{-1}$. Finally, the products are carried back into the time domain by the inverse Fourier transform, which is represented by the matrix $\bar{U}$.

An example of the method of frequency filtering is provided by Figure 11, which shows the effect of applying a filter with a sharp cut-off at the frequency value of π/8 radians per period to the residual sequence obtained from a linear detrending of the quarterly logarithmic consumption data of the U.K. This cut-off frequency has been chosen in reference to the periodogram of the residual sequence, which is in Figure 10. This shows that the low-frequency structure of the data falls in the interval [0, π/8].
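The three steps of transforming, weighting and transforming back can be sketched as follows. The sketch makes some assumptions that are not in the text: the function names are hypothetical, the transform is computed directly from its definition rather than by a fast algorithm, and the weights are those of an ideal filter with a sharp cut-off, taking the values one and zero, in place of the ratios of spectral ordinates within the weighting matrices. The test sequence is synthetic.

```python
import cmath
import math

def dft(y):
    """U y: the unitary discrete Fourier transform with W = exp(-i 2 pi / T)."""
    T = len(y)
    return [sum(y[t] * cmath.exp(-2j * math.pi * j * t / T) for t in range(T))
            / math.sqrt(T) for j in range(T)]

def inverse_dft(z):
    """U-bar z: the conjugate transpose of the transform, which inverts it."""
    T = len(z)
    return [sum(z[j] * cmath.exp(2j * math.pi * j * t / T) for j in range(T))
            / math.sqrt(T) for t in range(T)]

def sharp_lowpass_weights(T, cutoff):
    """Unit weights for the frequencies up to the cut-off, zeros elsewhere.

    The ordinate j corresponds to the frequency 2 pi j / T. Frequencies above
    pi are the mirror images of those below it, so the weights are symmetric.
    """
    weights = []
    for j in range(T):
        omega = 2.0 * math.pi * j / T
        omega = min(omega, 2.0 * math.pi - omega)
        weights.append(1.0 if omega <= cutoff else 0.0)
    return weights

def frequency_filter(y, weights):
    """Transform, weight the Fourier ordinates, and return to the time domain."""
    z = dft(y)
    filtered = inverse_dft([w * zj for w, zj in zip(weights, z)])
    return [v.real for v in filtered]

# Synthetic data: a slow cycle at pi/16 plus a fast cycle at pi/2.
T = 64
slow = [math.cos(2.0 * math.pi * 2 * t / T) for t in range(T)]
fast = [0.5 * math.sin(2.0 * math.pi * 16 * t / T) for t in range(T)]
y = [s + f for s, f in zip(slow, fast)]

x = frequency_filter(y, sharp_lowpass_weights(T, math.pi / 8))  # lowpass output
h = [yt - xt for yt, xt in zip(y, x)]                           # complement
```

With a cut-off at π/8, the lowpass output recovers the slow cycle and the complement recovers the fast one, up to rounding error. The symmetry of the weights about π is what keeps the filtered sequence real-valued.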
Apart from the prominent spike at the seasonal frequency of π/2 and the smaller seasonal spike at the frequency of π, the remainder of the periodogram is characterised by wide spectral dead spaces.

The filters described above are appropriate only to stationary processes. However, they can be adapted in several alternative ways to cater to nonstationary processes. One way is to reduce the data to stationarity by twofold differencing before filtering it. After filtering, the data may be reinflated by a process of summation.

As before, let the original data be denoted by $y = \xi + \eta$ and let the differenced data be $g = Q'y = \delta + \kappa$. If the estimates of $\delta = Q'\xi$ and $\kappa = Q'\eta$ are denoted by $d$ and $k$ respectively, then the estimates of $\xi$ and $\eta$ will be

$$x = S_*d_* + Sd \quad \text{where} \quad d_* = (S_*'S_*)^{-1}S_*'(y - Sd) \tag{103}$$

and

$$h = S_*k_* + Sk \quad \text{where} \quad k_* = -(S_*'S_*)^{-1}S_*'Sk. \tag{104}$$

Here, $d_*$ and $k_*$ are the initial conditions that are obtained via the minimisation of the function

$$(y - x)'(y - x) = (y - S_*d_* - Sd)'(y - S_*d_* - Sd) = (S_*k_* + Sk)'(S_*k_* + Sk) = h'h. \tag{105}$$

The minimisation ensures that the estimated trend $x$ adheres as closely as possible to the data $y$.

In the case where the data are differenced twice, there is

$$
S_* = \begin{bmatrix} 1 & 0 \\ 2 & 1 \\ \vdots & \vdots \\ T-1 & T-2 \\ T & T-1 \end{bmatrix}.
\tag{106}
$$

The elements of the matrix $S_*'S_*$ can be found via the formulae

$$\sum_{t=1}^{T} t^2 = \frac{1}{6}T(T+1)(2T+1) \quad \text{and} \quad \sum_{t=1}^{T} t(t-1) = \frac{1}{6}T(T+1)(2T+1) - \frac{1}{2}T(T+1). \tag{107}$$

A compendium of such results has been provided by Jolly (1961), and proofs of the present results were given by Hall and Knight (1899). A fuller account of the implementation of the frequency filter has been provided by Pollock (2009).

Example. Before applying a frequency-domain filter, it is necessary to ensure that the data are free of trend. If a trend is detected, then it may be removed from the data by subtracting an interpolated polynomial trend function.
A test for the presence of a trend is required that differs from the tests that are used to detect the presence of unit roots in the processes generating the data. This is provided by the significance test associated with the ordinary least-squares estimate of a linear trend.

There is a simple means of calculating the adjusted sum of squares of the temporal index $t = 0, 1, \ldots, T-1$, which is entailed in the calculation of the slope coefficient

$$b = \frac{\sum t\,y_t - (\sum t)(\sum y_t)/T}{\sum t^2 - (\sum t)^2/T}. \tag{108}$$

The formulae

$$\sum_{t=0}^{T-1} t^2 = \frac{1}{6}(T-1)T(2T-1) \quad \text{and} \quad \sum_{t=0}^{T-1} t = \frac{T(T-1)}{2} \tag{109}$$

are combined to provide a convenient means of calculating the denominator of the formula of (108):

$$\sum_{t=0}^{T-1} t^2 - \frac{\left(\sum_{t=0}^{T-1} t\right)^2}{T} = \frac{(T-1)T(T+1)}{12}. \tag{110}$$

Another means of calculating the low-frequency trajectory of the data via the frequency domain mimics the method of equation (93) by concentrating on the estimation of the high-frequency component. This can be subtracted from the data to create an estimate of the complementary low-frequency trend component. However, whereas, in the case of equation (93), the differencing of the data and the reinflation of the estimated high-frequency component are deemed to take place in the time domain, now the reinflation occurs in the frequency domain before the resulting vector of Fourier coefficients is transformed to the time domain.

The reduction of a trended data sequence to stationarity continues to be effected by the matrix $Q'$ but, in this case, the matrix can be seen in the context of a centralised difference operator. This is

$$N(z) = z^{-1} - 2 + z = z^{-1}(1 - z)^2 = z^{-1}\nabla^2(z). \tag{111}$$

The matrix version of the operator is obtained by setting $z = L_T$ and $z^{-1} = L_T'$, which gives

$$N(L_T) = N_T = L_T - 2I_T + L_T'. \tag{112}$$

The first and the final rows of this matrix do not deliver true differences.
Therefore, they are liable to be deleted, with the effect that the two end points are lost from the twice-differenced data. Deleting the rows $e_0'N_T$ and $e_{T-1}'N_T$ from $N_T$ gives the matrix $Q'$, which can also be obtained from $\nabla_T^2 = (I_T - L_T)^2$ by deleting the matrix $Q_*'$, which comprises the first two rows $e_0'\nabla_T^2$ and $e_1'\nabla_T^2$. In the case of $T = 5$ there is

$$
N_5 = \begin{bmatrix} Q_{-1}' \\ Q' \\ Q_{+1}' \end{bmatrix} =
\begin{bmatrix}
-2 & 1 & 0 & 0 & 0 \\
1 & -2 & 1 & 0 & 0 \\
0 & 1 & -2 & 1 & 0 \\
0 & 0 & 1 & -2 & 1 \\
0 & 0 & 0 & 1 & -2
\end{bmatrix}.
\tag{113}
$$

On deleting the first and last elements of the vector $N_T y$, which are $Q_{-1}'y = e_1'\nabla_T^2 y$ and $Q_{+1}'y$, respectively, we get $Q'y = [q_1, \ldots, q_{T-2}]'$.

The loss of the two elements from either end of the (centrally) twice-differenced data can be overcome by supplementing the original data vector $y$ with two extrapolated end points $y_{-1}$ and $y_T$. Alternatively, the differenced data may be supplemented by attributing appropriate values to $q_0$ and $q_{T-1}$. These could be zeros or some combination of the adjacent values. In either case, we will obtain a vector of order $T$ denoted by $q = [q_0, q_1, \ldots, q_{T-1}]'$.

In describing the method for implementing a highpass filter, let $\Lambda$ be the matrix that selects the appropriate ordinates of the Fourier transform $\gamma = Uq$ of the twice-differenced data. These ordinates must be reinflated to compensate for the differencing operation, which has the frequency response

$$f(\omega) = 2 - 2\cos(\omega). \tag{114}$$

The response of the anti-differencing operation is $1/f(\omega)$; and $\gamma$ is reinflated by premultiplying by the diagonal matrix

$$V = \mathrm{diag}\{v_0, v_1, \ldots, v_{T-1}\}, \tag{115}$$

comprising the values $v_j = 1/f(\omega_j)$; $j = 0, \ldots, T-1$, where $\omega_j = 2\pi j/T$.

Let $H = V\Lambda$ be the matrix that is applied to $\gamma = Uq$ to generate the Fourier ordinates of the filtered vector. The resulting vector is transformed to the time domain to give

$$h = \bar{U}H\gamma = \bar{U}HUq. \tag{116}$$
It will be seen that $f(\omega)$ is zero-valued when $\omega = 0$ and that $1/f(\omega)$ is unbounded in the neighbourhood of $\omega = 0$. Therefore, a frequency-domain reinflation is available only when there are no nonzero Fourier ordinates in this neighbourhood. That is to say, it can work only in conjunction with highpass or bandpass filtering. However, it is straightforward to construct a lowpass filter that complements the highpass filter. The low-frequency trend component that is complementary to $h$ is

$$x = y - h = y - \bar{U}HUq. \tag{117}$$

Business Cycles and Spurious Cycles

Econometricians continue to debate the question of how macroeconomic data sequences should be decomposed into their constituent components. These components are usually described as the trend, the cyclical component or the business cycle, the seasonal component and the irregular component.

For the original data, the decomposition is usually a multiplicative one and, for the logarithmic data, the corresponding decomposition is an additive one. The filters are usually applied to the logarithmic data, in which case the sum of the estimated components should equal the logarithmic data.

In the case of the Wiener–Kolmogorov filters, and of the frequency-domain filters as well, the filter gain never exceeds unity. Therefore, every lowpass filter $\psi(z)$ is accompanied by a complementary highpass filter $\psi^c(z) = 1 - \psi(z)$. The two sequences resulting from these filters can be recombined to create the data sequence from which they have originated.

Such filters can be applied sequentially to create an additive decomposition of the data. First, the trend is extracted. Then, the cyclical component is extracted from the detrended data. Finally, the residue can be decomposed into the seasonal and the irregular components.

Within this context, the manner in which any component is defined and how it is extracted are liable to affect the definitions of all of the other components.
In particular, variations in the definition of the trend will have substantial effects upon the representation of the business cycle.

It has been the contention of several authors, including Harvey and Jaeger (1993) and Cogley and Nason (1995), that the effect of using the Hodrick–Prescott filter to extract a trend from the data is to create or induce spurious cycles in the complementary component, which includes the cyclical component.

Others have declared that such an outcome is impossible. They point to the fact that, since their gains never exceed unity, the filters cannot introduce anything into the data, nor can they amplify anything that is already present. On this basis, it can be fairly asserted that, at least, the verbs to create and to induce have been misapplied, and that the use of the adjective spurious is doubtful.

Figure 12. The pseudo-spectrum of a random walk, labelled A, together with the squared gain of the highpass Hodrick–Prescott filter with a smoothing parameter of λ = 100, labelled B. The curve labelled C represents the spectrum of the filtered process.

The analyses of Harvey and Jaeger and of Cogley and Nason have both depicted the effects of applying the Hodrick–Prescott filter to a theoretical random walk that is supported on a doubly-infinite set of integers. They show that the spectral density function of the filtered process possesses a peak in the low-frequency region that is based on a broad range of frequencies. This seems to suggest that there is cyclicality in the processed data, whereas the original random walk has no central tendency.

This analysis is illustrated in Figure 12. The curve labelled A is the pseudo-spectrum of a first-order random walk. The curve labelled B is the squared modulus of the frequency response of the highpass, detrending, filter with a smoothing parameter of 100.
The curve labelled C is the spectral density function of a detrended sequence which, in theory, would be derived by applying the filter to the random walk.

The fault of the Hodrick–Prescott filter may be that it allows elements of the data at certain frequencies to be transmitted when, ideally, they should be blocked. However, it seems that an analysis based on a doubly-infinite random walk is of doubtful validity. The effects that are depicted in Figure 12 are due largely to the unbounded nature of the pseudo-spectrum labelled A, and, as we have already declared, there is a zero probability that, at any given time, the value generated by the random walk will fall within a finite distance of the horizontal axis.

Figure 13. The residual sequence obtained by extracting a linear trend from the logarithmic consumption data, together with a low-frequency trajectory that has been obtained via the lowpass Hodrick–Prescott filter.

Figure 14. The quarterly logarithmic consumption data together with a trend interpolated by the lowpass Hodrick–Prescott filter with the smoothing parameter set to λ = 1,600.

Figure 15. The residual sequence obtained by using the Hodrick–Prescott filter to extract the trend, together with a fluctuating component obtained by subjecting the sequence to a lowpass frequency-domain filter with a cut-off point at π/8 radians.

An alternative analysis of the filter can be achieved by examining the effects of its finite-sample version upon a finite and bounded sequence that has been detrended by the interpolation of a linear regression function, according to the ordinary least-squares criterion.
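The shapes of the three curves of Figure 12 can be reproduced numerically. The sketch below rests on assumptions that should be made plain: the gain of the highpass filter is taken from the standard infinite-sample form of the Hodrick–Prescott filter, which is not reproduced in this section; the pseudo-spectrum is scaled for a unit innovation variance without the factor of 1/2π, so the vertical scale need not match that of the figure; and the function names are hypothetical.

```python
import math

def f(omega):
    """Squared gain of the difference operator, 2 - 2 cos(omega)."""
    return 2.0 - 2.0 * math.cos(omega)

def random_walk_pseudo_spectrum(omega):
    """Curve A (assumed scaling): unbounded as omega tends to zero."""
    return 1.0 / f(omega)

def hp_highpass_gain(omega, lam):
    """Gain of the highpass H-P filter in its standard infinite-sample form."""
    g = lam * f(omega) ** 2
    return g / (1.0 + g)

def hp_highpass_squared_gain(omega, lam):
    """Curve B: the squared modulus of the frequency response."""
    return hp_highpass_gain(omega, lam) ** 2

def filtered_spectrum(omega, lam):
    """Curve C: the product of curves A and B."""
    return hp_highpass_squared_gain(omega, lam) * random_walk_pseudo_spectrum(omega)

lam = 100.0
grid = [math.pi * k / 10000 for k in range(1, 10001)]   # excludes omega = 0
peak = max(grid, key=lambda w: filtered_spectrum(w, lam))
print(peak, filtered_spectrum(peak, lam))
```

Under these assumptions, curve C equals λ²f³/(1 + λf²)², and setting its derivative with respect to f to zero gives λf² = 3. The peak is therefore located where 2 − 2cos(ω) = (3/λ)^{1/2}, which the grid search should reproduce. The peak arises because the unbounded curve A is multiplied by a gain that vanishes at the zero frequency.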
If $y$ is the vector of the data and if $P_Q = Q(Q'Q)^{-1}Q'$, where $Q'$ is the second-order difference operator, then the vector of the ordinates of the linear regression is $(I - P_Q)y$, and the detrended vector is the residual vector $e = P_Q y$. The highpass Hodrick–Prescott filter $\Psi_H = Q(\lambda^{-1}I + Q'Q)^{-1}Q'$ will generate the same output from the linearly detrended data as from the original data. Thus, it follows from (94) that $\Psi_H y = \Psi_H e$.

In characterising the effects of the filter, it is reasonable to compare the linearly detrended data $e = P_Q y$ with the output $\Psi_H y$ of the filter. In the case of the logarithmic consumption data, these sequences are represented by the jagged lines that are, respectively, the backdrops to Figures 13 and 15.

Superimposed upon the residual sequence $e = P_Q y$ of Figure 13 is the low-frequency trajectory $(I - \Psi_H)P_Q y = (I - \Psi_H)e$ that has been obtained by subjecting $e$ to the lowpass Hodrick–Prescott filter with a smoothing parameter of 1,600. Figure 14 shows the quarterly logarithmic consumption data together with a trend $x = (I - \Psi_H)y$ interpolated by the lowpass Hodrick–Prescott filter. This trend can be obtained by adding the smooth trajectory $(I - \Psi_H)e$ of Figure 13 to the linear trend $(I - P_Q)y$. That is to say,

$$(I - \Psi_H)y = (I - \Psi_H)\{P_Q y + (I - P_Q)y\} = (I - \Psi_H)e + (I - P_Q)y, \tag{118}$$

which follows since $(I - \Psi_H)(I - P_Q) = (I - P_Q)$. (An implication of this identity is that a linear trend will be preserved by the lowpass H–P filter.)

Superimposed upon the jagged sequence $\Psi_H e$ of Figure 15 is the smoothed sequence $\Psi_\xi^\circ \Psi_H e$, where $\Psi_\xi^\circ$ is the lowpass frequency-domain filter with a cut-off at π/8 radians, which is the value that has been determined from the inspection of the periodogram of Figure 10.
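The two properties of the finite-sample filter that are invoked above can be confirmed with a small computation. The sketch is illustrative only: the function names are hypothetical, the data are synthetic rather than the consumption series, and a dense solver is used where a practical implementation would exploit the band structure of the matrices. The argument inv_lambda stands for the reciprocal of the smoothing parameter, so that a value of zero yields the residuals from the linear regression.

```python
import math

def second_difference(T):
    """Q' of order (T - 2) x T, with the elements 1, -2, 1 in each row."""
    qt = [[0.0] * T for _ in range(T - 2)]
    for i in range(T - 2):
        qt[i][i], qt[i][i + 1], qt[i][i + 2] = 1.0, -2.0, 1.0
    return qt

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= factor * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x

def highpass(y, inv_lambda):
    """Q (inv_lambda I + Q'Q)^(-1) Q'y; with inv_lambda = 0 this is P_Q y."""
    T = len(y)
    qt = second_difference(T)
    g = [sum(qt[i][t] * y[t] for t in range(T)) for i in range(T - 2)]
    a = [[sum(qt[i][t] * qt[j][t] for t in range(T)) + (inv_lambda if i == j else 0.0)
          for j in range(T - 2)] for i in range(T - 2)]
    u = solve(a, g)
    return [sum(qt[i][t] * u[i] for i in range(T - 2)) for t in range(T)]

def lowpass(y, inv_lambda):
    """(I - Psi_H) y, which is the estimated trend."""
    return [yt - ht for yt, ht in zip(y, highpass(y, inv_lambda))]

# Synthetic trended data, used only for illustration.
T = 24
y = [0.02 * t + 0.1 * math.sin(0.3 * t) + 0.03 * math.cos(2.0 * t) for t in range(T)]
inv_lambda = 1.0 / 1600.0

e = highpass(y, 0.0)            # residuals from the linear regression, P_Q y
hy = highpass(y, inv_lambda)    # Psi_H y
he = highpass(e, inv_lambda)    # Psi_H e
line = [1.0 + 0.5 * t for t in range(T)]
trend_of_line = lowpass(line, inv_lambda)
```

The vectors hy and he agree to within rounding error, as (94) implies, and the lowpass filter returns the straight line unchanged, which is the implication of the identity that accompanies (118).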
Now, a comparison can be made of the smooth trajectory $\Psi^\circ_\xi e = \Psi^\circ_\xi P_Q y$ of Figure 11, which has been determined via linear detrending, and which has been regarded as an appropriate representation of the business cycle, with the trajectory $x^\circ = \Psi^\circ_\xi \Psi_H y$ of Figure 15, which has been determined via the Hodrick–Prescott filter. Whereas the same essential fluctuations are present in both trajectories, it is apparent that the more flexible detrending of the Hodrick–Prescott filter has served to reduce and to regularise their amplitudes. Thus, some proportion of the fluctuations, which ought to be present in the trajectory of the business cycle, has been transferred into the trend. Although it cannot be said that the Hodrick–Prescott filter induces spurious fluctuations in the filtered sequence, it is true that it enhances the regularity of some of the fluctuations that are present in the data. However, the same can be said, without exception, of any frequency-selective filter.

Figure 16. The logarithms of annual U.K. real GDP from 1873 to 2001 with an interpolated trend. The trend is estimated via a filter with a variable smoothing parameter.

A prescription for estimating the trend is that it should be maximally stiff, unless it is required to accommodate a structural break. The trend is to be regarded as a benchmark with which to measure the cyclical fluctuations. In times of normal economic activity, a log-linear trend, which represents a trajectory of constant exponential growth, may be appropriate. At other times, the trend should be allowed to adapt to reflect untoward events. A device that achieves this is available in the form of a version of the H–P filter that has a smoothing parameter that is variable over the sample.
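One way in which such a filter might be realised is sketched below. It is an assumption of this sketch, and not a description of any particular program, that the trend $x$ minimises $\sum_t (y_t - x_t)^2 + \sum_t \lambda_t (\nabla^2 x_t)^2$, which gives $x = (I + Q\Lambda Q')^{-1}y$ with $\Lambda = \mathrm{diag}(\lambda_t)$. The function name and the test data are hypothetical.

```python
import numpy as np

def hp_trend_variable(y, lam):
    """Lowpass H-P trend with a smoothing parameter that may vary over the sample.

    lam is a scalar or an array of length T-2; element t weights the second
    difference formed from the points t, t+1 and t+2.
    """
    y = np.asarray(y, dtype=float)
    T = y.size
    Qt = np.zeros((T - 2, T))                 # Q', the second-difference matrix
    for t in range(T - 2):
        Qt[t, t:t + 3] = [1.0, -2.0, 1.0]
    lam = np.broadcast_to(np.asarray(lam, dtype=float), (T - 2,))
    A = np.eye(T) + Qt.T @ (lam[:, None] * Qt)   # I + Q Lambda Q'
    return np.linalg.solve(A, y)

# Hypothetical data: steady growth with a level drop of 0.5 at t = 50.
T = 100
t = np.arange(T)
y = 0.02 * t - 0.5 * (t >= 50)

lam = np.full(T - 2, 1e4)       # stiff trend almost everywhere
lam[45:53] = 1e-6               # nearly free in the locality of the break

trend_flexible = hp_trend_variable(y, lam)   # absorbs the break
trend_stiff = hp_trend_variable(y, 1e4)      # smooths across the break
```

With the smoothing parameter lowered around the break, the trend follows the level shift closely, whereas the trend with a constant parameter rounds it off and leaves a large residual on either side.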
When the trajectory of the trend is required to accommodate a structural break, the smoothing parameter λ can be set to a value close to zero within the appropriate locality. Elsewhere, it can be given a high value to ensure that a stiff curve is created. Such a filter is available in the IDEOLOG computer program, of which the web address will be given at the end of the chapter.

Figure 16 shows an example of the use of this filter. There were brief disruptions to the steady upwards progress of GDP in the U.K. after the two world wars. These breaks have been absorbed into the trend by reducing the value of the smoothing parameter in their localities, which are highlighted in the figure. By contrast, the break that is evident in the data following the year 1929 has not been accommodated in the trend.

Seasonal Adjustment in the Time Domain

The seasonal adjustment of economic data is performed preponderantly by central statistical agencies. The prevalent methods continue to be those that were developed by the U.S. Bureau of the Census and which are encapsulated in the X-11 computer program and its derivatives X-11-ARIMA and X-12. The X-11 program was the culmination of the pioneering work of Julius Shiskin in the 1960s. (See Shiskin et al. 1967.) The X-11 program, which is difficult to describe concisely, depends on the successive application of the time-honoured Henderson moving-average filters that have proved to be very effective in practice but which lack a firm foundation in the modern theory of filtering. An extensive description of the program has been provided by Ladiray and Quenneville (2001).

Recently, some alternative methods of seasonal adjustment have been making headway amongst central statistical agencies. Foremost amongst these is the ARIMA-model-based method of the TRAMO–SEATS package.
Within this program, the TRAMO (Time Series Regression with ARIMA Noise, Missing Observations and Outliers) module estimates a model of the composite process. Thereafter, the estimated parameters are taken to be the true parameters of the process, and they are passed to the SEATS (Signal Extraction in ARIMA Time Series) module, which extracts the components of the data.

The program employs the airline passenger model of Box and Jenkins (1976) as its default model. This is represented by the equation

$$y(z) = \frac{N(z)}{P(z)}\varepsilon(z) = \frac{(1 - \rho z)(1 - \theta z^s)}{(1 - z)(1 - z^s)}\varepsilon(z), \tag{119}$$

where $N(z)$ and $P(z)$ are polynomial operators and $y(z)$ and $\varepsilon(z)$ are, respectively, the $z$-transforms of the output sequence $y(t) = \{y_t;\ t = 0, \pm 1, \pm 2, \ldots\}$ and of the input sequence $\varepsilon(t) = \{\varepsilon_t;\ t = 0, \pm 1, \pm 2, \ldots\}$ of unobservable white-noise disturbances. The integer $s$ stands for the number of periods in the year, which is $s = 4$ for quarterly data and $s = 12$ for monthly data. Without loss of generality as far as the derivation of the filters is concerned, the variance of the input sequence can be set to unity.

Given the identity $1 - z^s = (1 - z)\Sigma(z)$, where $\Sigma(z) = 1 + z + \cdots + z^{s-1}$ is the seasonal summation operator, it follows that

$$P(z) = (1 - z)(1 - z^s) = \nabla^2(z)\Sigma(z), \tag{120}$$

where $\nabla(z) = 1 - z$ is the backward difference operator. The polynomial $\Sigma(z)$ has zeros at the points $\exp\{i(2\pi/s)j\};\ j = 1, 2, \ldots, s - 1$, which are located on the circumference of the unit circle in the complex plane at angles from the horizontal that correspond to the fundamental seasonal frequency $\omega_s = 2\pi/s$ and its harmonics.

The TRAMO–SEATS program effects a decomposition of the data into a seasonal component and a non-seasonal component that are described by statistically independent processes driven by separate white-noise forcing functions. It espouses the principle of canonical decompositions that has been expounded by Hillmer and Tiao (1982).
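The factorisation of (120) and the location of the zeros of $\Sigma(z)$ can be confirmed numerically. The following is a small sketch for the quarterly case $s = 4$; the variable names are illustrative, and polynomial coefficients are stored in ascending powers of $z$.

```python
import numpy as np
from numpy.polynomial import polynomial as npoly

s = 4                                        # quarterly data

one_minus_z = np.array([1.0, -1.0])          # nabla(z) = 1 - z
sigma = np.ones(s)                           # Sigma(z) = 1 + z + ... + z^(s-1)
one_minus_zs = np.zeros(s + 1)               # 1 - z^s
one_minus_zs[0], one_minus_zs[-1] = 1.0, -1.0

# The identity 1 - z^s = (1 - z) Sigma(z).
product = npoly.polymul(one_minus_z, sigma)

# Equation (120): (1 - z)(1 - z^s) = nabla^2(z) Sigma(z).
P = npoly.polymul(one_minus_z, one_minus_zs)
nabla2_sigma = npoly.polymul(npoly.polymul(one_minus_z, one_minus_z), sigma)

# The zeros of Sigma(z) and their angles measured from the horizontal axis.
roots = npoly.polyroots(sigma)
angles = np.sort(np.mod(np.angle(roots), 2 * np.pi))
seasonal_angles = 2 * np.pi * np.arange(1, s) / s
```

For $s = 4$, the zeros lie at the angles π/2, π and 3π/2, which are the seasonal frequency, its harmonic at the Nyquist frequency, and the conjugate of the first.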
The first step in the canonical decomposition entails the following partial-fraction decomposition of the generating function of the autocovariances of $y(t)$:

$$\frac{N(z^{-1})N(z)}{P(z^{-1})P(z)} = \frac{U^*(z^{-1})U^*(z)}{\nabla^2(z^{-1})\nabla^2(z)} + \frac{V^*(z^{-1})V^*(z)}{\Sigma(z^{-1})\Sigma(z)} + \rho\theta. \tag{121}$$

Here, $\rho\theta$ is the quotient of the division of $N(z^{-1})N(z)$ by $P(z^{-1})P(z)$, which must occur before the remainder, which will be a proper fraction, can be decomposed.

In the preliminary decomposition of (121), the first term on the RHS corresponds to the trend component, the second term corresponds to the seasonal component and the third term corresponds to the irregular component. Hillmer and Tiao have provided expressions for the numerators of the RHS, which are somewhat complicated, albeit that the numerators can also be found by numerical means.

When $z = e^{i\omega}$, equation (121) provides the spectral ordinates of the process and of its components at the frequency value of $\omega$. The corresponding spectral density functions are obtained by letting $\omega$ run from 0 to $\pi$. The quotient $\rho\theta$ corresponds to the spectrum of a white-noise process, which is constant over the frequency range.

The principle of canonical decomposition proposes that the estimates of the trend and of the seasonal component should be devoid of any elements of white noise. Therefore, their spectra must be zero-valued at some point in the interval $[0, \pi]$. Let $q_T$ and $q_S$ be the minima of the spectral density functions associated with the trend and the seasonal components respectively. By subtracting these values from their respective components, a revised decomposition is obtained that fulfils the canonical principle. This is

$$\frac{N(z^{-1})N(z)}{P(z^{-1})P(z)} = \frac{U(z^{-1})U(z)}{\nabla^2(z^{-1})\nabla^2(z)} + \frac{V(z^{-1})V(z)}{\Sigma(z^{-1})\Sigma(z)} + q, \tag{122}$$

where $q = \rho\theta + q_T + q_S$.
The Wiener–Kolmogorov principle of signal extraction indicates that the filter that serves to extract the trend from the data sequence $y(t)$ should take the form of

$$\beta_T(z) = \frac{U(z^{-1})U(z)}{\nabla^2(z^{-1})\nabla^2(z)} \times \frac{P(z^{-1})P(z)}{N(z^{-1})N(z)} = \frac{U(z^{-1})U(z)}{N(z^{-1})N(z)} \times \Sigma(z^{-1})\Sigma(z). \tag{123}$$

This is the ratio of the autocovariance generating function of the trend component to that of the process as a whole. This filter nullifies the seasonal component in the process of extracting a trend that is relatively free of high-frequency elements. The nullification of the seasonal component is due to the factor $\Sigma(z)$.

The squared gain of the filter that serves to extract the trend from the quarterly logarithmic consumption data of Figure 9 is shown in Figure 17. This filter is derived from a model of the data based on equation (119), where $s = 4$ and where $\rho = 0.1698$ and $\theta = 0.6248$ are estimated parameters that determine the polynomial $N(z)$. The estimated trend is shown in Figure 18.

Figure 17. The squared gain of the filter for extracting the trend from the logarithmic consumption data.

The filter that serves to extract the seasonal component from the data is constructed on the same principle as the trend extraction filter. It takes the form of

$$\beta_S(z) = \frac{V(z^{-1})V(z)}{N(z^{-1})N(z)} \times \nabla^2(z^{-1})\nabla^2(z). \tag{124}$$

The filter that serves the purposes of seasonal adjustment, and which nullifies the seasonal component without further attenuating the high-frequency elements of the data, is

$$\beta_A(z) = 1 - \beta_S(z). \tag{125}$$

The squared gain of the seasonal adjustment filter that is derived from the model of the logarithmic consumption data is shown in Figure 19 and the seasonal component that is extracted from the data is shown in Figure 20.

Various procedures are available for effecting the canonical decomposition of the data.
The method that is followed by the SEATS program is one that was expounded in a paper of Burman (1980), which depends on a partial-fraction decomposition of the filter itself. The decomposition of the generic filter takes the form of

$$\beta(z) = \frac{C(z)}{N(z)N(z^{-1})} = \frac{D(z)}{N(z)} + \frac{D(z^{-1})}{N(z^{-1})}. \tag{126}$$

Compared with the previous approaches associated with the time-domain filters, this is a matter of implementing the filter via components that are joined in parallel rather than in series. The estimate of the seasonal component obtained by Burman's method is therefore

$$x(z) = f(z) + b(z) = \frac{D(z)}{N(z)}y(z) + \frac{D(z^{-1})}{N(z^{-1})}y(z). \tag{127}$$

Thus, a component $f(t)$ is obtained by running forwards through the data, and a component $b(t)$ is obtained by running backwards through the data.

Figure 18. The logarithmic consumption data overlaid by the estimated trend-cycle component. The plot of the seasonally-adjusted data, which should adhere closely to the trend-cycle trajectory, has been displaced downwards.

Figure 19. The squared gain of the seasonal adjustment filter derived from a model of the logarithmic consumption data.

Figure 20. The component that is removed by the seasonal adjustment filter.

In order to compute either of these components, one needs some initial conditions. Consider the recursion running backwards through the data, which is associated with the equation

$$N(z^{-1})b(z) = D(z^{-1})y(z). \tag{128}$$

This requires some starting values for both $b(t)$ and $y(t)$. The SEATS program obtains these values by stepping outside the sample. The post-sample values of $y(t)$ are generated in the usual way using a recursion based upon the equation of the ARIMA model, which is

$$\psi(L)y(t) = N(L)\varepsilon(t). \tag{129}$$
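The parallel arrangement of (127) can be illustrated by a short sketch. It assumes zero start-up values at both ends of the sample, which is a simplification that stands in for the extra-sample values used by SEATS, and the coefficients of $D(z)$ and $N(z)$ are hypothetical numbers chosen only so that the recursion is stable.

```python
import numpy as np

def forward_pass(d, n, y):
    """Solve N(L) f(t) = D(L) y(t) by recursion, with zero start-up values.

    d and n hold the coefficients of D and N in ascending powers of the lag
    operator. Zero start-up values are an assumption of this sketch.
    """
    f = np.zeros(len(y))
    for t in range(len(y)):
        acc = sum(d[j] * y[t - j] for j in range(len(d)) if t - j >= 0)
        acc -= sum(n[j] * f[t - j] for j in range(1, len(n)) if t - j >= 0)
        f[t] = acc / n[0]
    return f

def parallel_filter(d, n, y):
    """Return x = f + b, where b is formed by running the same recursion backwards."""
    y = np.asarray(y, dtype=float)
    f = forward_pass(d, n, y)
    b = forward_pass(d, n, y[::-1])[::-1]
    return f + b

# Hypothetical coefficients: D(z) = 0.3 + 0.1 z and N(z) = 1 - 0.5 z.
d = [0.3, 0.1]
n = [1.0, -0.5]

# The response to a unit impulse in the middle of the sample.
T, mid = 101, 50
impulse = np.zeros(T)
impulse[mid] = 1.0
x = parallel_filter(d, n, impulse)
```

The impulse response is symmetric about the point of impact, which shows that the sum of the forward and backward passes induces no phase displacement.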
In the recursion based on (129), the requisite post-sample elements of $\varepsilon(t)$ are represented by their zero-valued expectations. The post-sample values of $b(t)$ are calculated by a clever algorithm which was proposed to Burman by Granville Tunnicliffe–Wilson. (Tunnicliffe–Wilson was responsible for writing the programs that accompanied the original edition of the book of Box and Jenkins (1976); and he has played a major role in the development of the computational algorithms of modern time-series analysis.) The Burman–Wilson algorithm is expounded in the appendix to Burman's paper.

To initiate the recursion which generates the sequence $f(t)$, some pre-sample values are found by a method analogous to the one that finds the post-sample values.

Seasonal Adjustment in the Frequency Domain

The TRAMO–SEATS program generates an abundance of diagrams relating to the spectra or pseudo-spectra of the component models and to the frequency responses of the associated filters. These diagrams are amongst the end products of the analysis. However, there is no frequency analysis of the data to guide the specification of the filters. Instead, they are determined by the component models that are derived from the aggregate ARIMA model that describes the data.

In this section, we shall pursue a method of seasonal adjustment that begins by looking at the periodogram of the detrended data. The detrending is by means of a polynomial regression. The residual sequence from the linear detrending of the logarithmic consumption data is shown in Figure 11 and the corresponding periodogram is shown in Figures 10 and 22.

Figure 22 shows that the significant elements of the data fall within three highlighted bands. The first band, which covers the frequency interval [0, π/8], comprises the elements that constitute the low-frequency business cycle that is represented by the heavy line in Figure 11.
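The selection of the elements in the first band can be sketched as follows. The sketch assumes that the input is a residual sequence that has already been detrended, and it uses an artificial sequence composed of sinusoids at Fourier frequencies in place of the consumption data. The function name is hypothetical.

```python
import numpy as np

def lowpass_fourier(e, cutoff=np.pi / 8):
    """Retain the Fourier ordinates of e at frequencies in [0, cutoff]."""
    e = np.asarray(e, dtype=float)
    T = e.size
    coeffs = np.fft.rfft(e)                            # ordinates at 2*pi*j/T
    omega = 2 * np.pi * np.arange(coeffs.size) / T
    coeffs[omega > cutoff + 1e-12] = 0.0               # nullify the stopband
    return np.fft.irfft(coeffs, n=T)

# Artificial residual sequence (an assumption): two low-frequency elements,
# one of which lies exactly at the cut-off, and one element beyond it.
T = 160
t = np.arange(T)
low = 0.03 * np.cos(2 * np.pi * 5 * t / T) + 0.01 * np.sin(2 * np.pi * 10 * t / T)
high = 0.01 * np.cos(2 * np.pi * 30 * t / T)
cycle = lowpass_fourier(low + high)
```

Because the filter operates on the Fourier ordinates directly, the elements within the band are transmitted without alteration and those outside it are removed completely, which no time-domain filter of finite order can achieve.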
When the business cycle is added to the linear trend that is represented in Figure 9, the result is the trend–cycle component that is shown in Figure 21.

The second highlighted band, which covers the interval [π/2 − 4π/T, π/2 + 4π/T], comprises five elements, which include two on either side of the seasonal frequency of π/2. The third band, which covers the interval [π − 6π/T, π], contains the harmonic of the seasonal frequency and three elements at adjacent frequencies. The seasonal component, which is synthesised from the elements in the second and third bands, is represented in Figure 23.

Figure 21. The trend-cycle component derived by adding the interpolated polynomial to the low-frequency components of the residual sequence.

Figure 22. The periodogram of the residual sequence obtained from the linear detrending of the logarithmic consumption data. The shaded bands in the vicinities of π/2 and π contain the elements of the seasonal component.

Figure 23. The seasonal component, synthesised from Fourier ordinates in the vicinities of the seasonal frequency and its harmonic.

In addition to showing the logarithmic data sequence and the interpolated trend–cycle component, Figure 21 also shows a version of the seasonally-adjusted data. This is represented by the line that has been displaced downwards. It has been derived by subtracting the seasonal component from the data.

A comparison of Figures 17–20, which relate to the ARIMA-model-based filters, with Figures 21–23, which relate to the frequency-domain filters, shows that, notwithstanding the marked differences in the alternative methodologies of filtering, the results are virtually indistinguishable.
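The synthesis of the seasonal component from the ordinates in the second and third bands can be sketched in the same manner. The sketch assumes quarterly data, a detrended residual sequence as input and a sample size that is a multiple of four; the function name and the artificial data are hypothetical.

```python
import numpy as np

def seasonal_component(e):
    """Synthesise the seasonal component of quarterly residuals e from the
    Fourier ordinates in [pi/2 - 4pi/T, pi/2 + 4pi/T] and [pi - 6pi/T, pi]."""
    e = np.asarray(e, dtype=float)
    T = e.size
    coeffs = np.fft.rfft(e)
    omega = 2 * np.pi * np.arange(coeffs.size) / T
    step = 2 * np.pi / T                  # spacing of the Fourier frequencies
    tol = 1e-9                            # guards against rounding at the band edges
    band2 = np.abs(omega - np.pi / 2) <= 2 * step + tol
    band3 = omega >= np.pi - 3 * step - tol
    keep = band2 | band3
    return np.fft.irfft(np.where(keep, coeffs, 0.0), n=T), int(keep.sum())

# Artificial residuals (an assumption): a slow cycle plus a seasonal pattern.
T = 160
t = np.arange(T)
slow = 0.03 * np.sin(2 * np.pi * 3 * t / T)
seasonal_true = 0.05 * np.cos(np.pi * t / 2) + 0.02 * np.cos(np.pi * t)
seasonal_est, n_kept = seasonal_component(slow + seasonal_true)

# Seasonally-adjusted residuals, obtained by subtraction.
adjusted = slow + seasonal_true - seasonal_est
```

With T = 160, the second band contains five ordinates and the third contains four, in accordance with the description given above.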
The close agreement of the results is a fortuitous circumstance that is largely attributable to the nature of the data, which is revealed by the periodogram. On the strength of what is revealed by Figure 22, it can be asserted that an ARIMA model misrepresents the data. The components of the detrended data are confined to bands that are separated by wide dead spaces in which there are no elements of any significant amplitudes. In contrast, the data generated by an ARIMA process is bound to extend, without breaks, over the entire frequency interval [0, π], and there will be no dead spaces.

The nature of an ARIMA process is reflected in the gain of the trend-extraction filter of the TRAMO–SEATS program, which is represented by Figure 17. The filter allows the estimated trend to contain elements at all frequencies, albeit that those at the highest frequencies are strongly attenuated. This accords with the model of the trend, which is a random walk.

Disregarding the seasonal component, there are no high-frequency elements in the data, nor any beyond the frequency limit of π/8. Therefore, there is no consequence in allowing such elements to pass through the filter; and its effects are virtually the same as those of the corresponding frequency-domain filter. If there were anything in the data beyond the limit that had not been removed by the seasonal adjustment, then the effect of the filter would be to produce a trend–cycle component with a profile roughened by the inclusion of high-frequency noise. It would resemble a slightly smoother version of the seasonally-adjusted data sequence.

The Programs

The programs that have been described in this chapter are freely available from various sources. The H–P (Leser) filter and the Butterworth filter have been implemented in the program IDEOLOG, as have the frequency-domain filters.
The program is available at the address

http://www.le.ac.uk/users/dsgp1/

The H–P and Butterworth filters are also available in the gretl (Gnu Regression, Econometrics and Time-series Library) program, which can be downloaded from the address

http://gretl.sourceforge.net/

The TRAMO–SEATS program, which implements the ARIMA-model-based filters, is available from the Bank of Spain at the address

http://www.bde.es/webbde/en/secciones/servicio/software/programas.html

The program, which is free-standing, can also be hosted by gretl.

References

Box, G.E.P., and G.M. Jenkins, (1976), Time Series Analysis: Forecasting and Control, Revised Edition, Holden Day, San Francisco.

Burman, J.P., (1980), Seasonal Adjustment by Signal Extraction, Journal of the Royal Statistical Society, Series A, 143, 321–337.

Caporello, G., and A. Maravall, (2004), Program TSW, Revised Reference Manual, Servicio de Estudios, Banco de España.

Cogley, T., and J.M. Nason, (1995), Effects of the Hodrick–Prescott Filter on Trend and Difference Stationary Time Series, Implications for Business Cycle Research, Journal of Economic Dynamics and Control, 19, 253–278.

Gray, R.M., (2002), Toeplitz and Circulant Matrices: A Review, Information Systems Laboratory, Department of Electrical Engineering, Stanford University, California, http://ee.stanford.edu/gray/~toeplitz.pdf.

Hall, H.S., and S.R. Knight, (1899), Higher Algebra, Macmillan and Co., London.

Harvey, A.C., and A. Jaeger, (1993), Detrending, Stylised Facts and the Business Cycle, Journal of Applied Econometrics, 8, 231–247.

Hillmer, S.C., and G.C. Tiao, (1982), An ARIMA-Model-Based Approach to Seasonal Adjustment, Journal of the American Statistical Association, 77, 63–70.

Hodrick, R.J., and E.C. Prescott, (1980), Postwar U.S. Business Cycles: An Empirical Investigation, Working Paper, Carnegie–Mellon University, Pittsburgh, Pennsylvania.

Hodrick, R.J., and E.C. Prescott, (1997), Postwar U.S. Business Cycles: An Empirical Investigation, Journal of Money, Credit and Banking, 29, 1–16.

Jolly, L.B.W., (1961), Summation of Series: Second Revised Edition, Dover Publications, New York.

Jury, E.I., (1964), Theory and Applications of the z-Transform Method, John Wiley and Sons, New York.

Kolmogorov, A.N., (1941), Interpolation and Extrapolation, Bulletin de l’Academie des Sciences de U.S.S.R., Ser. Math., 5, 3–14.

Ladiray, D., and B. Quenneville, (2001), Seasonal Adjustment with the X-11 Method, Springer Lecture Notes in Statistics 158, Springer Verlag, Berlin.

Leser, C.E.V., (1961), A Simple Method of Trend Construction, Journal of the Royal Statistical Society, Series B, 23, 91–107.

Nyquist, H., (1928), Certain Topics in Telegraph Transmission Theory, AIEE Transactions, Series B, 617–644.

Pollock, D.S.G., (1999), A Handbook of Time-Series Analysis, Signal Processing and Dynamics, Academic Press, London.

Pollock, D.S.G., (2000), Trend Estimation and Detrending via Rational Square Wave Filters, Journal of Econometrics, 99, 317–334.

Pollock, D.S.G., (2002), Circulant Matrices and Time-Series Analysis, The International Journal of Mathematical Education in Science and Technology, 33, 213–230.

Pollock, D.S.G., (2009), Realisations of Finite-sample Frequency-selective Filters, Journal of Statistical Planning and Inference, 139, 1541–1558.

Shannon, C.E., (1949a), Communication in the Presence of Noise, Proceedings of the Institute of Radio Engineers, 37, 10–21. Reprinted in 1998, Proceedings of the IEEE, 86, 447–457.

Shannon, C.E., (1949b), (reprinted 1998), The Mathematical Theory of Communication, University of Illinois Press, Urbana, Illinois.

Shiskin, J., A.H. Young, and J.C. Musgrave, (1967), The X-11 Variant of the Census Method II Seasonal Adjustment, Technical Paper No. 15, Bureau of the Census, U.S. Department of Commerce.

Whittaker, E.T., (1923), On a New Method of Graduations, Proceedings of the Edinburgh Mathematical Society, 41, 63–75.

Wiener, N., (1941), Extrapolation, Interpolation and Smoothing of Stationary Time Series. Report on the Services Research Project DIC-6037. Published in book form in 1949 by MIT Technology Press and John Wiley and Sons, New York.

This note was uploaded on 03/02/2012 for the course EC 7087 taught by Professor D.s.g.pollock during the Fall '11 term at Queen Mary, University of London.
