AUTOREGRESSIVE MODELING OF THE EEG
An Introduction to Discrete Time Frequency Analysis with the EEG as an Example

The EEG signal is known to contain components with different frequency content. Perhaps the most widely known example is that successive stages of sleep can be characterized by changes in frequency content, including alpha waves (8 to 12 Hz), delta waves (2 to 6 Hz), and others. The intent of this section is to review in cursory fashion some of the basics of sampled data systems and the application of the discrete Fourier transform (DFT), better known through its most common implementation, the FFT (Fast Fourier Transform). Below are the continuous time definitions of the Fourier and Laplace transforms and of the
inverse Fourier transform, which should be compared to the discrete time and frequency versions below. Fourier Transform F0 0)) = f e’lwt f(t) dt
Laplace Transform F(s) = e‘St f(t) dt
inverse Fourier Transform f(t) = % f e30“ F0 (0) duo In the discrete time / sampled data version, similar formulae are developed. Let the following
parameters be defined to help describe the transformations. N = # of samples tS = sample interval
n indexes time samples t = n  tS rn indexes frequency samples (i)S = m ° Aws
r,=1/t, Aws=ft(2“/N) Note that the smallest frequency Aws (or, in Hz, Af, ) gets smaller the larger the number N of samples. Also, note that the frequencies may be described in radians per second, radians per sample, cycles per
second (Hz) or cycles per sample.

(a) At any frequency ω we can compute a transform from the discrete time samples. Note that the frequency can be any frequency, not just integer submultiples of the sampling frequency.

    F(e^{jω}) = Σ_{n=0}^{N-1} e^{-jωn} f(n)

(b) The DFT is computed only at selected frequencies, which are integer submultiples of the sampling frequency, ω_m = 2πm/N:

    F(m) = F(e^{j2πm/N}) = Σ_{n=0}^{N-1} e^{-j2πmn/N} f(n)

(c) Likewise the inverse DFT uses only those selected frequency components to reassemble the sampled data signal in the time domain:

    f(n) = (1/N) Σ_{m=0}^{N-1} e^{+j2πmn/N} F(m)
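The forward and inverse sums (b) and (c) can be checked numerically. The sketch below (NumPy; an illustration added for this writeup, not part of the original notes) implements both sums directly and compares them against a library FFT:

```python
import numpy as np

def dft(f):
    """Direct DFT sum: F(m) = sum_n exp(-j 2 pi m n / N) f(n)."""
    N = len(f)
    n = np.arange(N)
    m = n.reshape(-1, 1)
    return np.exp(-2j * np.pi * m * n / N) @ f

def idft(F):
    """Inverse DFT sum: f(n) = (1/N) sum_m exp(+j 2 pi m n / N) F(m)."""
    N = len(F)
    m = np.arange(N)
    n = m.reshape(-1, 1)
    return (np.exp(2j * np.pi * n * m / N) @ F) / N

f = np.cos(2 * np.pi * 3 * np.arange(16) / 16)  # 3 cycles in 16 samples
F = dft(f)
assert np.allclose(F, np.fft.fft(f))            # matches the library FFT
assert np.allclose(idft(F), f)                  # inverse recovers the samples
```

For a real cosine at an exact bin frequency, the energy lands entirely in bins m = 3 and m = N - 3, each with value N/2, which is the "sight reading" discussed next.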
III.2 Interpretation of Discrete Time Transforms — Examples
It is very useful to be able to "sight read" a Fourier Transform. On this page are three basic transforms: the simple exponential, which is the building block of the mathematics, and the building blocks of real signals, the cosine and the sine functions, which, when combined, give magnitude and phase information.

f(n) = e^{jω_s n} (discrete time) or f(t) = e^{jω_s t} (continuous time), a complex function:

    F(e^{jω}) = 1 for ω = ω_s
                0 otherwise

[Figure: a single spectral line at ω_s.]

f(n) = cos(ω_s n) = (e^{jω_s n} + e^{-jω_s n}) / 2

[Figure: lines of height 1/2 at +ω_s and -ω_s; even function, conjugate symmetric; phase = 0° at +ω_s.]

f(n) = sin(ω_s n) = (e^{jω_s n} - e^{-jω_s n}) / 2j

[Figure: purely imaginary lines, +j/2 at -ω_s and -j/2 at +ω_s; odd function, conjugate symmetric; phase = -90° at +ω_s.]

III.3 General Observation

For real functions, the Fourier Transform is conjugate symmetric. Thus, you don't need to look
at negative frequencies and most often one plots only the real and imaginary parts for positive
frequencies. If one plots magnitude and phase, then the plots are identical to Bode plots in form.

f(n) = cos(ω_s n + φ):  real part = (cos φ)/2,  imaginary part = ±j (sin φ)/2  (both could be inverted in sign)

[Figure: magnitude 1/2 at ±ω_s; phase = +φ at +ω_s and -φ at -ω_s.]

FFT

The Fast Fourier Transform (FFT) is an efficient method for computing the DFT. Its most
popular implementation requires that N be a power of two, such as 32, 64, 128, and so on. The efficiency is
due to the symmetry of the angles involved in the computation  if you look at the formulae below in
detail you will see that the same cosines and sines of the same sets of angles are used over and over
again. For instance, if N = 8, then one needs the cosines and sines of 0, 45, 90, 135, 180, 225, 270, and
315 degrees. You‘ll quickly recognize that the cosines of 45 and 315 are the same, while the sines are
the negatives of each other. This symmetry can be taken advantage of, so that while the number of computations in the straightforward DFT is proportional to N², for the FFT it is N log₂ N. It turns out that there are efficiencies to be gained for other lengths as well, as long as the length is not a prime number.

FFT formulae:

    X(m) = Σ_{n=0}^{N-1} W_N^{mn} x(n)        x(n) = (1/N) Σ_{m=0}^{N-1} W_N^{-mn} X(m)        W_N = exp(-j 2π/N)

A quirk of the algorithm is that it implicitly assumes that the sample sequence repeats, as is illustrated
below. This can cause artifacts, in that the FFT algorithm acts to fit the sequence exactly with a series of
sinusoids; it must use high frequency sinusoids to match rapidly changing parts of the signal.

[Figure: the sample sequence f(n), n = 0, 1, 2, 3, 4, ..., repeated end to end. Note the sharp transition as the ends are stitched together; this causes artifacts as the FFT implicitly "fits" the discontinuity.]

This problem is attacked in several ways:
(1) Padding with zeroes, as shown below.
(2) Starting with long sequences, so the errors are small.
(3) Using a window function: imagine multiplying the original sequence by a trapezoid which
equals one in the middle (i.e. it uses the original values) and which tapers to zero near the ends
(it smoothes the "ragged" ends of the sequence down to zero). The window functions used have various names (trapezoid, Blackman, Hamming, Hanning) and are studied in courses like ECE 310 and 451.

[Figure: the original sequence f(n) with zeroes added after it.]

One result of zero padding is that the Fourier transform is calculated at twice as many frequencies; the
new ones are essentially values interpolated between the original frequencies.

III.5 Aliasing

One effect of the phenomenon known as aliasing is that the "negative" frequencies can be plotted as if they corresponded to indices between N/2 and N, and not those between -N/2 and 0. This is done for convenience, because the numerical algorithms in most computer languages are easier if the indexing is done from 0 to N-1, or, as in Matlab, from 1 to N.

One example is taken from wagon wheels in western movies. If a spoke of the wheel moves
more than one half of the way to the next position in each frame of the movie, you will perceive the
wheel to be moving backwards, not forwards. The maximum positive rate the wheel can be perceived as
moving is 1/2 position per picture frame —— the Nyquist rate. Thus, aliasing is where one frequency
"masquerades" as another, because the two are indistinguishable if all you have are the sample values. Note the following aliasing effects:
* if the spoke moves less than 1/2 a position per frame, it is seen as having an ordinary, positive rotational frequency.
* if the spoke moves more than 1/2 but less than one position per frame, it is seen as moving backwards — a "negative" rotational frequency.
* if the spoke moves more than 1 but less than 1.5 positions per frame, it is seen as moving
forwards at a slower rate. I.e. if it moved 1.2 positions per frame, it is perceived as moving
0.2 positions per frame.
* if the spoke moves 2.2 positions per frame, it is still perceived as moving 0.2 positions per
frame.
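The wagon wheel effect can be reproduced numerically. In this sketch (NumPy; an illustration added for this writeup, with made-up frequencies), a 0.2 cycle/sample cosine and a 1.2 cycle/sample cosine take identical sample values, so their transforms are identical too:

```python
import numpy as np

n = np.arange(64)
slow = np.cos(2 * np.pi * 0.2 * n)   # 0.2 cycles per sample
fast = np.cos(2 * np.pi * 1.2 * n)   # 1.2 cycles per sample: aliases to 0.2

# At integer sample times the two signals are indistinguishable...
assert np.allclose(slow, fast)
# ...so their transforms are identical as well.
assert np.allclose(np.fft.fft(slow), np.fft.fft(fast))
```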
The special case which affects programming and plotting of the FFT results is the case where negative
frequencies are routinely plotted with positive indices between N/2 and N. The following algebra attempts to supplement the wagon wheel explanation, by showing that a complex exponential at frequency index m is indistinguishable from one at index N - m. The frequency N/2 cycles/sample is known as the "folding" or "mirror" frequency.

    exp[j (2π/N)(N - m) n] = exp[j 2πn] · exp[-j (2π/N) m n] = exp[-j (2π/N) m n]

so the component at index N - m is the complex conjugate of the component at index m: m and N - m are mirror images, and for a real signal the transform is conjugate symmetric about N/2.

[Figure: spectra plotted from -N/2 to N/2 and from 0 to N/2, showing the mirror symmetry.]
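The mirror relation can be checked directly: for a real signal the DFT coefficient at index N - m is the complex conjugate of the coefficient at index m. (NumPy sketch added for illustration, not part of the original notes.)

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.standard_normal(64)          # any real signal
Y = np.fft.fft(y)

# Conjugate symmetry: Y[N - m] = conj(Y[m]) for m = 1 .. N-1
N = len(y)
m = np.arange(1, N)
assert np.allclose(Y[N - m], np.conj(Y[m]))
```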
III.6 Below are some transform pairs for practice.

[Figures III.6-III.7: practice transform pairs, frequency domain (real and imaginary parts, magnitude, phase) paired with time domain plots, for a constant, exponentials of the form exp(±2πn/N), the cosine cos(2πn/N), cos(3ω₀n + 45°), and the composition of a sine wave from its complex exponential parts.]

III.8
[Figure: cos(2πn(3.5/64)) (3.5 cycles/64 samples); time domain samples with the real and imaginary parts, magnitude, and phase of its transform. Same information, different style plots: time data, log/log magnitude (in dB vs. log10 frequency), magnitude, and phase.]
Zero padding causes interpolation of the power spectrum and a lot of spurious estimates at frequencies where the power is quite low. The original signal was exactly 3.5 cycles per 64 samples, which fell in between two of the data points at which the FFT is computed. It is now quite well represented at 7 cycles per 128 samples.

[Figure: the zero-padded sequence; time data, log/log magnitude, magnitude, and phase.]
These two functions plot identically:
    cos(2πt(31/64)), a frequency just below the Nyquist rate;
    cos(2πt(33/64)), a frequency just above the Nyquist rate.

[Figure: time data, log/log magnitude, magnitude, and phase for the two signals.]

III.10 SPECTRAL ANALYSIS

"FFT" Approach to Spectral Analysis. Signal processors commonly compute the Power Spectrum,
which is closely related to the FFT. Here are some descriptions of related functions which we can carry out with Matlab.

let: y(n), n = 1, ..., N (Matlab indexes from 1, not from 0)

Y = fft(y)

Y is an N element complex vector. We may wish to compute

Ymag = abs(Y); Yph = angle(Y); Yreal = real(Y); Yimag = imag(Y);

The Power Spectrum is defined as the Fourier Transform of the Autocorrelation Function
given a time series y(k) or a continuous function y(t):

    R_yy(m) = (1/M) Σ_{k=0}^{M-1} y(k) y(k + m)

or

    R_yy(τ) = (1/T) ∫ y(t) y(t + τ) dt

Note that the value at a lag of zero is special, because it equals the variance of the process:

    R_yy(0) = σ²

The Autocorrelation Function emphasizes the periodicity of a signal:

    R_yy(τ) is large for sinusoids (or any repetitive waveform) of period τ

The Power Spectrum is the transform of the autocorrelation function:

    S_yy = F(R_yy)

III.11 Matlab Example

Generating a noisy sinusoid for test, which has 201 elements in it:

t = 0:200; % 201 elements
y = sin(2*pi*25*t/200) + 2*sin(2*pi*60*t/200);
yn = y + 0.5*randn(size(t));

[Figure: the noisy sinusoid yn plotted over samples 0 to 200.]

Direct autocorrelation: Ryy = xcorr(yn). The result is a 401 element vector, which has two problems which make the interpretation hard:

[Figure: Ryy = xcorr(yn) (biased).]
(a) the way Matlab does indexing makes the zero lag element show up at index 201, half way
through the plot. This is easy to correct for plotting.

(b) the "ends" of the function don't get counted as much. Whereas there are 201 pairs of values {x(t), x(t)}, and 200 pairs of values {x(t), x(t+1)}, there is only one pair of values {x(t), x(t+200)}. This means the value of Ryy(200) is going to be very small. The way to correct this problem is to eliminate the "bias", by dividing by the number of pairs of values used to compute the autocorrelation, i.e. by 201 for Ryy(0), by 200 for Ryy(1), and by 1 for Ryy(200).

Ryy = xcorr(yn, 'unbiased')
[Figure: the unbiased autocorrelation estimate of yn over lags 0 to 400.]
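The unbiased estimate that xcorr(yn, 'unbiased') produces can be sketched in NumPy as follows (variable names mirror the Matlab example; this is an added illustration, not part of the original notes):

```python
import numpy as np

def unbiased_autocorr(y):
    """Autocorrelation with each lag divided by its number of pairs."""
    N = len(y)
    r = np.correlate(y, y, mode="full")         # 2N-1 lags, -(N-1) .. N-1
    pairs = N - np.abs(np.arange(-(N - 1), N))  # N pairs at lag 0, 1 pair at lag N-1
    return r / pairs

t = np.arange(201)
y = np.sin(2 * np.pi * 25 * t / 200) + 2 * np.sin(2 * np.pi * 60 * t / 200)
Ryy = unbiased_autocorr(y)

assert len(Ryy) == 401                          # 2*201 - 1 lags
# zero lag sits in the middle (index 200, 0-based) and equals mean(y^2)
assert abs(Ryy[200] - np.mean(y ** 2)) < 1e-9
```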
The Power Spectrum Syy is readily computed from the unbiased estimate of the autocorrelation function Ryy. It shows off nicely the two components of the original signal: one at 50 cycles per 401 samples, or 25 cycles per 201 samples; and another, twice as big in amplitude but four times as large in power, at 60 cycles per 201 samples. This is an artificial example, but it is exactly what one hopes for: a calculation unambiguously showing what are the principal sinusoids in a set of data.

Syy = fft(Ryy) (unbiased)

[Figure: the power spectrum Syy, showing two sharp peaks.]
Some of the more commonplace techniques for computing the power spectrum follow from the Convolution Theorem:

    in Laplace:  y(t) = f(t) * g(t)  <=>  Y(s) = F(s) G(s)
    in Fourier:  y(t) = f(t) * g(t)  <=>  Y(jω) = F(jω) G(jω)

These suggest the following procedure for convolving two time domain functions, either discrete time or continuous time:
1) transform f(t), g(t)
2) multiply F(jω), G(jω)
3) inverse transform the product: y(t) = F⁻¹(Y(jω))

Recall the relationship between convolution and correlation:

    f * g = ∫ f(t) g(τ - t) dt    convolution
    f ⊗ g = ∫ f(t) g(τ + t) dt    correlation

The transforms of a function and its time reversal, g(t) and g(-t), are related as complex conjugates of each other (for real g):

    F{g(-t)} = G*(jω)

Thus the Transform of the Correlation is closely related to the Transform of the Convolution:

    f * g            <=>  F(s) G(s)
    f ⊗ g = f * g(-t) <=>  F(s) G*(s)

Application: Power Spectrum Computation

The goal is to compute:

    S_yy = F(R_yy) = F{ y ⊗ y } = F(y) F*(y)

Which may be done by:
Yn = fft(y)
Syy = Yn .* conj(Yn) / N

Note that the power spectrum elements are proportional to the squares of the transform magnitudes (aa* = |a|² for any complex number a):
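The relation S_yy = F(y) F*(y), and the shortcut of recovering the autocorrelation by inverse transforming it, can be verified numerically. Note that what the inverse FFT of the power spectrum returns is the circular (periodic) autocorrelation. (NumPy sketch added for illustration, not part of the original notes.)

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.standard_normal(256)

Y = np.fft.fft(y)
Syy = Y * np.conj(Y) / len(y)        # power spectrum, |Y|^2 / N

# ifft of the power spectrum gives the circular autocorrelation
Ryy = np.fft.ifft(Syy).real

# lag 0 equals the mean square value: R(0) = (1/N) * sum(y^2)
assert np.allclose(Ryy[0], np.mean(y ** 2))

# an arbitrary lag matches the direct circular sum
lag = 5
direct = np.mean(y * np.roll(y, -lag))
assert np.allclose(Ryy[lag], direct)
```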
Syy = abs(Yn).^2 / N

You can also calculate Ryy from Syy; in fact it may be faster than calculating it directly:

Ryy = ifft(Syy);

III.15 Better Estimation of Power Spectrum
There has been much research on this topic; it is appropriate for grad signal processing courses.

General Idea: Consider 1024 samples of EEG data, taken at a 100 Hz sample rate, for 10.24 seconds. Assume the original signal has a bandwidth of approximately 20 Hz, with some energy down in the 0.5 Hz
range (2 second period) which we wish to understand.

Method I: Take FFT of 1024 point sequence.
Result: OK, but not great. Errors result from correlations at long lag times, and thus the low frequencies. There are only
124 examples of lags of 900 sample times whereas there are 1023 examples of lags of 1 sample.
As noted above, you can get rid of the bias at long lag times, but you cannot get rid of the fact
that you have only a few samples and your estimates could be wrong. This kind of error affects all frequencies of the power spectrum.

Method II: Same as Method I, except we sample for a longer period of time, 102.4 seconds. Now we
have a 10,240 point FFT, and have increased the processing time substantially. However,
assuming the brain is still acting the same, we'll get better estimates of the correlation function at long lags, and hence a better estimate of the power spectrum at low frequencies.
However, we note that we‘re now computing the correlation function at lags of up to 100 seconds, corresponding to frequencies of 0.01 Hz, which are not of any interest. Method III: Subdivide samples into groups of 1024; compute DFT for each group; average the DFTs.
Result: Better accuracy. We do lose the ability to distinguish frequencies which are either
very low (e.g. 0.01 Hz) or which are separated from one another by this amount. Hopefully, this does not matter. Also, we hope that the brain has not changed its activity in the last 100 seconds.

Method IV: Same as Method III, except other signal processing techniques are added.
Result: better yet.

As you might imagine, Matlab will do all these computations for you, using a function spa which is in the Robust Control Toolbox, and available to you if you use the Engineering College Workstations, but not on the PCs and Macs. spa uses a group size NG of 30 ≤ NG ≤ (total number of samples / 10), unless you specify otherwise (see help spa).
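Method III, and roughly what spa does internally, can be sketched in NumPy: split the record into groups, compute the squared-magnitude DFT of each, and average. The group size and the 10 Hz test signal below are made up for illustration; this is not part of the original notes.

```python
import numpy as np

def averaged_periodogram(y, group):
    """Average |FFT|^2 / group over non-overlapping groups (Method III)."""
    ngroups = len(y) // group
    segs = y[: ngroups * group].reshape(ngroups, group)
    return np.mean(np.abs(np.fft.fft(segs, axis=1)) ** 2, axis=0) / group

# 10240 samples at 100 Hz with a 10 Hz sinusoid buried in noise
fs, N = 100.0, 10240
t = np.arange(N) / fs
rng = np.random.default_rng(2)
y = np.sin(2 * np.pi * 10.0 * t) + rng.standard_normal(N)

S = averaged_periodogram(y, 1024)
freqs = np.fft.fftfreq(1024, 1 / fs)

# the largest peak (ignoring the mirror half) sits near 10 Hz
peak = np.argmax(S[:512])
assert abs(freqs[peak] - 10.0) < 0.1
```

Averaging the ten group spectra trades frequency resolution (fs/1024 ≈ 0.1 Hz instead of fs/10240) for a much smoother, lower-variance estimate, which is exactly the trade-off Method III describes.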
Let y = column vector of samples.
G = spa(y);
bode(G); % plots the data

Other Notes

Parseval's theorem (conservation of energy):

    Σ_n |y(n)|² = (1/N) Σ_m |Y(m)|²

("energies in the time domain" = "energies in the frequency domain")

III.16 Simple Introduction to Discrete Time Systems Modeling
(1)    y(n) = a0 x(n) + a1 x(n-1) + a2 x(n-2)

z-transform:

    Y(z) = a0 X(z) + a1 z⁻¹ X(z) + a2 z⁻² X(z)

Compare to Laplace:

    y(t) = a0 x(t) + a1 dx/dt + a2 d²x/dt²

Laplace Transform:

    Y(s) = a0 X(s) + a1 s X(s) + a2 s² X(s)

Each is a simple rule for relating y(t) to the input x(t). What happens for sinusoidal inputs?

Real Case: x(n) = cos(ωn)
Complex Case: x(n) = exp(jωn)   (ω in radians/sample)

    x(n)   = cos(ωn)            exp(jωn)
    x(n-1) = cos(ω(n-1))        exp(jω(n-1))
           = cos(ωn - ω)        = exp(jωn) exp(-jω)
    x(n-2) = cos(ωn - 2ω)       = exp(jωn) exp(-j2ω)

Back to equation (1) for x(n) = exp(jωn):

    y(n) = a0 exp(jωn) + a1 exp(jωn) exp(-jω) + a2 exp(jωn) exp(-j2ω)
    y(n) = exp(jωn) [a0 + a1 exp(-jω) + a2 exp(-j2ω)]
The output is sinusoidal; for a given ω, the bracketed term is a constant: a magnitude plus a phase. This is the same as the z-transform of (1),

    Y(z) = X(z) [a0 + a1 z⁻¹ + a2 z⁻²]

if we substitute z = exp(jω).

Summary:
(1) Write the discrete time I/O equation.
(2) Use the z-transform rule.
(3) Simplify the expression.
(4) Substitute z = exp(jω).

For (1), Y(z)/X(z) = a0 + a1 z⁻¹ + a2 z⁻², so

    Y(e^{jω}) / X(e^{jω}) = a0 + a1 e^{-jω} + a2 e^{-j2ω}

This transfer function is almost the same as Y(s)/X(s), or Y(jω)/X(jω), for continuous functions. There are many variations on this theme:

    y(n) + b1 y(n-1) + ... = x(n) + a1 x(n-1) + ...

    Y(z)   1 + a1 z⁻¹ + a2 z⁻² + ...
    ---- = -------------------------
    X(z)   1 + b1 z⁻¹ + b2 z⁻² + ...
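The substitution z = exp(jω) can be checked numerically: running a complex exponential through the difference equation scales it by the constant H(e^{jω}). (NumPy sketch with made-up coefficients, added for illustration.)

```python
import numpy as np

a = [1.0, 0.5, 0.25]                      # a0, a1, a2 (made-up values)
w = 0.3                                   # radians per sample

# H(e^jw) = a0 + a1 e^{-jw} + a2 e^{-2jw}
H = a[0] + a[1] * np.exp(-1j * w) + a[2] * np.exp(-2j * w)

# run x(n) = e^{jwn} through y(n) = a0 x(n) + a1 x(n-1) + a2 x(n-2)
n = np.arange(2, 100)
x = np.exp(1j * w * np.arange(100))
y = a[0] * x[n] + a[1] * x[n - 1] + a[2] * x[n - 2]

# the output is the input scaled by the constant H (a magnitude plus a phase)
assert np.allclose(y, H * x[n])
```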
continuous time counterparts. Matlab works with them in an analogous fashion. Related Approaches There is a discrete time State Space Approach which can be interconverted to transfer functions. [H llH lH x(t) = Ax(t—l) + B u(t—l) Modeling/Filtering with Discrete Time Systems. In a DSP Course: (1) Develop ztransform more fully. (2) Stability criteria. (3) Develop different classes of filters. (4) Examine frequency response. (5) Continuous ¢:> discrete filter design conversion. (6) Nontime domain, e.g., images. Other Applications
EEG Model, Speech Production.

III.18 EEG Model
[Diagram: white noise e(n) (uncorrelated) enters a filter, whose output is the EEG signal y(n) (correlated; "nonwhite" or "colored").]

Correlated means the present sample value can be predicted from old sample values:

(2)    y(n) = -a1 y(n-1) - a2 y(n-2) - ... - am y(n-m) + e(n)
The first terms are the predicted value of y(n); e(n) is noise which is uncorrelated with the old values.

Rewrite (2) and take its z-transform:

    y(n) + a1 y(n-1) + ... + am y(n-m) = e(n)

    Y(z) [1 + a1 z⁻¹ + ... + am z⁻ᵐ] = E(z)

or

    Y(z) A(z⁻¹) = E(z),  where A(z⁻¹) is a polynomial in z⁻¹

White Noise: Autocorrelation/Spectrum

R_ee(n) = autocorrelation of the noise. R_ee(0) = σ² is the power of the noise signal; elsewhere R_ee(n) = 0. Then S_ee(ω) = F(R_ee(n)), the power spectrum of the noise, is a constant at all frequencies.

III.19 Autocorrelation function of EEG signal

Compute the Power Spectrum of the Signal plus Noise, S_sn = F{R(y(n))}. Use this to fit the
coefficients of the autoregressive function so that the power spectrum is matched as well as possible. This means that the best values of the ai minimize the difference between S_sn(e^{jω}) and Y(e^{jω}) Y(e^{jω})*.

(1) Data are taken.
(2) Choose the ai's to fit

    H(e^{jω}) = Y(e^{jω}) / E(e^{jω}) = 1 / A(e^{-jω}),  A(e^{-jω}) = a0 + a1 e^{-jω} + a2 e^{-j2ω} + a3 e^{-j3ω} + ...

(3) But, by assumption, E(e^{jω}) = a constant for all frequencies.
(4) Thus we have to choose the ai's which fit Y(e^{jω})* Y(e^{jω}) to the power spectrum S_sn(e^{jω}) of the data.

Alternate Interpretation
    y(n) = -a1 y(n-1) - a2 y(n-2) - ... + e(n),  given y(n-1), y(n-2), ...

Choose the ai's so as to predict y(n).
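This prediction view can be sketched directly: pick the ai's that minimize the squared prediction error by least squares. (NumPy illustration playing the role of Matlab's ar; the model order and test signal are made up and not from the original notes.)

```python
import numpy as np

def ar_fit(y, na):
    """Least-squares fit of y(n) = -a1 y(n-1) - ... - a_na y(n-na) + e(n)."""
    rows = [y[na - k: len(y) - k] for k in range(1, na + 1)]
    X = np.column_stack(rows)              # past samples y(n-1) ... y(n-na)
    target = y[na:]                        # present sample y(n)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return -coef                           # return a1 ... a_na

# synthetic AR(2) process: y(n) = 1.5 y(n-1) - 0.9 y(n-2) + e(n)
rng = np.random.default_rng(3)
y = np.zeros(5000)
e = rng.standard_normal(5000)
for n in range(2, 5000):
    y[n] = 1.5 * y[n - 1] - 0.9 * y[n - 2] + e[n]

a = ar_fit(y, 2)                           # should recover a1 = -1.5, a2 = 0.9
assert abs(a[0] + 1.5) < 0.05 and abs(a[1] - 0.9) < 0.05
```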
Best prediction means:
    min E{ e²(n) }  ≡  "auto regression"

This is a standard and very effective model for EEG analysis.

Matlab has a function in the robust control toolbox on the College Workstations, which automatically
computes the coefficients:

th = ar(y, na)   % y = data samples, na = # of coefficients
                 % the result th is in a Matlab standard format
present(th)      % types an explanation

III.20 Two Ways to Think About This Model
[Diagrams: (a) white noise → filter → correlated signal; (b) the last M samples → predictor → predicted next sample.]
samples _> a me)“
sample The numerical algorithms are the same for estimating the coefficients, no matter which way you choose
to think about the problem. (a) (b) This model has been widely used in other areas of research. Consider, for instance, its use in modeling
of speech. This block diagram is taken from Schafer's text on Digital Signal Processing. Pitch
Period l, Coefficients I I I I (vocal tract parameters) Impulse Train (change ~ 10 ms)
Generator Voiced Sounds
(vowels)
L Time Varyino
a? ) Digital Filtero spew] Amplitude Fricative
Sounds (5) Plosives (P) Random Number
Generator IllZl EEG State Identiﬁcation How to Distinguish State 1 from State 2?
(a) “FFT”
(b) good power spectral estimates
(c) parameters ai from AR model EEG Segmentation
Identifying Changes in Brain State

[Figure: an EEG trace divided into segments 1, 2, 3, ...; intervals are labeled A a, B b, C c, D d.]
Procedure:
(A) Acquire samples; compute AR model.
(a) Predict new samples until error is not tolerable (1).
(B) Acquire new samples; compute new AR model.
(b) Predict new samples until error is not tolerable (2).

EEG State Analysis — Current Approach
(1) Use an autoregressive (AR) predictor to segment the training data with this result: [Figure: clinical state vs. time, divided into many segments.]
(2), (3) Map each segment into a feature space, where the segments form clusters (e.g. drowsy, epileptic).
(4) Develop an algorithm for classification: a pattern recognizer that assigns a segment's features to a cluster.
Pattern Recognizers
(a) nearest cluster
    — classical
    — effective
(b) neural network
    — nonlinear
    — not smart, but easy to program

[Figure: clinical state shown on a 2-D display of 2 of the AR parameters.]
Spring '08, PRINCIPE, Signal Processing