Digital Speech Processing: Lecture 13
Linear Predictive Coding (LPC): Introduction
LPC Methods
• LPC methods are the most widely used methods in speech coding, speech synthesis, speech recognition, speaker recognition and verification, and speech storage
– LPC methods provide extremely accurate estimates of speech parameters, and do so extremely efficiently
– basic idea of Linear Prediction: the current speech sample can be closely approximated as a linear combination of past samples, i.e.,
$$s(n) \approx \sum_{k=1}^{p} \alpha_k \, s(n-k)$$
for some value of $p$ and some set of coefficients $\alpha_k$
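As a minimal sketch of the idea above, the following computes one predicted sample as a weighted sum of the $p$ previous samples. The function name `predict_sample`, the toy ramp signal, and the coefficient values are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def predict_sample(s, n, alpha):
    """Predict s[n] as sum_{k=1}^{p} alpha[k-1] * s[n-k] (hypothetical helper)."""
    s = np.asarray(s, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    p = len(alpha)
    if n < p:
        raise ValueError("need at least p past samples")
    past = s[n - p:n][::-1]  # s[n-1], s[n-2], ..., s[n-p]
    return float(np.dot(alpha, past))

# Toy signal (assumption): a linear ramp, which satisfies
# s(n) = 2*s(n-1) - s(n-2) exactly, so p = 2 is enough here.
s = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
alpha = np.array([2.0, -1.0])
pred = predict_sample(s, 4, alpha)  # 2*4 - 1*3
```

Real speech is only approximately predictable, so the prediction would differ from the actual sample by a small error rather than match it exactly.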
• for periodic signals with period $N_p$, it is obvious that
$$s(n) \approx s(n - N_p)$$
but that is not what LP is doing; it is estimating $s(n)$ from the $p$ ($p \ll N_p$) most recent values of $s(n)$ by linearly predicting its value
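The distinction between predicting from one period back and predicting from the last $p$ samples can be illustrated with an idealized signal. The sketch below uses a pure cosine as a stand-in (an assumption; speech is not a pure sinusoid) with an illustrative period of 100 samples. A cosine obeys the identity $s(n) = 2\cos(\omega)\,s(n-1) - s(n-2)$, so a two-tap predictor, with $p = 2 \ll N_p = 100$, matches it to within rounding error.

```python
import numpy as np

N_p = 100                       # period in samples (illustrative value)
omega = 2 * np.pi / N_p
n = np.arange(3 * N_p)
s = np.cos(omega * n)           # idealized periodic signal (assumption)

# Period-based prediction: s(n) ~ s(n - N_p)
err_period = np.max(np.abs(s[N_p:] - s[:-N_p]))

# Two-tap linear prediction from the most recent samples:
# s(n) = 2*cos(omega)*s(n-1) - s(n-2)
alpha = np.array([2 * np.cos(omega), -1.0])
pred = alpha[0] * s[1:-1] + alpha[1] * s[:-2]   # predictions for s[2:]
err_lp = np.max(np.abs(s[2:] - pred))

print(err_period, err_lp)       # both are at rounding-error level
```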
• for LP, the predictor coefficients (the $\alpha_k$'s) are determined (computed) by minimizing the sum of squared differences (over a finite interval) between the actual speech samples and the linearly predicted ones
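One generic way to carry out this minimization is to pose it as an ordinary least-squares problem and hand it to a linear solver. The sketch below does that with `numpy.linalg.lstsq`; the function name `lp_coefficients` and the synthetic test signal are assumptions for illustration, and the choice to sum the error only over samples that have $p$ real predecessors is one possible treatment of the finite interval, not necessarily the one a particular LPC method uses.

```python
import numpy as np

def lp_coefficients(s, p):
    """Least-squares predictor coefficients alpha_1..alpha_p (hypothetical helper).

    Minimizes sum over n = p..N-1 of (s[n] - sum_k alpha_k * s[n-k])**2.
    """
    s = np.asarray(s, dtype=float)
    N = len(s)
    # Column k-1 holds s[n-k] for n = p..N-1
    A = np.column_stack([s[p - k:N - k] for k in range(1, p + 1)])
    b = s[p:]
    alpha, *_ = np.linalg.lstsq(A, b, rcond=None)
    return alpha

# Synthetic check (assumption): build a signal that obeys
# s(n) = 1.3*s(n-1) - 0.6*s(n-2) after an initial impulse.
true_alpha = np.array([1.3, -0.6])
s = np.zeros(50)
s[0] = 1.0
s[1] = true_alpha[0] * s[0]
for n in range(2, len(s)):
    s[n] = true_alpha[0] * s[n - 1] + true_alpha[1] * s[n - 2]

alpha = lp_coefficients(s, p=2)
```

Because the synthetic signal follows the recursion exactly for $n \ge 2$, the recovered coefficients should agree with `true_alpha` up to numerical precision; on real speech the minimum error would be nonzero.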
• LP is based on speech production and synthesis models
– speech can be modeled as the output of a linear, time-varying system, excited by either quasi-periodic pulses or noise
– assume that the model parameters remain constant over the speech analysis interval
• LP provides a robust, reliable and accurate method for estimating the parameters of the linear system (the combined vocal tract, glottal pulse, and radiation characteristic for voiced speech)
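The assumption that the model parameters stay constant over the analysis interval is what allows the signal to be cut into short frames and analyzed one frame at a time. A minimal framing sketch follows; the helper name `frame_signal`, the 8 kHz sampling rate, the 20 ms frame length, and the 10 ms hop are all illustrative assumptions rather than values given in the lecture.

```python
import numpy as np

def frame_signal(s, frame_len, hop):
    """Split s into overlapping frames of frame_len samples (hypothetical helper)."""
    s = np.asarray(s, dtype=float)
    if len(s) < frame_len:
        return np.empty((0, frame_len))
    starts = range(0, len(s) - frame_len + 1, hop)
    return np.stack([s[i:i + frame_len] for i in starts])

fs = 8000                      # sampling rate in Hz (assumed)
frame_len = fs * 20 // 1000    # 20 ms frame -> 160 samples (assumed)
hop = fs * 10 // 1000          # 10 ms hop   -> 80 samples (assumed)

s = np.arange(800, dtype=float)   # placeholder signal, 0.1 s long
frames = frame_signal(s, frame_len, hop)
print(frames.shape)               # one row per analysis frame
```

Each row could then be passed to a coefficient estimator, giving one set of model parameters per frame.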
• LP methods have been used in control and information theory, where they are called methods of system estimation and system identification
– used extensively in speech under a group of names, including
1. covariance method
2. autocorrelation method
3. lattice method
4. inverse filter formulation
5. spectral estimation formulation
6. maximum likelihood method
7. inner product method
Basic Principles of LP
$$H(z) = \frac{S(z)}{G\,U(z)} = \frac{1}{1 - \sum_{k=1}^{p} a_k z^{-k}}$$
$$s(n) = \sum_{k=1}^{p} a_k \, s(n-k) + G\,u(n)$$
• the time-varying digital filter represents the effects of the glottal pulse shape, the vocal tract impulse response (IR), and radiation at the lips
• the system is excited by an impulse train for voiced speech, or a random noise sequence for unvoiced speech
• this ‘all-pole’ model is a natural representation for non-nasal voiced speech, but it also works reasonably well for nasals and unvoiced sounds
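The difference equation $s(n) = \sum_{k=1}^{p} a_k\, s(n-k) + G\,u(n)$ can be run directly as a synthesis recursion. The sketch below drives the same all-pole filter with an impulse train (the voiced case) and with random noise (the unvoiced case). The function name `synthesize`, the two coefficient values, the gains, and the 50-sample pitch period are arbitrary illustrative assumptions, and the coefficients are held fixed here even though the model in the lecture is time-varying.

```python
import numpy as np

def synthesize(a, G, u):
    """Run s(n) = sum_k a[k-1]*s(n-k) + G*u(n) with zero initial state (hypothetical helper)."""
    p = len(a)
    s = np.zeros(len(u))
    for n in range(len(u)):
        acc = G * u[n]
        for k in range(1, p + 1):
            if n - k >= 0:
                acc += a[k - 1] * s[n - k]
        s[n] = acc
    return s

a = np.array([1.3, -0.6])          # stable 2-pole example (assumed values)

# Voiced excitation: impulse train with a 50-sample period (assumed)
u_voiced = np.zeros(200)
u_voiced[::50] = 1.0
s_voiced = synthesize(a, 1.0, u_voiced)

# Unvoiced excitation: white noise with a fixed seed for repeatability
rng = np.random.default_rng(0)
u_unvoiced = rng.standard_normal(200)
s_unvoiced = synthesize(a, 0.1, u_unvoiced)
```

The explicit loop mirrors the equation term by term; a library filter routine would be the more usual choice for long signals.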