Ch3-Speech_Signal_Representations

Speech Recognition: Speech Signal Representations

2/13/12 Veton Këpuska

Speech Signal Representations
• Fourier Analysis
    - Discrete-time Fourier transform
    - Short-time Fourier transform
    - Discrete Fourier transform
• Cepstral Analysis
    - The complex cepstrum and the cepstrum
    - Computational considerations
    - Cepstral analysis of speech
    - Applications to speech recognition
    - Mel-Frequency cepstral representation
• Performance Comparison of Various Representations

Block Diagram of Speech Recognition Processing
[Figure: block diagram. Labeled blocks: Speaker's Mind, Speech Producer, Speaker, Acoustic Channel, Acoustic Processor, Speech Recognizer, Decoding Search, Acoustic Model & Lexicon with P(O|W), Language Model with P(W). Labeled signals: W, Speech, O, Ŵ.]
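The labels P(O|W) and P(W) in the diagram correspond to the usual decoding rule, Ŵ = argmax over W of P(O|W) P(W). The sketch below illustrates that rule on a toy example. The function name `decode`, the two candidate word sequences, and all probability values are made up for illustration and do not come from the slides; a real recognizer searches a far larger hypothesis space.

```python
import math

def decode(candidates, acoustic_logprob, language_logprob):
    """Return the candidate W that maximizes log P(O|W) + log P(W)."""
    return max(candidates, key=lambda w: acoustic_logprob[w] + language_logprob[w])

# Hypothetical scores for a single utterance O (illustrative numbers only).
acoustic_logprob = {
    "recognize speech": math.log(0.020),    # log P(O|W) from the acoustic model
    "wreck a nice beach": math.log(0.025),
}
language_logprob = {
    "recognize speech": math.log(0.0010),   # log P(W) from the language model
    "wreck a nice beach": math.log(0.0001),
}

best = decode(list(acoustic_logprob), acoustic_logprob, language_logprob)
print(best)
```

Working in log probabilities turns the product into a sum and avoids numerical underflow when many small probabilities are combined.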

Decoding and Search
• The figure on the next slide gives further detail: it shows the components of an HMM speech recognizer as it processes a single utterance.
• The figure shows the recognition process in three stages. In the feature extraction (or signal processing) stage, the acoustic waveform is divided into frames (usually of 10, 15, or 20 milliseconds), which are transformed into spectral features.
• Each time window is thus represented by a vector of around 39 features representing this spectral information as well as information about energy and spectral change.
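The framing step can be sketched as follows. This is a minimal illustration, assuming a 16 kHz sampling rate and non-overlapping frames; neither assumption comes from the slides, and the helper name `frame_signal` is hypothetical. Practical front ends often use overlapping windows, which this sketch does not show.

```python
def frame_signal(samples, sample_rate, frame_ms=10.0):
    """Split samples into consecutive, non-overlapping frames of frame_ms milliseconds.

    A trailing partial frame is dropped.
    """
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]

# One second of placeholder audio (silence) at an assumed 16 kHz sampling rate.
sample_rate = 16000
samples = [0.0] * sample_rate
frames = frame_signal(samples, sample_rate, frame_ms=10.0)
print(len(frames), len(frames[0]))
```

At the assumed rate, a 10 ms frame holds 160 samples, so one second of audio yields 100 frames, each of which would then be converted to a feature vector.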

Schematic Simplified Architecture of Speech Recognizer

Decoding and Search
• In the acoustic modeling (or phone recognition) stage, we compute the likelihood of the observed spectral feature vectors given linguistic units (words, phones, subparts of phones).
    - For example, we use Gaussian Mixture Model (GMM) classifiers to compute, for each HMM state q corresponding to a phone or subphone, the likelihood p(o|q) of a given feature vector o given that state.
    - A (simplified) way of thinking of the output of this stage is as a sequence of probability vectors, one per time frame, each containing the likelihoods that each phone or subphone unit generated the acoustic feature vector observed at that time.
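As a rough illustration of computing p(o|q) with a GMM, the sketch below evaluates the log-likelihood of one feature vector under a mixture of diagonal-covariance Gaussians for each of two states. The diagonal-covariance choice, the state names `q_a` and `q_b`, the 2-dimensional features, and every parameter value are assumptions made for the example, not details from the slides.

```python
import math

def log_gaussian_diag(o, mean, var):
    """Log density of a Gaussian with diagonal covariance, evaluated at vector o."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(o, mean, var)
    )

def gmm_log_likelihood(o, weights, means, variances):
    """log p(o|q) for one state q modeled as a mixture of diagonal Gaussians."""
    terms = [
        math.log(w) + log_gaussian_diag(o, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)  # log-sum-exp keeps the sum numerically stable
    return top + math.log(sum(math.exp(t - top) for t in terms))

# Hypothetical two-component mixtures for two states: (weights, means, variances).
states = {
    "q_a": ([0.6, 0.4], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [0.5, 0.5]]),
    "q_b": ([0.5, 0.5], [[4.0, 4.0], [5.0, 3.0]], [[1.0, 1.0], [1.0, 1.0]]),
}
o = [0.2, -0.1]  # one (toy) feature vector for a single frame
scores = {q: gmm_log_likelihood(o, *params) for q, params in states.items()}
print(max(scores, key=scores.get))
```

Repeating this for every frame gives one vector of per-state scores per time step, which matches the "sequence of probability vectors" view described in the slide.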

Feature Extraction: Mel-Frequency Cepstral Coefficient (MFCC) Vectors

MFCC Feature Extraction
[Figure: MFCC extraction pipeline. Blocks in order: Pre-emphasis → Windowing → FFT → Mel filter-bank → log → IFFT (12 MFCC coefficients), together with an Energy block and a Deltas block.]

Output features:
• MFCC: 12 coefficients
• Energy: 1 coefficient
• ∆ MFCC: 12 coefficients
• ∆ Energy: 1 coefficient
• ∆∆ MFCC: 12 coefficients
• ∆∆ Energy: 1 coefficient

• Extracting a sequence of 39-dimensional MFCC feature vectors from a quantized digitized waveform
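The count of 39 follows from (12 MFCC + 1 energy) × 3, where the factor of 3 covers the static, ∆, and ∆∆ features. The sketch below shows one way to assemble such a vector. It assumes a simple central-difference delta with repeated edge frames, which is only one of several delta definitions in use and is not necessarily the one the slides intend. The function names and the synthetic input values are hypothetical.

```python
def deltas(frames):
    """Central-difference deltas, d[t] = (x[t+1] - x[t-1]) / 2, repeating the edge frames."""
    n = len(frames)
    out = []
    for t in range(n):
        prev = frames[max(t - 1, 0)]
        nxt = frames[min(t + 1, n - 1)]
        out.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return out

def stack_features(static):
    """Append delta and delta-delta features to each static (MFCC + energy) frame."""
    d1 = deltas(static)
    d2 = deltas(d1)
    return [s + a + b for s, a, b in zip(static, d1, d2)]

# Placeholder static features: 5 frames of 12 MFCCs + 1 energy value (synthetic ramp).
static = [[float(t + k) for k in range(13)] for t in range(5)]
features = stack_features(static)
print(len(features), len(features[0]))
```

Each output frame holds 13 static values followed by 13 ∆ values and 13 ∆∆ values, giving the 39 dimensions listed above.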

Feature Extraction: MFCC Vectors