Linear Predictive Pitch Synchronization
Voiced speech detection, analysis and synthesis
Jim Bryan
Florida Institute of Technology
ECE5525 Final Project
December 7, 2010

ABSTRACT

Pitch synchronous windowing is a critical part of many speech processing algorithms. Homomorphic filtering, for example, is based on the principle that the pitch frequency may be “liftered” from the vocal tract response via simple subtraction. For this to work, the window must be 2 to 3 pitch periods long. Linear prediction based signal reconstruction is simplified if there is an integer number of pitch pulses per window. This study investigates the fundamental relationship between the window type and length versus the requirement to use overlap and add methods. The study shows that the Hamming window has superior properties to the Hann and Bartlett windows, while performing well for overlap and add applications. A wide search linear prediction based approach is taken to estimate the pitch period for voiced sounds. For this analysis, the voiced speech is assumed to be completely contained within the window and to contain minimal non-voiced speech or noise. An analysis and synthesis example is presented using pitch synchronous windowing.

Keywords: Pitch synchronous, Linear Prediction, Glottal pulse, inverse filtering, speech synthesis

INTRODUCTION

Windowing of speech signals is often performed prior to analysis. In FFT based processing, the window type and length determine the frequency bin resolution as well as the side lobe “leakage”. Not windowing at all is equivalent to applying a rectangular window, which provides only about 13 dB of side-lobe suppression. In this case the frequency bins “talk” to each other, as each bin contains information from its neighbors. Choosing the proper window type and length can reduce this “spectral leakage”, but only at the expense of wider main lobe frequency bins.
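The trade-off between side-lobe suppression and window type can be checked numerically. The following is a minimal sketch, not the author's code: it assumes symmetric window definitions, an arbitrary window length of 51 samples, and a hypothetical helper named peak_sidelobe_db that evaluates the window's transform on a dense frequency grid and reports the highest side lobe relative to the main-lobe peak.

```python
import cmath
import math

# Symmetric window definitions (an assumption; the paper does not give its formulas).
def rectangular(n_len):
    return [1.0] * n_len

def hann(n_len):
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (n_len - 1)) for n in range(n_len)]

def hamming(n_len):
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_len - 1)) for n in range(n_len)]

def bartlett(n_len):
    return [1.0 - abs(2.0 * n / (n_len - 1) - 1.0) for n in range(n_len)]

def peak_sidelobe_db(window, grid=2048):
    """Highest side lobe in dB relative to the main-lobe peak (hypothetical helper)."""
    dc_gain = sum(window)  # transform magnitude at zero frequency
    mags = []
    for k in range(grid + 1):
        omega = math.pi * k / grid  # frequencies from 0 to pi rad/sample
        acc = sum(w * cmath.exp(-1j * omega * n) for n, w in enumerate(window))
        mags.append(abs(acc) / dc_gain)
    # Walk down the main lobe until the magnitude starts rising: the first null.
    k = 0
    while k < grid and mags[k + 1] <= mags[k]:
        k += 1
    # Everything from the first null onward is side-lobe territory.
    return 20.0 * math.log10(max(mags[k:]))

WINDOW_LENGTH = 51  # illustrative choice, not taken from the paper
levels = {
    fn.__name__: peak_sidelobe_db(fn(WINDOW_LENGTH))
    for fn in (rectangular, bartlett, hann, hamming)
}
for name, level in levels.items():
    print(f"{name:12s} peak side lobe: {level:6.1f} dB")
```

Under these assumptions the rectangular window should come out near the 13 dB figure quoted above, with the Bartlett, Hann and Hamming windows progressively lower. The wider main lobes that pay for this suppression are not measured by the sketch.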
In homomorphic filtering, if the frequency response of the window is too wide, it will include the fundamental pitch period frequency for voiced sounds. This makes it impossible to deconvolve the vocal tract response from the pitch period frequency. In glottal pulse deconvolution, the pitch period must be known so that the covariance method may be applied during the glottal closed portion of the speech. One factor used in window selection is the side lobe attenuation level versus main lobe width. A second consideration is the requirement to use the overlap and add method. Together, these requirements will result in the selection of the Hamming window. Once the window type has been selected, it only remains to determine the optimum window length....
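One common way to test whether a window suits the overlap and add method is the constant overlap-add condition: shifted copies of the window must sum to a constant. The sketch below illustrates that check and may differ from the criterion the study itself uses. It assumes periodic window definitions, a 64-sample window and a hop of half the window length; all function names are hypothetical.

```python
import math

# Periodic window forms (an assumption chosen so that 50% overlap sums cleanly).
def periodic_hann(n_len):
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_len) for n in range(n_len)]

def periodic_hamming(n_len):
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / n_len) for n in range(n_len)]

def periodic_bartlett(n_len):
    return [1.0 - abs(2.0 * n / n_len - 1.0) for n in range(n_len)]

def overlap_sum(window, hop, frames=8):
    """Sum `frames` copies of the window spaced `hop` samples apart.

    Only the central region, where the overlap is complete, is returned.
    """
    n_len = len(window)
    total = [0.0] * (hop * (frames - 1) + n_len)
    for frame in range(frames):
        start = frame * hop
        for n, w in enumerate(window):
            total[start + n] += w
    # Drop one window length at each end, where fewer frames overlap.
    return total[n_len:len(total) - n_len]

WINDOW_LENGTH = 64        # illustrative choice, not taken from the paper
HOP = WINDOW_LENGTH // 2  # 50% overlap, also an assumption

results = {}
for fn in (periodic_bartlett, periodic_hann, periodic_hamming):
    steady = overlap_sum(fn(WINDOW_LENGTH), HOP)
    results[fn.__name__] = (min(steady), max(steady))
    low, high = results[fn.__name__]
    print(f"{fn.__name__:18s} overlap sum ranges from {low:.6f} to {high:.6f}")
```

In this sketch a window passes when the minimum and maximum of the overlap sum agree to within rounding error. The constant itself need not be 1; it is simply a gain that can be divided out after reconstruction.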
This note was uploaded on 02/11/2012 for the course ECE 5525 taught by Professor Staff during the Fall '10 term at FIT.