ECE5526_Homework1_Bowden

ECE5526 Speech Recognition
Homework 1
Trevor Bowden
2010-02-05

Problem 1

For Problem 1, the spectrograms of several TIMIT audio files are explored using the supplied specgram_nist MATLAB script. The first audio file examined is SA1.

    specgram_nist(400, 200, 16000, 'C:\...', 'SA1', 'ieee-le')

Figure 1 shows the graphical output from the script.

[Figure 1 panels: normalized magnitude vs. time [s] (top); spectrogram, frequency [Hz] up to 8000 vs. time [s] (middle); word and phone labels (bottom).]
Utterance: "She had your dark suit in greasy wash water all year."
Labels as extracted (phones interleaved with words): h# sh ix she hv ae dcl jh had axr your dcl d aa r kcl k dark s ux q suit en in gcl g r iy s iy greasy w aa sh wash epi w aa dx ax water q aa l all y ih axr year h#

Figure 1: Temporal, Spectrographic and Phonetic Information for SA1 Audio File

The script presents several pieces of information about the TIMIT audio file graphically. The top of the figure shows the magnitude vs. time plot, the middle contains the spectrogram, and the bottom contains the words and phonemes with their temporal boundaries. All three panels are time aligned. The spectrogram has horizontal striations due to the large window size of 400 samples. A larger window provides greater detail in the frequency domain but less detail in the time domain.

The next audio file analyzed is SA2. The script call and the resulting figure are shown below.

    specgram_nist(50, 400, 16000, 'C:\...', 'SA2', 'ieee-le')

[Figure 2 panels: normalized magnitude vs. time [s] (top); spectrogram, frequency [Hz] up to 8000 vs. time [s] (middle); word and phone labels (bottom).]
Utterance: "Don't ask me to carry an oily rag like that."
Labels as extracted (phones interleaved with words): h# d ow n don't ae s kcl ask m iy me tcl t ax-h to kcl k ae r iy carry ix n an oy l ix oily r ae gcl g rag l ay kcl like dh ae q that h#

Figure 2: Temporal, Spectrographic and Phonetic Information for SA2 Audio File

For the SA2 audio file, the window size was reduced by a factor of eight, from 400 to 50 samples.
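The trade-off between the 400-sample and 50-sample windows can be made concrete with a little arithmetic. The following is a minimal Python sketch, not part of the supplied MATLAB script. It assumes that the third argument of the calls above (16000) is the sample rate in Hz and that the DFT length equals the window length, with no zero-padding. The helper name window_resolution is hypothetical.

```python
def window_resolution(window_samples, sample_rate_hz):
    """Return (window duration in ms, DFT bin spacing in Hz).

    Assumes the DFT length equals the window length (no zero-padding).
    """
    duration_ms = 1000.0 * window_samples / sample_rate_hz
    bin_spacing_hz = sample_rate_hz / window_samples
    return duration_ms, bin_spacing_hz


# Window lengths taken from the SA1 and SA2 calls; 16000 Hz is assumed
# to be the sample rate.
for label, n in (("SA1", 400), ("SA2", 50)):
    duration_ms, bin_spacing_hz = window_resolution(n, 16000)
    print(f"{label}: {n} samples -> {duration_ms:.3f} ms window, "
          f"{bin_spacing_hz:.0f} Hz bin spacing")
```

Under these assumptions the 400-sample window spans 25 ms with 40 Hz bin spacing, fine enough to separate the harmonics of a typical voice pitch. The 50-sample window spans 3.125 ms with 320 Hz bin spacing, which is shorter than a typical pitch period, so individual glottal pulses can show up in time instead.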
The striations are now vertical, providing greater detail in the time domain relative to the frequency domain. To perform the analysis with such a small window, the frame rate had to be increased to allow for more temporal detail.

Another audio file examined is SI1067.

    specgram_nist(400, 400, 16000, 'C:\...', 'SI1067', 'ieee-le')

Figure 3 shows the script output.

[Figure 3 panels: normalized magnitude vs. time [s] (top); spectrogram, frequency [Hz] up to 8000 vs. time [s] (middle); word and phone labels (bottom).]
Utterance: "Their gait is impossible to convey in words."
Labels as extracted (phones interleaved with words; truncated in the source): h# dh axr their gcl g ey tcl gait ix z is em pcl p aa s ax-h bcl b el impossible dcl d ax to kcl k en v ey convey ix n in w er dcl d...
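The need for a higher frame rate with the 50-sample window used for SA2 can be illustrated by comparing the frame hop with the window length. This Python sketch rests on an assumption the text does not confirm: that the second argument of specgram_nist is a frame rate in frames per second. The signature of the supplied script is not shown, and the helper name frame_hop is hypothetical.

```python
def frame_hop(window_samples, frames_per_second, sample_rate_hz):
    """Return (hop in samples, overlap in samples) between adjacent frames.

    A negative overlap means samples between frames are never analyzed.
    ASSUMPTION: frames_per_second mirrors the second argument of the
    specgram_nist calls; the real script may define it differently.
    """
    hop = sample_rate_hz / frames_per_second
    overlap = window_samples - hop
    return hop, overlap


# (window, assumed frame rate) pairs: the SA1 call, the small window at
# the SA1 rate, and the SA2 call as actually used.
for window, rate in ((400, 200), (50, 200), (50, 400)):
    hop, overlap = frame_hop(window, rate, 16000)
    print(f"window={window}, rate={rate}: hop={hop:.0f}, overlap={overlap:.0f}")
```

If that reading of the argument is right, a 50-sample window at 200 frames per second would hop 80 samples and leave a 30-sample gap between frames, while 400 frames per second gives a 40-sample hop with a 10-sample overlap.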