10search-handout

Massachusetts Institute of Technology
Department of Electrical Engineering & Computer Science
6.345/HST.728 Automatic Speech Recognition
Spring, 2010

2/25/10 Lecture Handouts

Lecture Slides: Dynamic Time Warping and Search
Reading:
  DTW: Rabiner et al., Fundamentals of ASR, Chp 4.7
  Search: Huang et al., Spoken Language Processing, Chp 12
Homework: Assignment 2 - Speech Recognition using DTW

Dynamic Time Warping & Search

  Dynamic time warping
  Search
  Graph search algorithms
  Dynamic programming algorithms

6.345/HST.728 Automatic Speech Recognition (2010) Search 1

Word-Based Template Matching

  [Block diagram: Spoken Word -> Feature Measurement -> Pattern Similarity -> Decision Rule -> Output Word; Word Reference Templates feed the Pattern Similarity stage]

  Whole word representation:
    No explicit concept of sub-word units (e.g., phones)
    No across-word sharing
  Used for both isolated- and connected-word recognition
  Popular in late 1970s to mid 1980s; recent renewed interest

Template Matching Mechanism

  Test pattern, $T$, and reference patterns, $\{R_1, \ldots, R_V\}$, are represented by sequences of feature measurements
  Pattern similarity is determined by aligning test pattern, $T$, with reference pattern, $R_v$, with distortion $D(T, R_v)$
  Decision rule chooses reference pattern, $R^*$, with smallest alignment distortion $D(T, R^*)$

    $R^* = \arg\min_v D(T, R_v)$

  Dynamic time warping (DTW) is used to compute the best possible alignment warp, $\phi_v$, between $T$ and $R_v$, and the associated distortion $D(T, R_v)$

Alignment Example

  [Figure: alignment example showing a warp path between the Test and Reference patterns; axis labels m, n, with endpoints 1, M, N]

Digit Alignment Examples

  [Figure: digit alignment examples; panels: Match, Mismatch]

Dynamic Time Warping (DTW)

  Objective: an optimal alignment between variable length sequences $T = \{t_1, \ldots, t_N\}$ and $R = \{r_1, \ldots, r_M\}$
  The overall distortion $D(T, R)$ is based on a sum of local distances between elements $d(t_i, r_j)$
  A particular alignment warp, $\phi$, aligns $T$ and $R$ via a point-to-point mapping, $\phi = (\phi_t, \phi_r)$, of length $K$, pairing $t_{\phi_t(k)}$ with $r_{\phi_r(k)}$ for $1 \le k \le K$
  The optimal alignment minimizes overall distortion

    $D(T, R) = \min_{\phi} D_{\phi}(T, R)$

    $D_{\phi}(T, R) = \frac{1}{M_{\phi}} \sum_{k=1}^{K} d(t_{\phi_t(k)}, r_{\phi_r(k)}) \, m_k$

DTW Issues

  Endpoint constraints: $\phi_t(1) = \phi_r(1) = 1$, $\phi_t(K) = N$, $\phi_r(K) = M$
  Monotonicity: $\phi_t(k+1) \ge \phi_t(k)$, $\phi_r(k+1) \ge \phi_r(k)$
  Path weights, $m_k$, can influence shape of optimal path
  Path normalization factor, $M_{\phi}$, allows comparison between different warps (e.g., with different lengths)
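
The DTW distortion and the template-matching decision rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference implementation. It assumes choices the slides leave open: a Euclidean local distance d, local steps of (1,0), (0,1) and (1,1), symmetric path weights (1 for a horizontal or vertical step, 2 for a diagonal step and for the starting point), and the normalization factor N + M, which for these weights is the same for every warp. The names `local_distance`, `dtw_distortion` and `recognize` are hypothetical.

```python
import math


def local_distance(a, b):
    """Local distance d(t_i, r_j): Euclidean distance (an assumed choice)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distortion(test, ref):
    """Normalized DTW distortion D(T, R) between two feature-vector sequences.

    Assumptions (not specified in the slides): steps (1,0), (0,1), (1,1);
    symmetric weights m_k (1, 1, 2); normalization factor N + M.
    """
    n, m = len(test), len(ref)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # acc[i][j] = minimum weighted cost of any monotonic path that starts at
    # the first frames (endpoint constraint) and ends at frame pair (i, j).
    acc = [[math.inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = local_distance(test[i], ref[j])
            if i == 0 and j == 0:
                acc[i][j] = 2 * d  # starting point carries weight 2
                continue
            best = math.inf
            if i > 0:
                best = min(best, acc[i - 1][j] + d)          # advance test only
            if j > 0:
                best = min(best, acc[i][j - 1] + d)          # advance reference only
            if i > 0 and j > 0:
                best = min(best, acc[i - 1][j - 1] + 2 * d)  # advance both
            acc[i][j] = best

    # The path must end at the last frame of both sequences; with symmetric
    # weights the total weight of every path is N + M.
    return acc[n - 1][m - 1] / (n + m)


def recognize(test, templates):
    """Decision rule: return the label whose template has the smallest distortion."""
    return min(templates, key=lambda label: dtw_distortion(test, templates[label]))
```

For example, `recognize(features, {"one": ref_one, "two": ref_two})` would return the label of the closest template, where each pattern is a list of equal-dimension feature vectors. Using symmetric weights keeps the normalization independent of the path, so the dynamic-programming minimum is also the minimum of the normalized distortion; other weightings would need a different normalization.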