10pattern-handout


Massachusetts Institute of Technology
Department of Electrical Engineering & Computer Science
6.345/HST.728 Automatic Speech Recognition
Spring, 2010
3/16/10 Lecture Handouts

• Unsupervised Pattern Discovery in Speech
• Reading:
  – Park et al., “Unsupervised Pattern Discovery in Speech,” IEEE Trans. ASLP, 2008.

Unsupervised Pattern Detection in Speech
• Find recurring acoustic patterns in untranscribed speech and use them to discover words, phrases and topics
• Unlike many supervised approaches to speech processing
  – How much can we learn from speech directly?
  – How much can we learn from speech alone?

Contrast to ASR
• Traditional paradigm of speech recognition straightforward
  – Top down, highly supervised
• Alternative is using pattern discovery to learn from speech
  – Bottom up, unsupervised
[Figure labels: Speech Recognizer: “look at the cat”; Pattern Discovery: “cat”, “dog”, “flux capacitor”]

Two Sources of Inspiration
• Language acquisition
  – 8 month old infants exposed to stream of CV syllables (Saffran et al.)
  – Stream composed of nonsense words (e.g. pabiku maliki wabufa ...)
  – After only 2 minutes of exposure, infants can distinguish words from non-words (pabiku vs. ku mali or ki wabu)
• Computational genomics
  – Sort junk DNA from coding DNA
  – Genes not known ahead of time (Forced unsupervision!)
  – Compare genomes of multiple species to discover genes

Related Work
• Unsupervised language processing explored by many
  – Most work uses text or speech recognition output
[Figure axes: Input (Words, Chars, Phone(me)s, Audio); Target Data (Text, Speech); Task (Grammar Induction, Word Acquisition)]

Overview
1. Acoustic Matching
2. Acoustic Clustering
3. Cluster Identification
4. Acoustic Segmentation

Speech Data
• Experiments not cognitively oriented (no child-directed speech)
• Use a corpus of audio lectures, recorded at MIT
  – Consists of ~500 hours of speech
  – Includes academic lectures, seminars, forums
  – Wide range of subjects, many different speakers
• Most lectures have:
  – Consistent acoustic environment
  – A single main speaker (some exceptions)

Lectures
• Walter Lewin: Physics
• Thomas Friedman: The World is Flat
• Gilbert Strang: Linear Algebra
• Jim Glass: ASR: Clustering & VQ
• Victor Zue: ASR: Speech Production
• T.J. Hazen: ASR: Speaker Adaptation

Acoustic Matching
...
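The language-acquisition result under "Two Sources of Inspiration" is often explained by syllable-to-syllable transitional probabilities: transitions inside a word are far more predictable than transitions across a word boundary. The handout itself does not state this mechanism, so the following is only a minimal sketch of that common explanation, not the method of the lecture or of Park et al. The three words come from the slide's example; the syllable split, the random word order (immediate repeats allowed), the stream length, and the function names `make_stream` and `transitional_probs` are all assumptions made for illustration.

```python
from collections import Counter
import random

# Nonsense words from the slide, split into CV syllables (the split is an assumption).
WORDS = ["pa bi ku", "ma li ki", "wa bu fa"]


def make_stream(n_words, seed=0):
    """Concatenate randomly chosen words into one unsegmented syllable stream."""
    rng = random.Random(seed)
    stream = []
    for _ in range(n_words):
        stream.extend(rng.choice(WORDS).split())
    return stream


def transitional_probs(stream):
    """Estimate P(next syllable | current syllable) from bigram counts."""
    pair_counts = Counter(zip(stream, stream[1:]))
    first_counts = Counter(stream[:-1])
    return {(a, b): c / first_counts[a] for (a, b), c in pair_counts.items()}


stream = make_stream(300)
tp = transitional_probs(stream)

# Within-word transition: "pa" is always followed by "bi".
within = tp[("pa", "bi")]
# Across-boundary transition: "ku" can be followed by the first syllable of any word.
across = tp.get(("ku", "ma"), 0.0)

print(f"P(bi | pa) = {within:.2f}")
print(f"P(ma | ku) = {across:.2f}")
```

In this toy stream the within-word probability is exactly 1, while the across-boundary probability depends on the random word order and should sit near one third. A learner that places boundaries at low-probability transitions would therefore recover pabiku as a unit and reject ku mali.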

This note was uploaded on 05/08/2010 for the course CS 6.345 taught by Professor Glass during the Spring '10 term at MIT.
