lsa352.lec1

LSA 352 Summer 2007
LSA 352: Speech Recognition and Synthesis
Dan Jurafsky

Lecture 1:
1) Overview of Course
2) Refresher: Intro to Probability
3) Language Modeling

IP notice: some slides for today from: Josh Goodman, Dan Klein, Bonnie Dorr, Julia Hirs Sandiway Fong

Outline
  Overview of Course
  Probability
  Language Modeling
    Language Modeling means “probabilistic grammar”

Definitions
  Speech Recognition
    Speech-to-Text
    – Input: a wavefile
    – Output: a string of words
  Speech Synthesis
    Text-to-Speech
    – Input: a string of words
    – Output: a wavefile

Automatic Speech Recognition (ASR)
Automatic Speech Understanding (ASU)
  Applications
    Dictation
    Telephone-based information (directions, air travel, banking, etc.)
    Hands-free (in car)
    Second language ('L2') (accent reduction)
    Audio archive searching
    Linguistic research
    – Automatically computing word durations, etc.

Applications of Speech Synthesis/Text-to-Speech (TTS)
  Games
  Telephone-based information (directions, air travel, banking, etc.)
  Eyes-free (in car)
  Reading/speaking for disabled
  Education: Reading tutors
  Education: L2 learning

Applications of Speaker/Language Recognition
  Language recognition for call routing
  Speaker recognition:
    Speaker verification (binary decision)
    – Voice password, telephone assistant
    Speaker identification (one of N)
    – Criminal investigation

History: foundational insights 1900s-1950s
  Automaton:
    Markov 1911
    Turing 1936
    McCulloch-Pitts neuron (1943)
    – http://marr.bsee.swin.edu.au/~dtl/het704/lecture10/ann/node1.html
    – http://diwww.epfl.ch/mantra/tutorial/english/mcpits/html/
    Shannon (1948): link between automata and Markov models
  Human speech processing
    Fletcher at Bell Labs (1920s)
  Probabilistic/Information-theoretic models
    Shannon (1948)

Synthesis precursors
  Von Kempelen mechanical (bellows, reeds) speech production simulacrum
  1929 Channel vocoder (Dudley)

History: Early Recognition
  1920s: Radio Rex
    Celluloid dog with iron base held within house by electromagnet against force of spring
    Current to magnet flowed through bridge which was sensitive to energy at 500 Hz
    500 Hz energy caused bridge to vibrate, interrupting current, making dog spring forward
    The sound “e” (ARPAbet [eh]) in Rex has a 500 Hz component

History: early ASR systems
  1950s: Early speech recognizers
    1952: Bell Labs single-speaker digit recognizer
    – Measured energy from two bands (formants)
    – Built with analog electrical components
    – 2% error rate for single speaker, isolated digits...