224s.09.lec3

224s.09.lec3 - CS 224S / LINGUIST 281 Speech Recognition,...

CS 224S / LINGUIST 281
Speech Recognition, Synthesis, and Dialogue
Dan Jurafsky

Lecture 2: TTS: Brief History, Text Normalization and Part-of-Speech Tagging

IP Notice: lots of info, text, and diagrams on these slides comes (thanks!) from Alan Black's excellent lecture notes and from Richard Sproat's slides.

Outline

I. History of Speech Synthesis
II. State of the Art Demos
III. Brief Architectural Overview
IV. Text Processing
   1) Text Normalization
      Tokenization
      End of sentence detection
      Methodology: decision trees
   2) Homograph disambiguation
   3) Part-of-speech tagging
      Methodology: Hidden Markov Models

Dave Barry on TTS

"And computers are getting smarter all the time; scientists tell us that soon they will be able to talk with us. (By "they", I mean computers; I doubt scientists will ever be able to talk to us.)"

History of TTS

Pictures and some text from Hartmut Traunmüller's web site: http://www.ling.su.se/staff/hartmut/kemplne.htm

Von Kempelen 1780
b. Bratislava 1734, d. Vienna 1804

Leather resonator manipulated by the operator to try to copy the vocal tract configuration during sonorants (vowels, glides, nasals)
Bellows provided the air stream; a counterweight provided inhalation
Vibrating reed produced a periodic pressure wave

Von Kempelen:

Small whistles controlled consonants
Rubber mouth and nose; the nose had to be covered with two fingers for non-nasals
Unvoiced sounds: mouth covered, auxiliary bellows driven by a string provides a puff of air
From Traunmüller's web site

Closer to a natural vocal tract: Riesz 1937

Homer Dudley 1939 VODER

Synthesizing speech by electrical means
1939 World's Fair

Homer Dudley's VODER

Manually controlled through a complex keyboard
Operator training was a problem

An aside on demos

That last slide exhibited Rule 1 of playing a speech synthesis demo:
Always have a human say what the words are right before you have the system say them

The 1936 UK Speaking Clock

From http://web.ukonline.co.uk/freshwater/clocks/spkgclock.htm

The UK Speaking Clock

July 24, 1936
Photographic storage on 4 glass disks
2 disks for minutes, 1 for hours, 1 for seconds
Other words in the sentence distributed across the 4 disks, so all 4 were used at once
Voice of "Miss J. Cain"

A technician adjusts the amplifiers of the first speaking clock

From http://web.ukonline.co.uk/freshwater/clocks/spkgclock.htm

Gunnar Fant's OVE synthesizer

Of the Royal Institute of Technology, Stockholm
Formant synthesizer for vowels
F1 and F2 could be controlled
From Traunmüller's web site

Cooper's Pattern Playback

Haskins Labs for investigating speech...