
Speech Interface

Speech Overview
- Voice User Interface
- How does it work?
- Synthesis (TTS)
- Recognition (SR)

Speech Synthesis
- Text to Speech
- Dynamic
- Prompt database

How Speech Synthesis Works
1. Text parsing: sentences, numbers, symbols, pauses
2. Natural language processing: part of speech, tense
3. Phonemes are looked up or sounded out
4. Diphones are appended together
5. The audio is post-processed to add emphasis
6. The speech audio is played
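
The front-end steps above (text parsing, phoneme lookup, diphone construction) can be sketched roughly as follows. This is a toy illustration, not a real synthesizer: the lexicon, the spoken forms, and all function names are hypothetical, the letter-by-letter fallback is a crude stand-in for real letter-to-sound rules, and the natural language processing and audio post-processing steps are left out.

```python
import re

# Hypothetical toy lexicon (word -> phonemes); the symbols are ARPAbet-like
# and chosen only for illustration.
LEXICON = {
    "call": ["K", "AO", "L"],
    "two": ["T", "UW"],
    "and": ["AE", "N", "D"],
}

# Hypothetical spoken forms for a few digits and symbols.
SPOKEN_FORMS = {"1": "one", "2": "two", "&": "and", "%": "percent"}


def parse_text(text):
    """Text parsing: split into words, single digits and single symbols."""
    return re.findall(r"[a-z]+|\d|[^\sa-z\d]", text.lower())


def normalize(tokens):
    """Expand digits and symbols into words; drop tokens with no spoken form."""
    words = []
    for token in tokens:
        if token in SPOKEN_FORMS:
            words.append(SPOKEN_FORMS[token])
        elif token.isalpha():
            words.append(token)
    return words


def to_phonemes(words):
    """Look each word up in the lexicon, or 'sound it out' letter by letter."""
    phonemes = []
    for word in words:
        # Assumption: one pseudo-phoneme per letter when the word is unknown.
        phonemes.extend(LEXICON.get(word, [letter.upper() for letter in word]))
    return phonemes


def to_diphones(phonemes):
    """Pair adjacent phonemes; '_' marks silence at the start and end."""
    padded = ["_"] + phonemes + ["_"]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


def synthesize_front_end(text):
    """Run parsing, normalization, phoneme lookup and diphone pairing."""
    return to_diphones(to_phonemes(normalize(parse_text(text))))


print(synthesize_front_end("Call 2"))
# ['_-K', 'K-AO', 'AO-L', 'L-T', 'T-UW', 'UW-_']
```

Each diphone name would then select a recorded audio unit to append to the output, which is the part a real system spends most of its effort on.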

Speech Synthesis Approaches
- Formant synthesis
  - Based on acoustic features of speech: a set of filters is used to model natural speech sounds
  - Intelligible, but not very pleasant (machine-like)
- Concatenative synthesis
  - Speech is constructed by joining short samples of recorded natural speech
  - Longer samples make more natural-sounding speech, but the samples are also harder to collect
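
As a rough illustration of the "set of filters" idea in formant synthesis, the sketch below passes an impulse train (a crude stand-in for the glottal source) through a cascade of two-pole resonators, one per formant. The formant frequencies and bandwidths are approximate, illustrative values for an "ah"-like vowel, and the function names and defaults are assumptions made for this example; the slides do not prescribe a particular filter design.

```python
import math


def resonator(signal, freq_hz, bandwidth_hz, sample_rate):
    """Two-pole digital resonator modelling a single formant."""
    c = -math.exp(-2.0 * math.pi * bandwidth_hz / sample_rate)
    b = (2.0 * math.exp(-math.pi * bandwidth_hz / sample_rate)
         * math.cos(2.0 * math.pi * freq_hz / sample_rate))
    a = 1.0 - b - c  # scales the filter to unity gain at 0 Hz
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize_vowel(formants, pitch_hz=120.0, duration_s=0.3, sample_rate=16000):
    """Filter an impulse train through one resonator per (freq, bandwidth) pair."""
    n = round(duration_s * sample_rate)
    period = int(sample_rate / pitch_hz)  # samples between glottal pulses
    signal = [1.0 if i % period == 0 else 0.0 for i in range(n)]
    for freq_hz, bandwidth_hz in formants:
        signal = resonator(signal, freq_hz, bandwidth_hz, sample_rate)
    peak = max(abs(s) for s in signal)
    return [s / peak for s in signal]  # normalise to the range [-1, 1]


# Approximate, illustrative formants (Hz) and bandwidths (Hz) for an "ah" sound.
VOWEL_AH = [(700.0, 130.0), (1220.0, 70.0), (2600.0, 160.0)]
samples = synthesize_vowel(VOWEL_AH, duration_s=0.25)
```

The resulting samples could be written to a WAV file with the standard `wave` module. The buzzy, steady quality of the output is a small-scale version of the "machine-like" sound the slide mentions.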

Speech Synthesis Approaches
- Acoustic modeling synthesis
  - Models the vocal tract and simulates how it works
  - Very complex to compute, even with simplified models

Prosody
- Prosody: volume, speed and pitch variations and pauses in speech
- If speech has no prosody, it sounds very monotonous and is hard to understand
- Synthesizers add basic prosody to speech
- Synthesizers don’t know where to put emphasis, and complex sentences can cause problems
- The user can add simple control tags to control speed, volume, pitch and pauses
- This control is, however, very rough and works at the word level
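
The slides do not name a particular tag set. One widely used example of such control tags is SSML, whose `prosody` element carries `rate`, `volume` and `pitch` attributes and whose `break` element inserts a pause. The sketch below assumes SSML and builds word-level markup; the helper names are hypothetical, and the output is a fragment that omits the `version` and namespace attributes a complete SSML document needs.

```python
from xml.sax.saxutils import escape, quoteattr


def tag_word(word, rate=None, volume=None, pitch=None, pause_ms=0):
    """Wrap one word in a prosody tag and optionally follow it with a pause."""
    attrs = {"rate": rate, "volume": volume, "pitch": pitch}
    attr_text = "".join(
        f" {name}={quoteattr(value)}"
        for name, value in attrs.items()
        if value is not None
    )
    out = escape(word)
    if attr_text:
        out = f"<prosody{attr_text}>{out}</prosody>"
    if pause_ms:
        out += f'<break time="{pause_ms}ms"/>'
    return out


def build_ssml(tagged_words):
    """tagged_words: list of (word, options) pairs; options may be empty."""
    body = " ".join(tag_word(word, **options) for word, options in tagged_words)
    return f"<speak>{body}</speak>"


print(build_ssml([
    ("Please", {}),
    ("stop", {"volume": "loud", "pause_ms": 300}),
    ("now", {"rate": "slow"}),
]))
# <speak>Please <prosody volume="loud">stop</prosody><break time="300ms"/> <prosody rate="slow">now</prosody></speak>
```

Because each tag wraps a whole word, the example also shows why this kind of control is coarse: there is no way here to shape pitch within a syllable.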

Voice or Speech Recognition
- Voice or speech recognition systems are developing rapidly
- There are two different types of voice recognition:
  - Continuous speech systems, which allow dictation
  - Speaker-independent systems, so that different people can enter commands or words at a given workstation

Speech Recognition
- Speech to Text
- Dictation
- Command and Control

How Recognition Works
1. The audio signal is processed
2. Signals that might be speech are located
3. Phonemes are found in the audio signals
4. Phonemes are mapped to a dictionary of words
   - Dictation or grammar-based
5. Natural language processing is applied
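
The step that maps phonemes to a dictionary of words can be sketched as a segmentation problem: find a way to split the phoneme sequence so that every piece is a known pronunciation. The dictionary entries and the function name below are hypothetical, and the sketch assumes the phonemes were recognized without errors, which real recognizers cannot rely on (they score many competing hypotheses instead).

```python
# Hypothetical pronunciation dictionary: phoneme tuple -> word.
PRONUNCIATIONS = {
    ("OW", "P", "AH", "N"): "open",
    ("K", "L", "OW", "Z"): "close",
    ("F", "AY", "L"): "file",
    ("W", "IH", "N", "D", "OW"): "window",
}


def phonemes_to_words(phonemes, pronunciations=PRONUNCIATIONS):
    """Split a phoneme sequence into dictionary words, or return None."""
    n = len(phonemes)
    # best[i] holds a word list that covers phonemes[:i], if one exists.
    best = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            if best[start] is None:
                continue
            word = pronunciations.get(tuple(phonemes[start:end]))
            if word is not None:
                best[end] = best[start] + [word]
                break
    return best[n]


print(phonemes_to_words(["OW", "P", "AH", "N", "F", "AY", "L"]))
# ['open', 'file']
```

A sequence that cannot be fully covered by dictionary entries yields `None`, which is where a grammar or language model would normally help choose among near matches.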

Voice I/O
Consider voice input in the following circumstances:
- Hands are busy
- Environment is dirty
- Small device, with no keyboard available
- Dictation by a slow typist
Structure voice command input so that:
- Only one or two words are required
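
One way to follow the "one or two words" guideline is to define the command vocabulary as an explicit table and reject anything longer. The commands, action names, and function below are hypothetical and only illustrate the structure; input is assumed to arrive as already-recognized text.

```python
# Hypothetical command vocabulary: every command is one or two words.
COMMANDS = {
    ("stop",): "STOP",
    ("next", "page"): "NEXT_PAGE",
    ("previous", "page"): "PREVIOUS_PAGE",
    ("zoom", "in"): "ZOOM_IN",
}

# Guard against accidentally adding a longer command to the table.
assert all(1 <= len(words) <= 2 for words in COMMANDS)


def parse_command(utterance):
    """Return the action for a one- or two-word command, else None."""
    words = tuple(utterance.lower().split())
    if not 1 <= len(words) <= 2:
        return None
    return COMMANDS.get(words)


print(parse_command("Next page"))                    # NEXT_PAGE
print(parse_command("please go to the next page"))   # None
```

Keeping the vocabulary this small limits what the recognizer has to distinguish, which tends to make command-and-control input more reliable than free dictation.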

Voice I/O (Why Speech?)
- Natural: speech is the most efficient, popular and widespread way to communicate
- Efficient: in many cases speech is the most efficient communication method [Chapanis, 1975]
- Expressive: some things are quite impossible to express without using speech (or natural language in general)
- Popular and preferred: some people use verbal-acoustic problem-solving methods instead of visual-spatial (GUI-oriented) methods [Bradford, 1995]

This note was uploaded on 02/06/2012 for the course FACULTY OF WXGE6320 taught by Professor Noraini during the Winter '09 term at University of Malaya.
