Lecture 1_winter_2012_robot_video_6tp

1 Digital Speech Processing, Lecture 1: Introduction to Digital Speech Processing

2 Speech Processing
• Speech is the most natural form of human-human communications.
• Speech is related to language; linguistics is a branch of social science.
• Speech is related to human physiological capability; physiology is a branch of medical science.
• Speech is also related to sound and acoustics, a branch of physical science.
• Therefore, speech is one of the most intriguing signals that humans work with every day.
• Purpose of speech processing:
– To understand speech as a means of communication;
– To represent speech for transmission and reproduction;
– To analyze speech for automatic recognition and extraction of information;
– To discover some physiological characteristics of the talker.

3 Why Digital Processing of Speech?
• digital processing of speech signals (DPSS) enjoys an extensive theoretical and experimental base developed over the past 75 years
• much research has been done since 1965 on the use of digital signal processing in speech communication problems
• highly advanced implementation technology (VLSI) exists that is well matched to the computational demands of DPSS
• there are abundant applications that are in widespread use commercially

4 The Speech Stack
• Fundamentals: acoustics, linguistics, pragmatics, speech perception
• Speech Representations: temporal, spectral, homomorphic, LPC
• Speech Algorithms: speech-silence (background), voiced-unvoiced decision, pitch detection, formant estimation
• Speech Applications: coding, synthesis, recognition, understanding, verification, language translation, speed-up/slow-down

5 Speech Applications
• We look first at the top of the speech processing stack, namely applications
– speech coding
– speech synthesis
– speech recognition and understanding
– other speech applications

6 Speech Coding
[Figure: block diagram of a speech coding system]
Encoding: speech x_c(t) (continuous time signal) -> A-to-D Converter -> x[n] (sampled signal) -> Analysis/Coding -> y[n] (transformed representation) -> Compression -> ŷ[n] (bit sequence, data) -> Channel or Medium
Decoding: Channel or Medium -> data ŷ[n] -> Decompression -> ỹ[n] -> Decoding/Synthesis -> x̃[n] -> D-to-A Converter -> x̃_c(t) (speech)
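The encoding and decoding chain in the speech coding diagram (A-to-D conversion, analysis/coding into a bit sequence, then the reverse steps at the receiver) can be illustrated with a small round trip. The sketch below is only an assumption-laden stand-in: it uses continuous mu-law companding with 8-bit quantization, which the slides do not name, and it is not the bit-exact format of any telephony standard. The function names and the 440 Hz test tone are hypothetical choices made for illustration.

```python
import math

MU = 255  # assumed companding constant (the usual value for 8-bit mu-law)


def sample_signal(f_hz, fs_hz, n):
    """Stand-in for the A-to-D converter: n samples of a sine wave in [-1, 1]."""
    return [math.sin(2 * math.pi * f_hz * k / fs_hz) for k in range(n)]


def mu_law_encode(x):
    """Compress one sample logarithmically, then quantize to a code in 0..255."""
    x = max(-1.0, min(1.0, x))
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int(round((y + 1) / 2 * 255))


def mu_law_decode(code):
    """Invert the quantizer mapping and the logarithmic compression."""
    y = 2 * code / 255 - 1
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def encode(samples):
    """Encoder side: sampled signal x[n] -> byte sequence for the channel."""
    return bytes(mu_law_encode(s) for s in samples)


def decode(data):
    """Decoder side: byte sequence -> reconstructed samples."""
    return [mu_law_decode(c) for c in data]


x = sample_signal(440.0, 8000.0, 80)   # 10 ms of a test tone at 8 kHz
bits = encode(x)                       # one byte per sample
x_hat = decode(bits)                   # approximation of x
max_err = max(abs(a - b) for a, b in zip(x, x_hat))
print(len(bits), "bytes, max reconstruction error:", max_err)
```

The reconstruction is close to the input but not identical, which is the basic trade-off of lossy speech coding: fewer bits on the channel in exchange for some distortion.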
7 Speech Coding
• Speech Coding is the process of transforming a speech signal into a representation for efficient transmission and storage of speech
– narrowband and broadband wired telephony
– cellular communications
– Voice over IP (VoIP) to utilize the Internet as a real-time communications medium
– secure voice for privacy and encryption for national security applications
– extremely narrowband communications channels, e.g., battlefield applications using HF radio
– storage of speech for telephone answering machines, IVR systems, prerecorded messages

8 Demo of Speech Coding
• Narrowband Speech Coding:
– 64 kbps PCM
– 32 kbps ADPCM
– 16 kbps LD-CELP
– 8 kbps CELP
– 4.8 kbps FS1016
– 2.4 kbps LPC10E
• Wideband Speech Coding (male talker / female talker):
– 3.2 kHz, uncoded
– 7 kHz, uncoded
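To put the narrowband bit rates listed above in perspective, the following sketch converts each rate into an average number of bits per sample and into storage per minute of speech. It assumes an 8 kHz sampling rate, which is the usual choice for narrowband telephone speech but is not stated on the slide. The low-rate coders work on frames rather than individual samples, so their "bits per sample" figure is only an average.

```python
SAMPLE_RATE_HZ = 8000  # assumption: typical narrowband sampling rate

# Bit rates in kbps, as listed on the slide.
RATES_KBPS = {
    "PCM": 64,
    "ADPCM": 32,
    "LD-CELP": 16,
    "CELP": 8,
    "FS1016": 4.8,
    "LPC10E": 2.4,
}


def bits_per_sample(rate_kbps, fs_hz=SAMPLE_RATE_HZ):
    """Average number of bits spent on each sample at the given bit rate."""
    return rate_kbps * 1000 / fs_hz


def kilobytes_per_minute(rate_kbps):
    """Storage for one minute of coded speech, in kilobytes (1 kB = 1000 bytes)."""
    return rate_kbps * 1000 * 60 / 8 / 1000


for name, rate in RATES_KBPS.items():
    print(f"{name:8s} {rate:5.1f} kbps  "
          f"{bits_per_sample(rate):5.2f} bits/sample  "
          f"{kilobytes_per_minute(rate):6.1f} kB/min")
```

Under the 8 kHz assumption, 64 kbps PCM corresponds to 8 bits per sample and 32 kbps ADPCM to 4 bits per sample, while the lowest-rate coders in the list average well under one bit per sample.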