lec11

Course: C 335, Fall 2009
School: UMBC
Speech in Multimedia

Hao Jiang
Computer Science Department, Boston College
Oct. 9, 2007

Outline
- Introduction
- Topics in speech processing
  - Speech coding
  - Speech recognition
  - Speech synthesis
  - Speaker verification/recognition
- Conclusion

Introduction
- Speech is our basic communication tool.
- We have been hoping to be able to communicate with machines using speech.
[Figure: C3PO and R2D2]

Speech Production Model
[Figures: anatomical structure; mechanical model]

Characteristics of Digital Speech
[Figures: speech waveform; spectrogram (frequency vs. time)]

Voiced and Unvoiced Speech
[Figure: waveform with silence, unvoiced, and voiced segments labeled]

Short-time Parameters
[Figures: short-time power; waveform envelope; zero crossing rate; pitch period]

Speech Coding
- Similar to images, we can also compress speech to make it smaller and easier to store and transmit.
- General compression methods such as DPCM can also be used.
- More compression can be achieved by taking advantage of the speech production model.
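As a rough illustration of the DPCM idea mentioned above, the following sketch encodes the quantized difference between each sample and the previously reconstructed sample. It is a minimal first-order example, not the coder used in the lecture: the function names, the uniform step size, and the synthetic 200 Hz test tone sampled at 8 kHz are all assumptions made for the example.

```python
import math


def dpcm_encode(samples, step):
    """Quantize the difference between each sample and the predicted value.

    The predictor is simply the previously reconstructed sample, so the
    encoder tracks exactly what the decoder will rebuild.
    """
    codes = []
    prediction = 0.0
    for x in samples:
        code = round((x - prediction) / step)  # uniform quantizer index
        codes.append(code)
        prediction += code * step  # reconstructed sample, as the decoder sees it
    return codes


def dpcm_decode(codes, step):
    """Rebuild the signal by accumulating the dequantized differences."""
    out = []
    prediction = 0.0
    for code in codes:
        prediction += code * step
        out.append(prediction)
    return out


# Hypothetical test signal: 80 samples of a 200 Hz tone sampled at 8 kHz.
signal = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(80)]
step = 0.05
codes = dpcm_encode(signal, step)
decoded = dpcm_decode(codes, step)
max_error = max(abs(a - b) for a, b in zip(signal, decoded))
print(max(abs(c) for c in codes), max_error)
```

Because neighboring samples are close in value, the difference codes stay small (a few quantizer steps) even though the signal itself spans many steps, which is where the bit savings come from. The reconstruction error stays within half a quantizer step.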
- There are two classes of speech coders:
  - Waveform coder
  - Vocoder

LPC Speech Coder
[Block diagram: speech is buffered in frames (frame n, frame n+1) and analyzed into vocal tract parameters, pitch, a voiced/unvoiced flag, and energy; these pass through a parameter quantizer and code generation to form the code stream]

LPC and Vocal Tract
- Mathematically, speech can be modeled with the following generation model:

    x(n) = \sum_{p=1}^{k} a_p x(n-p) + e(n)

- {a_1, a_2, ..., a_k} are called Linear Prediction Coefficients (LPC); they model the shape of the vocal tract.
- e(n) is the excitation that generates the speech.

Decoding and Speech Synthesis
[Block diagram: an impulse train generator driven by the pitch period feeds a glottal pulse generator; a voiced/unvoiced (U/V) switch selects between it and a random noise generator; the selected excitation passes through the gain, the vocal tract model, and the radiation model to produce speech]

An Example for Synthesizing Speech
- Glottal pulse
- Go through the vocal tract filter with gain control
- Go through the radiation filter
- Blending region

LPC10 (FS1015) 2.4 kbps
- LPC10 was the DOD speech coding standard for voice communication at 2.4 kbps.
- LPC10 works on speech sampled at 8 kHz, using a 22.5 ms frame and 10 LPC coefficients.
[Audio samples: original speech; LPC decoded speech]

Mixed Excitation LP
- For real speech, the excitation is usually not a pure pulse train or pure noise but a mixture of the two.
- The new 2.4 kbps standard (MELP) addresses this problem.
[Block diagram: pulses and noise each pass through a bandpass filter, are weighted by w and 1-w, and are summed; the result is scaled by the gain and passed through the vocal tract model and radiation model to produce speech]
[Audio samples: original speech; MELP decoded speech]

Hybrid Speech Codecs
- At higher bit rates, hybrid speech codecs have an advantage over vocoders.
[Block diagram: analysis by synthesis. Model parameter generation drives speech synthesis; the synthesized speech is compared "perceptually" with the input speech, and the selected parameters are emitted as the code]
- FS1016: CELP (Code Excited Linear Prediction)
- G.723.1: a dual bit rate codec (5.3 kbps and 6.3 kbps) for multimedia communication over the Internet.
  [Audio samples: sound at 5.3 kbps; sound at 6.3 kbps]
- G.729: CELP-based codec at 8 kbps.
  [Audio sample: sound at 8 kbps]

Speech Recognition
- Speech recognition is the foundation of human-computer interaction using speech.
- Speech recognition in different contexts:
  - Speaker-dependent or speaker-independent
  - Discrete words or continuous speech
  - Small vocabulary or large vocabulary
  - Quiet environment or noisy environment
[Block diagram: speech -> parameter analyzer -> comparison and decision algorithm -> words, with reference patterns and a language model feeding the comparison and decision algorithm]

How does Speech Recognition Work?
- Words: grey whales
- Phonemes: g r ey w ey l z
- Each phoneme has different characteristics (for example, the power distribution).

Speech Recognition
- Frame-by-frame phonemes: g g r ey ey ey ey w ey ey l l z
- How do we "match" the word when there are time and other variations?

Hidden Markov Model
[Figure: states S1, S2, S3, each emitting symbols from {a, b, c, ...}, with transition probabilities such as P12]

Dynamic Programming in Decoding
[Figure: trellis of states against time]
- We can find the path through the states that corresponds to the most probable phonemes for generating the observed sequence of "features" (one extracted from each speech frame).

HMM for a Unigram Language Model
[Figure: start state s0 branches with probabilities p1, p2, p3 to HMM1 (word1), HMM2 (word2), HMM3 (wordn)]

Speech Synthesis
- Speech synthesis generates (arbitrary) speech with desired properties (pitch, speed, loudness, articulation mode, etc.).
- Speech synthesis has been widely used for text-to-speech systems and various telephone services.
- The easiest and most often used speech synthesis method is waveform concatenation.
[Audio sample: increase the pitch without changing the speed]

Speaker Recognition
- Identifying or verifying the identity of a speaker is an application where computers exceed human beings.
- Vocal tract parameters can be used as features for speaker recognition.
[Figures: LPC covariance feature for speaker one and for speaker two]

Applications
[Diagram labels, in source order:]
- Speech recognition
- Call routing
- Document input
- Operator services
- Voice commands
- Directory assistance
- Speaker recognition
- Fraud control
- Voice over Internet
- Speech coding
- Wireless telephone
- Personalized service
- Document correction
- Speech interface
- Text-to-speech synthesis
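The dynamic-programming decoding described in the speech recognition slides can be sketched with the Viterbi algorithm. The sketch below is illustrative only: the three-state left-to-right model, its transition and emission probabilities, and the observation symbols a, b, c are hypothetical values chosen for the example. A real recognizer would use acoustic feature vectors from each frame rather than discrete symbols.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most probable state path, log-probability of that path)."""

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((r, best[t - 1][r] + log(trans_p[r][s])) for r in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + log(emit_p[s][observations[t]])
            back[t][s] = prev

    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical left-to-right HMM with three states and symbols a, b, c.
states = ["S1", "S2", "S3"]
start_p = {"S1": 1.0, "S2": 0.0, "S3": 0.0}
trans_p = {
    "S1": {"S1": 0.6, "S2": 0.4, "S3": 0.0},
    "S2": {"S1": 0.0, "S2": 0.6, "S3": 0.4},
    "S3": {"S1": 0.0, "S2": 0.0, "S3": 1.0},
}
emit_p = {
    "S1": {"a": 0.8, "b": 0.1, "c": 0.1},
    "S2": {"a": 0.1, "b": 0.8, "c": 0.1},
    "S3": {"a": 0.1, "b": 0.1, "c": 0.8},
}

observations = ["a", "a", "b", "b", "b", "c"]
path, log_prob = viterbi(observations, states, start_p, trans_p, emit_p)
print(path, log_prob)
```

With these made-up numbers the decoded path is S1, S1, S2, S2, S2, S3: each state absorbs a run of repeated observations, which is how an HMM copes with phonemes that last for a varying number of frames. Log-probabilities are used so that long sequences do not underflow.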