aug29_applications_share


INDV101 ‘Language’
August 29, 2007
Applications of Linguistics
INDV101 – Fall 2007

Animals in the News
• Next Wednesday, 05 September 2007
• 43rd anniversary
  – #1 spot on Billboard Singles Chart
• House of the Rising Sun

Room change
• Section 19
  – 8:00am meeting time
  – TA: Jaime Parchment
  – NEW ROOM!
  – Chavez 316

Logistics
• Homework 1
  – due Friday, Sept 7
  – Worksheet
  – D2L quiz, covers Winkler chs 1, 4 & D2L Course docs
  – Not turnitin.com
• Next week
  – No class Monday (holiday)
  – Begin Phonetics/Phonology on Weds.

Review
• What is a ‘language’? (first pass)
• Linguistic varieties
• Mutual intelligibility
• Language vs dialect
• Geographic dialect, sociolect
• Idiolect
• Speech style or register

By the end of this lecture
• You should have some idea of what these are
  – speech recognition
  – speech synthesis
  – machine translation

Points to Consider
• Our understanding of how humans perform language tasks is limited
• Our ability to make computers perform language tasks is limited in many of the same ways

Points to Consider
• Sometimes we try to make computers perform language tasks like humans do
• Sometimes we abandon those efforts and take an entirely different approach

Points to Consider
• One important non-commercial aspect of computational linguistics is testing and influencing linguistic theory
• Theory informs practice
• Practice tests theory

Applications of Linguistics
• Humorous Anecdote 1.0

Speech Recognition
• You talk, computer writes what you say
• Commercial software
• Training
  – Reading word lists

Why Proofread?
• Error rate
  – Best are around 93% accurate

Why Proofread?
[two further slides; no text recovered]

Automatic Speech Recognition
• “Speech to Text” processor
• Input – Speech Sounds
• Output – Text or Action

Automatic Speech Recognition
• PROBLEM:
  – Different people sound different
• Lack of Invariance Problem
  – There is no single invariant feature that allows us to recognize different voices

Lack of Invariance
• Question
  – How do people deal with it?
• Answer
  – We don’t have a complete understanding of how humans do it.

Frightfully Brief and Woefully Incomplete Introduction to Acoustic Analysis
• Sound = pressure waves
• Waveform
[Waveform figure: vertical axis from -0.04413 to 0.03687; horizontal axis Time (s), 0 to 1.488]

Comparing sounds
[three slides; figures not recovered]

Single Speaker Recognition
• Dr. Martinez is just one person
  – Why the confusion?
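As a rough illustration of the 93% figure on the “Why Proofread?” slide, the sketch below works out what that accuracy could mean for a dictated passage. It assumes the figure is a per-word accuracy and that errors are independent of one another; neither assumption is stated in the lecture, and the passage lengths are hypothetical.

```python
def expected_errors(num_words: int, accuracy: float) -> float:
    """Expected number of misrecognized words, treating accuracy as a per-word rate."""
    return num_words * (1.0 - accuracy)


def prob_error_free(num_words: int, accuracy: float) -> float:
    """Chance that every word in a passage is right, assuming independent errors."""
    return accuracy ** num_words


ACCURACY = 0.93  # "around 93% accurate", taken from the slide

if __name__ == "__main__":
    # Hypothetical passage lengths: a sentence, a paragraph, a short essay.
    for n in (10, 100, 500):
        print(n, round(expected_errors(n, ACCURACY), 1),
              round(prob_error_free(n, ACCURACY), 3))
```

Under these assumptions, even a ten-word sentence comes out fully correct less than half the time, which is one way to read the slide's question.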
Compare these Waveforms
[two slides; figures not recovered]

Dealing With New Voices
• Speaker normalization
  – The process of “adjusting” your perception to accommodate a new speaker

One Strategy for ASR
• Limit functionality to just one speaker
  – Train on just that speaker
  – Record word lists to learn what sound combinations look like for that speaker

Phonotactics
• Rules of allowable sound sequences in a given language
  – English does not allow /ps/ at the beginning of words
  – Greek does
  – English has flaps only between vowels
  – Spanish uses flaps more freely

Word Lists for Training

         /p/            /t/              /g/
  /i/    /pik/ peek     /tin/ teen       /gik/ geek
  /e/    /pet/ pet      /ten/ ten        /ges/ guess
  /l/    /pləm/ plum    /ætləs/ atlas    /gləv/ glove

Guessing Game
• Conditional probability
  – Given my best guesses for the first sound, how likely are my best guesses for the second sound?

Guessing Game
  Position 1:  m  n  b  ∅
  Position 2:  a  æ  o  ∅
  Position 3:  ɾ  d  j  ∅
  Position 4:  t  p  ∅
  Position 5:  i  I  ∅
  Position 6:  n  m  ∅
  Position 7:  e  I  æ  ∅
  Position 8:  s  f  θ  ∅
• No flaps before consonants in English – only between vowels

Guessing Game
[Same lattice, with one guess highlighted at each position: m  a  j  p  i  n  I  s]
  Martinez
  my penis

Dealing with Many Speakers
• Strategies for making the problem easier?
  – Limit the lexicon

Examples of Limited Lexicons
• Systems that expect numbers
  – Everything is interpreted as a number
  – Expectation of cooperation

Examples of Limited Lexicons
• Systems that expect city names
  – Interesting training case: shouting

Speech Recognition Relies on Context
• Acoustic context
  – adjacent sounds
• Lexical context
  – legitimate word?
• Pragmatic context
  – make sense?

Speech Synthesis
• Creating understandable speech
• Is this problem easier or harder than Speech Recognition?
  – I think it’s easier.
  – We understand bad speech better than machines understand clear speech.

Speech Synthesis Examples
• 1980s
• Current

Speech Synthesis
• Difficult aspects of speech synthesis parallel difficulties in speech recognition.
  – The natural speech signal is extremely complex
    • Hard to decipher
    • Hard to reproduce

Machine Translation
• Input: Text in one language
• Output: Text in another language
• Easier or harder?
  – Hard comparison to make
  – No sound to worry about, but meaning is critical

Machine Translation
• 1960 – “Ten years away”
• 1970 – “Ten years away”
• Current – Computer Aided Translation

Machine Translation
• Rule-based translation
• “Logico-Semantic” approach
• Corpus-based translation

Computational Modeling
• Doesn’t produce anything marketable
• Intended to test theories
  – Doesn’t directly say what a human brain can do
  – Does place limits on the problem, particularly questions of learnability

Summary
• Efforts to make computers “do language” better parallel our understanding of how humans “do language”.
• Sometimes it’s best to mimic a system (human brain) that works (rule-based and cognitive approaches).
• Sometimes it’s best to take advantage of the strengths of the machine (statistical and corpus-based approaches).
• Efforts to directly model human language use test our theories of how language works, particularly boundaries of learnability.
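The Guessing Game slides describe picking each sound in light of the one before it. The sketch below shows one way that idea could be set up: every candidate gets an acoustic score, every pair of neighbouring sounds gets a conditional probability, and the sequence with the highest combined score wins. All numbers are invented for illustration and only three positions are modelled; the lecture does not give scores or say how a real recognizer searches, so this uses plain exhaustive search rather than any particular production algorithm.

```python
from itertools import product

# Hypothetical acoustic scores: how well each candidate sound matches the
# signal at three consecutive positions (values invented for illustration).
CANDIDATES = [
    {"a": 0.6, "o": 0.4},
    {"ɾ": 0.55, "j": 0.45},
    {"t": 0.5, "p": 0.5},
]

# Hypothetical conditional probabilities P(next sound | previous sound).
# The zeros encode the slide's rule: no flap before a consonant in English.
TRANSITION = {
    ("a", "ɾ"): 0.5, ("a", "j"): 0.5,
    ("o", "ɾ"): 0.5, ("o", "j"): 0.5,
    ("ɾ", "t"): 0.0, ("ɾ", "p"): 0.0,
    ("j", "t"): 0.4, ("j", "p"): 0.6,
}


def score(seq, candidates, transition):
    """Multiply acoustic scores by the conditional probability of each step."""
    total = 1.0
    for i, sound in enumerate(seq):
        total *= candidates[i][sound]
        if i > 0:
            # Pairs missing from the table are treated as impossible.
            total *= transition.get((seq[i - 1], sound), 0.0)
    return total


def best_sequence(candidates, transition):
    """Try every path through the candidate lattice and keep the best one."""
    paths = product(*(position.keys() for position in candidates))
    return max(paths, key=lambda seq: score(seq, candidates, transition))


if __name__ == "__main__":
    print(best_sequence(CANDIDATES, TRANSITION))
```

With these made-up numbers the flap ɾ is the best acoustic match at the second position, yet it is rejected because the transition table rules it out before a consonant. That is the kind of interaction between phonotactics and guessing that the Martinez example illustrates.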