DRAFT
Speech and Language Processing: An introduction to natural language processing, computational linguistics, and speech recognition. Daniel Jurafsky & James H. Martin. Copyright © 2007, All rights reserved. Draft of September 19, 2007. Do not cite without permission.

7 PHONETICS

(Upon being asked by Director George Cukor to teach Rex Harrison, the star of the 1964 film “My Fair Lady”, how to behave like a phonetician:) “My immediate answer was, ‘I don’t have a singing butler and three maids who sing, but I will tell you what I can as an assistant professor.’”
Peter Ladefoged, quoted in his obituary, LA Times, 2004

The debate between the “whole language” and “phonics” methods of teaching reading to children seems at first glance like a purely modern educational debate. Like many modern debates, however, this one recapitulates an important historical dialectic, in this case in writing systems. The earliest independently-invented writing systems (Sumerian, Chinese, Mayan) were mainly logographic: one symbol represented a whole word. But from the earliest stages we can find, most such systems contain elements of syllabic or phonemic writing systems, in which symbols are used to represent the sounds that make up the words. Thus the Sumerian symbol pronounced ba and meaning “ration” could also function purely as the sound /ba/. Even modern Chinese, which remains primarily logographic, uses sound-based characters to spell out foreign words. Purely sound-based writing systems, whether syllabic (like Japanese hiragana or katakana), alphabetic (like the Roman alphabet used in this book), or consonantal (like Semitic writing systems), can generally be traced back to these early logo-syllabic systems, often as two cultures came together.
Thus the Arabic, Aramaic, Hebrew, Greek, and Roman systems all derive from a West Semitic script that is presumed to have been modified by Western Semitic mercenaries from a cursive form of Egyptian hieroglyphs. The Japanese syllabaries were modified from a cursive form of a set of Chinese characters which were used to represent sounds. These Chinese characters themselves were used in Chinese to phonetically represent the Sanskrit in the Buddhist scriptures that were brought to China in the Tang dynasty.

Whatever its origins, the idea implicit in a sound-based writing system, that the spoken word is composed of smaller units of speech, is the Ur-theory that underlies all our modern theories of phonology. This idea of decomposing speech and words into smaller units also underlies the modern algorithms for speech recognition (transcribing acoustic waveforms into strings of text words) and speech synthesis or text-to-speech (converting strings of text words into acoustic waveforms).

In this chapter we introduce phonetics from a computational perspective. Phonetics is the study of linguistic sounds, how they are produced by the articulators of the human...