Speech and Language Processing: An introduction to natural language processing, computational linguistics, and speech recognition. Daniel Jurafsky & James H. Martin. Copyright © 2007, All rights reserved. Draft of September 21, 2007. Do not cite without permission.

10 SPEECH RECOGNITION: ADVANCED TOPICS

True, their voice-print machine was unfortunately a crude one. It could discriminate among only a few frequencies, and it indicated amplitude by indecipherable blots. But it had never been intended for such vitally important work.
Aleksandr I. Solzhenitsyn, The First Circle, p. 505

The keju civil service examinations of Imperial China lasted almost 1300 years, from the year 606 until they were abolished in 1905. At their peak, millions of would-be officials from all over China competed for high-ranking government positions by participating in a uniform examination. For the final metropolitan part of this exam in the capital city, the candidates would be locked into an examination compound for a grueling 9 days and nights answering questions about history, poetry, the Confucian classics, and policy.

Naturally all these millions of candidates didn't all show up in the capital. Instead, the exam had progressive levels; candidates who passed a one-day local exam in their local prefecture could then sit for the biannual provincial exam, and only upon passing that exam in the provincial capital was a candidate eligible for the metropolitan and palace examinations.

This algorithm for selecting capable officials is an instance of multi-stage search. The final 9-day process requires far too many resources (in both space and time) to examine every candidate. Instead, the algorithm uses an easier, less intensive 1-day process to come up with a preliminary list of potential candidates, and applies the final test only to this list.

The keju algorithm can also be applied to speech recognition.
We'd like to be able to apply very expensive algorithms in the speech recognition process, such as 4-gram, 5-gram, or even parser-based language models, or context-dependent phone models that can see two or three phones into the future or past. But there are a huge number of potential transcription sentences for any given waveform, and it's too expensive (in time, space, or both) to apply these powerful algorithms to every single candidate. Instead, we'll introduce multipass decoding algorithms in which efficient but dumber decoding algorithms produce shortlists of potential candidates to be rescored by slow but smarter algorithms. We'll also introduce the context-dependent acoustic model, which is one of these smarter knowledge sources that turns out to be essential in large-vocabulary speech recognition. We'll also briefly introduce the important topics of discriminative training and the modeling of variation....
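
The multipass idea described above, in which a cheap first pass produces a shortlist and an expensive second pass rescores only that shortlist, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the chapter's own algorithm: the function name `multipass_decode`, the parameter `n_best`, the candidate sentences, and all of the scores are hypothetical, and a real recognizer would generate its candidates from the waveform rather than from a fixed list.

```python
from typing import Callable, Dict, List, Tuple

Hypothesis = Tuple[str, ...]  # a candidate transcription as a tuple of words


def multipass_decode(
    candidates: List[Hypothesis],
    cheap_score: Callable[[Hypothesis], float],
    expensive_score: Callable[[Hypothesis], float],
    n_best: int,
) -> Hypothesis:
    """Two-pass search: shortlist with a cheap scorer, rescore with an expensive one."""
    # First pass: rank every candidate with the cheap scorer, keep the top n_best.
    shortlist = sorted(candidates, key=cheap_score, reverse=True)[:n_best]
    # Second pass: the expensive scorer is applied to the shortlist only.
    return max(shortlist, key=expensive_score)


# Made-up log scores for illustration only (higher is better).
CHEAP: Dict[Hypothesis, float] = {
    ("recognize", "speech"): -4.0,
    ("wreck", "a", "nice", "beach"): -3.5,
    ("recognize", "beach"): -5.0,
    ("wreck", "on", "ice", "peach"): -9.0,
}
EXPENSIVE: Dict[Hypothesis, float] = {
    ("recognize", "speech"): -2.0,
    ("wreck", "a", "nice", "beach"): -6.0,
    ("recognize", "beach"): -7.0,
    ("wreck", "on", "ice", "peach"): -12.0,
}

expensive_calls: List[Hypothesis] = []  # records which candidates get rescored


def cheap_score(hyp: Hypothesis) -> float:
    return CHEAP[hyp]


def expensive_score(hyp: Hypothesis) -> float:
    expensive_calls.append(hyp)
    return EXPENSIVE[hyp]


best = multipass_decode(list(CHEAP), cheap_score, expensive_score, n_best=2)
print(best)                  # the hypothesis preferred after rescoring
print(len(expensive_calls))  # how many candidates the expensive scorer saw
```

In this toy setup the cheap scorer alone would prefer a different sentence than the expensive scorer does, and the expensive scorer is consulted for only `n_best` of the four candidates. The trade-off is the same as in the keju analogy: if the first pass prunes the candidate the second pass would have ranked highest, the second pass cannot recover it, so `n_best` has to be large enough for the cheap scorer's mistakes.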