DRAFT

Speech and Language Processing: An introduction to speech recognition, natural language processing, and computational linguistics. Daniel Jurafsky & James H. Martin. Copyright © 2006, All rights reserved. Draft of June 30, 2007. Do not cite without permission.

3 WORDS & TRANSDUCERS

How can there be any sin in sincere? Where is the good in goodbye?
Meredith Willson, The Music Man

Ch. 2 introduced the regular expression, showing for example how a single search string could help a web search engine find both woodchuck and woodchucks. Hunting for singular or plural woodchucks was easy; the plural just tacks an s on to the end. But suppose we were looking for other fascinating woodland creatures: say, a fox, a fish, that surly peccary, and perhaps a Canadian wild goose. Hunting for the plurals of these animals takes more than just tacking on an s. The plural of fox is foxes; of peccary, peccaries; and of goose, geese. To confuse matters further, fish don't usually change their form when they are plural.[1]

It takes two kinds of knowledge to correctly search for singulars and plurals of these forms. Orthographic rules tell us that English words ending in -y are pluralized by changing the -y to -i- and adding an -es. Morphological rules tell us that fish has a null plural, and that the plural of goose is formed by changing the vowel.

The problem of recognizing that a word (like foxes) breaks down into component morphemes (fox and -es) and building a structured representation of this fact is called morphological parsing.

Parsing means taking an input and producing some sort of linguistic structure for it. We will use the term parsing very broadly throughout this book to cover many kinds of structures that might be produced: morphological, syntactic, semantic, or discourse structures, in the form of a string, a tree, or a network.
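The two kinds of knowledge described above can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not the method the chapter goes on to develop: the three word lists, the function name parse_noun, and the (stem, suffix, number) output format are all hypothetical, and the lexicon covers only the animals mentioned in the text. Listed exceptions stand in for morphological rules, and suffix stripping stands in for orthographic rules.

```python
# Toy lexicon: every entry and name here is an illustrative assumption.
REGULAR_NOUNS = {"woodchuck", "fox", "peccary"}
NULL_PLURAL_NOUNS = {"fish"}             # plural looks like the singular
IRREGULAR_PLURALS = {"geese": "goose"}   # plural formed by a vowel change

# Stem endings that take -es rather than -s in regular English plurals.
SIBILANT_ENDINGS = ("s", "x", "z", "ch", "sh")


def parse_noun(surface):
    """Return a list of (stem, suffix, number) analyses for a surface form."""
    analyses = []

    # Morphological knowledge: irregular and null plurals are listed, not derived.
    if surface in IRREGULAR_PLURALS:
        analyses.append((IRREGULAR_PLURALS[surface], "", "PL"))
    if surface in NULL_PLURAL_NOUNS:
        analyses.append((surface, "", "SG"))
        analyses.append((surface, "", "PL"))
    if surface in REGULAR_NOUNS or surface in IRREGULAR_PLURALS.values():
        analyses.append((surface, "", "SG"))

    # Orthographic knowledge: undo the spelling changes that accompany -s / -es.
    if surface.endswith("ies"):
        stem = surface[:-3] + "y"        # peccaries -> peccary + -es
        if stem in REGULAR_NOUNS:
            analyses.append((stem, "-es", "PL"))
    if surface.endswith("es"):
        stem = surface[:-2]              # foxes -> fox + -es
        if stem in REGULAR_NOUNS and stem.endswith(SIBILANT_ENDINGS):
            analyses.append((stem, "-es", "PL"))
    if surface.endswith("s"):
        stem = surface[:-1]              # woodchucks -> woodchuck + -s
        if stem in REGULAR_NOUNS and not stem.endswith(SIBILANT_ENDINGS + ("y",)):
            analyses.append((stem, "-s", "PL"))

    return analyses


for word in ["woodchucks", "foxes", "peccaries", "geese", "fish"]:
    print(word, parse_noun(word))
```

Note that fish receives two analyses, one singular and one plural, which is exactly the ambiguity a null plural creates. A word outside the toy lexicon simply gets an empty list.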
Morphological parsing or stemming applies to many affixes other than plurals; for example, we might need to take any English verb form ending in -ing (going, talking, congratulating) and parse it into its verbal stem plus the -ing morpheme. So given the surface or input form going, we might want to produce the parsed form VERB-go + GERUND-ing.

Morphological parsing is important throughout speech and language processing. It plays a crucial role in part-of-speech tagging for morphologically complex languages like Russian or German, as we will see in Ch. 5. It is important for producing the large dictionaries that are necessary for robust spell-checking. We will need it in machine translation to realize, for example, that the French words va and aller should both translate to forms of the English verb go ....

[1] See, e.g., Seuss (1960).
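The -ing example above, where the surface form going is mapped to VERB-go + GERUND-ing, can be sketched in a few lines of Python. This is a minimal illustration rather than the chapter's own approach: the stem list VERB_STEMS and the function name parse_ing are hypothetical, and the only spelling change handled is a silent e dropped before -ing (as in congratulating).

```python
# Toy list of verb stems; an illustrative assumption covering only the
# three example verbs mentioned in the text.
VERB_STEMS = {"go", "talk", "congratulate"}


def parse_ing(surface):
    """Parse an -ing verb form into 'VERB-<stem> + GERUND-ing', or return None."""
    if not surface.endswith("ing"):
        return None
    base = surface[:-3]
    # Try the bare remainder first, then restore a silent e dropped before -ing.
    for stem in (base, base + "e"):
        if stem in VERB_STEMS:
            return f"VERB-{stem} + GERUND-ing"
    return None


for word in ["going", "talking", "congratulating", "king"]:
    print(word, "->", parse_ing(word))
```

Checking candidate stems against a lexicon is what keeps a word like king from being analyzed as a verb plus -ing; stripping the suffix alone would not be enough.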
This note was uploaded on 02/11/2012 for the course ECE 5527 taught by Professor Staff during the Fall '11 term at FIT.