# Jurafsky & Martin, *Speech and Language Processing* (3rd ed. draft): Chapter 13 excerpt (pp. 216–218)



$$\hat{T}(S) = \operatorname*{argmax}_{T \text{ s.t. } S = \text{yield}(T)} P(T,S) \quad (13.7)$$

Furthermore, since we showed above that $P(T,S) = P(T)$, the final equation for choosing the most likely parse neatly simplifies to choosing the parse with the highest probability:

$$\hat{T}(S) = \operatorname*{argmax}_{T \text{ s.t. } S = \text{yield}(T)} P(T) \quad (13.8)$$

## 13.1.2 PCFGs for Language Modeling

A second attribute of a PCFG is that it assigns a probability to the string of words constituting a sentence. This is important in **language modeling**, whether for use in speech recognition, machine translation, spelling correction, augmentative communication, or other applications. The probability of an unambiguous sentence is $P(T,S) = P(T)$, or just the probability of the single parse tree for that sentence. The probability of an ambiguous sentence is the sum of the probabilities of all the parse trees for the sentence:

$$P(S) = \sum_{T \text{ s.t. } S = \text{yield}(T)} P(T,S) \quad (13.9)$$

$$= \sum_{T \text{ s.t. } S = \text{yield}(T)} P(T) \quad (13.10)$$

An additional feature of PCFGs that is useful for language modeling is their ability to assign a probability to substrings of a sentence. For example, suppose we want to know the probability of the next word $w_i$ in a sentence given all the words we've seen so far, $w_1, \dots, w_{i-1}$. The general formula for this is

$$P(w_i \mid w_1, w_2, \dots, w_{i-1}) = \frac{P(w_1, w_2, \dots, w_{i-1}, w_i)}{P(w_1, w_2, \dots, w_{i-1})} \quad (13.11)$$

We saw in Chapter 4 a simple approximation of this probability using N-grams, conditioning on only the last word or two instead of the entire context; thus, the bigram approximation would give us

$$P(w_i \mid w_1, w_2, \dots, w_{i-1}) \approx \frac{P(w_{i-1}, w_i)}{P(w_{i-1})} \quad (13.12)$$
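The two quantities above, the probability of a single parse tree and the probability of a sentence as the sum over its parses, can be sketched in a few lines. The toy grammar, its rule probabilities, and the rule-list representation of a parse below are all invented for illustration; they are not from the text.

```python
from math import prod

# Hypothetical toy PCFG: each rule (LHS, RHS) maps to its probability.
# All probabilities here are invented for the example.
rule_prob = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("she",)): 0.4,
    ("NP", ("fish",)): 0.6,
    ("VP", ("V", "NP")): 0.7,
    ("V", ("eats",)): 1.0,
}

def tree_prob(rules):
    """P(T): the product of the probabilities of all rules used in the derivation."""
    return prod(rule_prob[r] for r in rules)

def sentence_prob(parses):
    """P(S), Eqs. 13.9-13.10: the sum of P(T) over all parses T that yield S."""
    return sum(tree_prob(t) for t in parses)

# The single parse of "she eats fish" in this toy grammar, as its list of rules:
parse = [("S", ("NP", "VP")), ("NP", ("she",)),
         ("VP", ("V", "NP")), ("V", ("eats",)), ("NP", ("fish",))]

p = tree_prob(parse)            # 1.0 * 0.4 * 0.7 * 1.0 * 0.6 = 0.168
total = sentence_prob([parse])  # unambiguous sentence, so P(S) = P(T)
```

For an ambiguous sentence, the list passed to `sentence_prob` would contain one rule list per parse tree, implementing the sum in Eq. 13.10.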


But the fact that the N-gram model can only make use of a couple words of context means it is ignoring potentially useful prediction cues. Consider predicting the word *after* in the following sentence from Chelba and Jelinek (2000):

(13.13) the contract ended with a loss of 7 cents after trading as low as 9 cents

A trigram grammar must predict *after* from the words *7 cents*, while it seems clear that the verb *ended* and the subject *contract* would be useful predictors that a PCFG-based parser could help us make use of. Indeed, it turns out that PCFGs allow us to condition on the entire previous context $w_1, w_2, \dots, w_{i-1}$ shown in Eq. 13.11.

In summary, this section and the previous one have shown that PCFGs can be applied both to disambiguation in syntactic parsing and to word prediction in language modeling. Both of these applications require that we be able to compute the probability of a parse tree $T$ for a given sentence $S$. The next few sections introduce some algorithms for computing this probability.

## 13.2 Probabilistic CKY Parsing of PCFGs

The parsing problem for PCFGs is to produce the most likely parse $\hat{T}$ for a given sentence $S$, that is,

$$\hat{T}(S) = \operatorname*{argmax}_{T \text{ s.t. } S = \text{yield}(T)} P(T) \quad (13.14)$$

The algorithms for computing the most likely parse are simple extensions of the standard algorithms for parsing; most modern probabilistic parsers are based on the **probabilistic CKY algorithm**, first described by Ney (1991).
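The probabilistic CKY algorithm named above can be sketched as a dynamic program over spans; the excerpt ends before the text's own presentation, so the following is a minimal sketch, not the book's version. The CNF grammar and probabilities are the same invented toy example as before, and the function returns only the Viterbi (best-parse) probability of the start symbol, i.e., the value achieved by the argmax in Eq. 13.14, without recovering the tree itself.

```python
from collections import defaultdict

# Hypothetical toy PCFG in Chomsky Normal Form (probabilities invented).
unary = {            # terminal rules A -> w: word -> list of (A, prob)
    "she": [("NP", 0.4)],
    "eats": [("V", 1.0)],
    "fish": [("NP", 0.6)],
}
binary = [           # binary rules A -> B C: (A, B, C, prob)
    ("S", "NP", "VP", 1.0),
    ("VP", "V", "NP", 0.7),
]

def pcky(words):
    """Viterbi probability of the best parse rooted in S, via probabilistic CKY."""
    n = len(words)
    # table[i][j] maps a nonterminal to its best probability over words[i:j]
    table = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):               # fill length-1 spans
        for nt, p in unary.get(w, []):
            table[i][i + 1][nt] = p
    for span in range(2, n + 1):                # fill longer spans bottom-up
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):           # split point
                for (a, b, c, p) in binary:
                    cand = p * table[i][k][b] * table[k][j][c]
                    if cand > table[i][j][a]:   # keep the max (Viterbi) score
                        table[i][j][a] = cand
    return table[0][n].get("S", 0.0)

best = pcky("she eats fish".split())  # 1.0 * 0.4 * 0.7 * 1.0 * 0.6 = 0.168
```

Replacing the max with a sum over split points and rules would instead compute the inside probability $P(S)$ of Eq. 13.10; a full parser would also store backpointers at each max to reconstruct $\hat{T}$.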