\hat{T}(S) = \operatorname*{argmax}_{T \,\text{s.t.}\, S = \text{yield}(T)} P(T, S)    (13.7)

Furthermore, since we showed above that P(T, S) = P(T), the final equation for choosing the most likely parse neatly simplifies to choosing the parse with the highest probability:

\hat{T}(S) = \operatorname*{argmax}_{T \,\text{s.t.}\, S = \text{yield}(T)} P(T)    (13.8)

13.1.2 PCFGs for Language Modeling

A second attribute of a PCFG is that it assigns a probability to the string of words constituting a sentence. This is important in language modeling, whether for use in speech recognition, machine translation, spelling correction, augmentative communication, or other applications. The probability of an unambiguous sentence is P(T, S) = P(T), or just the probability of the single parse tree for that sentence. The probability of an ambiguous sentence is the sum of the probabilities of all the parse trees for the sentence:

P(S) = \sum_{T \,\text{s.t.}\, S = \text{yield}(T)} P(T, S)    (13.9)
     = \sum_{T \,\text{s.t.}\, S = \text{yield}(T)} P(T)    (13.10)

An additional feature of PCFGs that is useful for language modeling is their ability to assign a probability to substrings of a sentence. For example, suppose we want to know the probability of the next word w_i in a sentence given all the words we've seen so far, w_1, ..., w_{i-1}. The general formula for this is

P(w_i \mid w_1, w_2, \ldots, w_{i-1}) = \frac{P(w_1, w_2, \ldots, w_{i-1}, w_i)}{P(w_1, w_2, \ldots, w_{i-1})}    (13.11)

We saw in Chapter 4 a simple approximation of this probability using N-grams, conditioning on only the last word or two instead of the entire context; thus, the bigram approximation would give us

P(w_i \mid w_1, w_2, \ldots, w_{i-1}) \approx \frac{P(w_{i-1}, w_i)}{P(w_{i-1})}    (13.12)
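Equations 13.8-13.10 can be computed directly once we can score a single tree: for a PCFG, P(T) is the product of the probabilities of the rules used to expand each node of T. The following is a minimal Python sketch of that computation, assuming trees are represented as nested (label, children...) tuples and the grammar as a rule_probs dictionary mapping (lhs, rhs) pairs to probabilities; these names and representations are illustrative, not from the text.

    # Sketch only: a tree is a (label, child, child, ...) tuple; a leaf is a
    # bare word string. rule_probs maps (lhs, rhs_tuple) -> probability.

    def tree_prob(tree, rule_probs):
        """P(T): the product of the probabilities of all rules used in T."""
        label, children = tree[0], tree[1:]
        rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
        p = rule_probs[(label, rhs)]
        for child in children:
            if not isinstance(child, str):   # recurse into nonterminal children
                p *= tree_prob(child, rule_probs)
        return p

    def best_parse(trees, rule_probs):
        """Eq. 13.8: among all parses of S, the tree with the highest P(T)."""
        return max(trees, key=lambda t: tree_prob(t, rule_probs))

    def sentence_prob(trees, rule_probs):
        """Eqs. 13.9-13.10: P(S) as the sum of P(T) over all parses of S."""
        return sum(tree_prob(t, rule_probs) for t in trees)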
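By contrast, the bigram approximation of Eq. 13.12 needs nothing more than relative-frequency counts from a corpus. A quick illustrative sketch (the function and variable names are my own, not from the text):

    from collections import Counter

    def train_bigram(sentences):
        """MLE bigram estimate: P(w_i | w_{i-1}) = C(w_{i-1} w_i) / C(w_{i-1})."""
        unigrams, bigrams = Counter(), Counter()
        for sent in sentences:
            for w1, w2 in zip(sent, sent[1:]):
                unigrams[w1] += 1
                bigrams[(w1, w2)] += 1
        def prob(prev, word):
            return bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0
        return prob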
But the fact that the N-gram model can only make use of a couple words of context means it is ignoring potentially useful prediction cues. Consider predicting the word "after" in the following sentence from Chelba and Jelinek (2000):

(13.13) the contract ended with a loss of 7 cents after trading as low as 9 cents

A trigram grammar must predict "after" from the words "7 cents", while it seems clear that the verb "ended" and the subject "contract" would be useful predictors that a PCFG-based parser could help us make use of. Indeed, it turns out that PCFGs allow us to condition on the entire previous context w_1, w_2, ..., w_{i-1} shown in Eq. 13.11.

In summary, this section and the previous one have shown that PCFGs can be applied both to disambiguation in syntactic parsing and to word prediction in language modeling. Both of these applications require that we be able to compute the probability of a parse tree T for a given sentence S. The next few sections introduce some algorithms for computing this probability.

13.2 Probabilistic CKY Parsing of PCFGs

The parsing problem for PCFGs is to produce the most likely parse \hat{T} for a given sentence S, that is,

\hat{T}(S) = \operatorname*{argmax}_{T \,\text{s.t.}\, S = \text{yield}(T)} P(T)    (13.14)

The algorithms for computing the most likely parse are simple extensions of the standard algorithms for parsing; most modern probabilistic parsers are based on the probabilistic CKY algorithm, first described by Ney (1991).
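The following sections present the algorithm in full; as a preview, here is a compact Python sketch of probabilistic CKY for a grammar in Chomsky normal form. The data structures are assumptions for illustration: lexical_rules maps each word to a list of (nonterminal, probability) pairs, and binary_rules maps each (B, C) child pair to a list of (A, probability) pairs for rules A -> B C.

    def prob_cky(words, lexical_rules, binary_rules):
        """Fill best[i][j]: for each nonterminal A, the max probability of
        any A-rooted subtree spanning words[i:j]; back holds backpointers
        from which the most likely parse can be reconstructed."""
        n = len(words)
        best = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
        back = [[{} for _ in range(n + 1)] for _ in range(n + 1)]

        # Base case: lexical rules A -> word fill the length-1 spans.
        for i, w in enumerate(words):
            for A, p in lexical_rules.get(w, []):
                best[i][i + 1][A] = p
                back[i][i + 1][A] = w

        # Recursive case: combine adjacent spans with binary rules A -> B C,
        # keeping only the highest-probability subtree per nonterminal.
        for span in range(2, n + 1):
            for i in range(n - span + 1):
                j = i + span
                for k in range(i + 1, j):
                    for B, pB in best[i][k].items():
                        for C, pC in best[k][j].items():
                            for A, pA in binary_rules.get((B, C), []):
                                p = pA * pB * pC
                                if p > best[i][j].get(A, 0.0):
                                    best[i][j][A] = p
                                    back[i][j][A] = (k, B, C)
        return best, back

Under these assumptions, the probability of the most likely parse in Eq. 13.14 is best[0][n][start_symbol], and following the backpointers in back recovers \hat{T} itself.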