jurafsky&martin_3rdEd_17 (1).pdf


, extended from Fig. 11.11.

TOP
  S(dumped,VBD)
    NP(workers,NNS)
      NNS(workers,NNS)  workers
    VP(dumped,VBD)
      VBD(dumped,VBD)  dumped
      NP(sacks,NNS)
        NNS(sacks,NNS)  sacks
      PP(into,P)
        P(into,P)  into
        NP(bin,NN)
          DT(a,DT)  a
          NN(bin,NN)  bin

Internal Rules                                               Lexical Rules
TOP → S(dumped,VBD)                                          NNS(workers,NNS) → workers
S(dumped,VBD) → NP(workers,NNS) VP(dumped,VBD)               VBD(dumped,VBD) → dumped
NP(workers,NNS) → NNS(workers,NNS)                           NNS(sacks,NNS) → sacks
VP(dumped,VBD) → VBD(dumped,VBD) NP(sacks,NNS) PP(into,P)    P(into,P) → into
PP(into,P) → P(into,P) NP(bin,NN)                            DT(a,DT) → a
NP(bin,NN) → DT(a,DT) NN(bin,NN)                             NN(bin,NN) → bin

Figure 13.10 A lexicalized tree, including head tags, for a WSJ sentence, adapted from Collins (1999). Below we show the PCFG rules that would be needed for this parse tree, internal rules on the left, and lexical rules on the right.

To generate such a lexicalized tree, each PCFG rule must be augmented to identify one right-hand constituent to be the head daughter. The headword for a node is then set to the headword of its head daughter, and the head tag to the part-of-speech tag of the headword. Recall that we gave in Fig. 11.12 a set of hand-written rules for identifying the heads of particular constituents.

A natural way to think of a lexicalized grammar is as a parent annotation, that is, as a simple context-free grammar with many copies of each rule, one copy for each possible headword/head tag for each constituent. Thinking of a probabilistic lexicalized CFG in this way would lead to the set of simple PCFG rules shown below the tree in Fig. 13.10.

Note that Fig. 13.10 shows two kinds of rules: lexical rules, which express the expansion of a pre-terminal to a word, and internal rules, which express the other rule expansions. We need to distinguish these kinds of rules in a lexicalized grammar because they are associated with very different kinds of probabilities.
The lexical rules are deterministic, that is, they have probability 1.0, since a lexicalized pre-terminal like NN(bin,NN) can only expand to the word bin. But for the internal rules, we need to estimate probabilities.

13.6 Probabilistic Lexicalized CFGs

Suppose we were to treat a probabilistic lexicalized CFG like a really big CFG that just happened to have lots of very complex non-terminals, and estimate the probability of each rule by maximum likelihood. Then, according to Eq. 13.18, the maximum likelihood estimate for the probability of the rule

$$P(\text{VP(dumped,VBD)} \to \text{VBD(dumped,VBD) NP(sacks,NNS) PP(into,P)})$$

would be

$$\frac{\mathrm{Count}(\text{VP(dumped,VBD)} \to \text{VBD(dumped,VBD) NP(sacks,NNS) PP(into,P)})}{\mathrm{Count}(\text{VP(dumped,VBD)})} \tag{13.23}$$

But there's no way we can get good estimates of counts like those in (13.23) because they are so specific: we're unlikely to see many (or even any) instances of a sentence with a verb phrase headed by dumped that has one NP argument headed by sacks and a PP argument headed by into. In other words, counts of fully lexicalized PCFG rules like this will be far too sparse, and most rule probabilities will come out 0.
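
The sparsity problem is easy to see in a small Python sketch of the estimate in (13.23). The rule occurrences below are invented purely for illustration (they are not real treebank counts), and the names `observed` and `mle` are hypothetical.

```python
from collections import Counter

# Toy list of (left-hand side, right-hand side) rule occurrences.
# These observations are made up for illustration, not taken from a treebank.
observed = [
    ("VP(dumped,VBD)", ("VBD(dumped,VBD)", "NP(sacks,NNS)", "PP(into,P)")),
    ("VP(dumped,VBD)", ("VBD(dumped,VBD)", "NP(trash,NN)")),
]

rule_counts = Counter(observed)
lhs_counts = Counter(lhs for lhs, _ in observed)


def mle(lhs, rhs):
    """Count(lhs -> rhs) / Count(lhs), as in (13.23)."""
    if lhs_counts[lhs] == 0:
        return 0.0  # assumption: report 0 when the left-hand side was never seen
    return rule_counts[(lhs, tuple(rhs))] / lhs_counts[lhs]


seen = mle("VP(dumped,VBD)",
           ["VBD(dumped,VBD)", "NP(sacks,NNS)", "PP(into,P)"])
unseen = mle("VP(dumped,VBD)",
             ["VBD(dumped,VBD)", "NP(boxes,NNS)", "PP(into,P)"])
print(seen, unseen)
```

Changing a single headword (sacks to boxes) produces a rule that was never observed, so its estimate drops to 0 even though the new verb phrase is just as plausible as the one that was seen.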