jurafsky&martin_3rdEd_17 (1).pdf


More formally, given a sentence $S$ and derivation $D$ that contains supertag sequence $T$, we have:

$$
\begin{aligned}
P(D, S) &= P(T, S) && \text{(13.36)} \\
        &= \prod_{i=1}^{n} P(t_i \mid s_i) && \text{(13.37)}
\end{aligned}
$$

To better fit with the traditional A* approach, we'd prefer to have states scored by a cost function where lower is better (i.e., we're trying to minimize the cost of a derivation). To achieve this, we'll use negative log probabilities to score derivations; this results in the following equation, which we'll use to score completed CCG derivations.

$$
\begin{aligned}
-\log P(D, S) &= -\log P(T, S) && \text{(13.38)} \\
              &= \sum_{i=1}^{n} -\log P(t_i \mid s_i) && \text{(13.39)}
\end{aligned}
$$

Given this model, we can define our $f$-cost as follows. The $f$-cost of an edge is the sum of two components: $g(n)$, the cost of the span represented by the edge, and
$h(n)$, the estimate of the cost to complete a derivation containing that edge (these are often referred to as the inside and outside costs). We'll define $g(n)$ for an edge using Equation 13.39. That is, it is just the sum of the costs of the supertags that comprise the span.

For $h(n)$, we need a score that approximates but never overestimates the actual cost of the final derivation. A simple heuristic that meets this requirement assumes that each of the words in the outside span will be assigned its most probable supertag. If these are the tags used in the final derivation, then its score will equal the heuristic. If any other tags are used in the final derivation, the $f$-cost will be higher since the new tags must have higher costs, thus guaranteeing that we will not overestimate.

Putting this all together, we arrive at the following definition of a suitable $f$-cost for an edge. Because the most probable supertag for a word is the one with the smallest negative log probability, each outside term takes a minimum over that word's candidate tags.

$$
\begin{aligned}
f(w_{i,j}, t_{i,j}) &= g(w_{i,j}) + h(w_{i,j}) && \text{(13.40)} \\
&= \sum_{k=i}^{j} -\log P(t_k \mid w_k)
 + \sum_{k=1}^{i-1} \min_{t \in \mathit{tags}} \bigl(-\log P(t \mid w_k)\bigr)
 + \sum_{k=j+1}^{N} \min_{t \in \mathit{tags}} \bigl(-\log P(t \mid w_k)\bigr)
\end{aligned}
$$

As an example, consider an edge representing the word serves with the supertag N in the following example.

(13.41) United serves Denver.

The $g$-cost for this edge is just the negative log probability of the tag, or X. The outside $h$-cost consists of the most optimistic supertag assignments for United and Denver. The resulting $f$-cost for this edge is therefore x+y+z = 1.494.

An Example

Fig. 13.12 shows the initial agenda and the progress of a complete parse for this example. After initializing the agenda and the parse table with information from the supertagger, the parser selects the best edge from the agenda: the entry for United with the tag N/N and $f$-cost 0.591. This edge does not constitute a complete parse and is therefore used to generate new states by applying all the relevant grammar rules.
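The f-cost of an edge, as defined above, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `f_cost`, the data layout, and the tag probabilities are hypothetical (they are not the values behind the figures quoted in the text), and natural logarithms are assumed because the text does not state the base.

```python
import math

# Hypothetical supertagger output: P(tag | word) for each word.
# These numbers are invented for illustration only.
TAG_PROBS = [
    {"N/N": 0.5, "NP": 0.4, "S/S": 0.1},        # United
    {"(S\\NP)/NP": 0.8, "N": 0.1, "N/N": 0.1},  # serves
    {"NP": 0.9, "N": 0.05, "N/N": 0.05},        # Denver
]


def cost(p):
    """Negative log probability, so that lower is better."""
    return -math.log(p)


def f_cost(tag_probs, i, span_tags):
    """f-cost of an edge covering words i .. i+len(span_tags)-1 (0-indexed).

    g: sum of the costs of the supertags chosen inside the span.
    h: for every word outside the span, the cost of its most probable
       supertag, which is the cheapest tag that word could receive.
    """
    j = i + len(span_tags) - 1
    g = sum(cost(tag_probs[k][t]) for k, t in zip(range(i, j + 1), span_tags))
    outside = [k for k in range(len(tag_probs)) if k < i or k > j]
    h = sum(cost(max(tag_probs[k].values())) for k in outside)
    return g + h


# Edge for "serves" with supertag N, and one full tag sequence containing it.
edge = f_cost(TAG_PROBS, 1, ["N"])
full = f_cost(TAG_PROBS, 0, ["NP", "N", "NP"])
print(round(edge, 3), round(full, 3))
```

For a full-sentence span the outside set is empty, so the f-cost reduces to the summed tag costs. The edge's f-cost never exceeds the cost of any complete tag sequence that contains it, which is the property the heuristic is meant to guarantee.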
In this case, applying forward application to United: N/N and serves: N results in the addition of the edge United serves: N[0,2], 1.795 to the agenda. Skipping ahead, at the third iteration an edge representing the complete derivation United serves Denver, S[0,3], 0.716 is added to the agenda. However, the algorithm does not terminate at this point since the cost of this edge (0.716) does not place it at the top of the agenda. Instead, the edge representing Denver with the category NP is popped. This leads to the addition of another edge to the agenda
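The agenda behaviour in this walk-through, where a complete edge can be added to the agenda without ending the parse until it is actually popped, can be sketched as follows. This is a minimal illustration, not the algorithm of Fig. 13.12: it assumes only forward and backward application, represents categories as nested tuples, and uses invented tag probabilities; the names `combine` and `astar_parse` are hypothetical.

```python
import heapq
import itertools
import math

# Categories: atomic strings, or (slash, result, argument) tuples.
VP = ("\\", "S", "NP")   # S\NP
TV = ("/", VP, "NP")     # (S\NP)/NP
NN = ("/", "N", "N")     # N/N

# Hypothetical lexicon: category -> cost (-log P) for each word.
LEXICON = [
    {NN: -math.log(0.5), "NP": -math.log(0.4)},  # United
    {TV: -math.log(0.8), "N": -math.log(0.1)},   # serves
    {"NP": -math.log(0.9), "N": -math.log(0.1)}, # Denver
]


def combine(left, right):
    """Forward (X/Y Y => X) and backward (Y X\\Y => X) application."""
    if isinstance(left, tuple) and left[0] == "/" and left[2] == right:
        return left[1]
    if isinstance(right, tuple) and right[0] == "\\" and right[2] == left:
        return right[1]
    return None


def astar_parse(lexicon, goal="S"):
    n = len(lexicon)
    best = [min(costs.values()) for costs in lexicon]  # cheapest tag per word

    def h(i, j):  # outside estimate for the span [i, j)
        return sum(best[:i]) + sum(best[j:])

    agenda, chart, tie = [], {}, itertools.count()
    for k, costs in enumerate(lexicon):
        for cat, g in costs.items():
            heapq.heappush(agenda, (g + h(k, k + 1), next(tie), g, k, k + 1, cat))

    while agenda:
        _, _, g, i, j, cat = heapq.heappop(agenda)  # lowest f-cost first
        if (i, j, cat) in chart:
            continue
        chart[(i, j, cat)] = g
        # Terminate only when a complete edge is popped, not when it is added.
        if i == 0 and j == n and cat == goal:
            return g
        for (a, b, other), g2 in list(chart.items()):
            if b == i:
                new, span = combine(other, cat), (a, j)
            elif a == j:
                new, span = combine(cat, other), (i, b)
            else:
                continue
            if new is not None and (span[0], span[1], new) not in chart:
                total = g + g2
                heapq.heappush(agenda, (total + h(*span), next(tie), total, *span, new))
    return None


print(astar_parse(LEXICON))
```

With this toy lexicon the only complete derivation tags United as NP, serves as (S\NP)/NP, and Denver as NP, so the returned cost is the sum of those three tag costs. The counter in each agenda entry only breaks ties so that the heap never has to compare category objects.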