
T̂ = argmax_T P(T | W) ≈ argmax_T ∏_i P(t_i | w_{i−l}…w_{i+l}, t_{i−k}…t_{i−1})   (13.35)

Word and tag-based features with k and l both set to 2 provide reasonable results given sufficient training data. Additional features such as POS tags and short character suffixes are also commonly used to improve performance.

Unfortunately, even with additional features, the large number of possible supertags combined with high per-word ambiguity leads to error rates that are too high for practical use in a parser. More specifically, the single best tag sequence T̂ will typically contain too many incorrect tags for effective parsing to take place. To overcome this, we can instead return a probability distribution over the possible supertags for each word in the input. The following table illustrates an example distribution for a simple sentence. Each column gives the probability of each supertag for a given word in the context of the input sentence. The "..." represent all the remaining possible supertags for each word.

| United | serves | Denver |
| --- | --- | --- |
| N/N: 0.4 | (S\NP)/NP: 0.8 | NP: 0.9 |
| NP: 0.3 | N: 0.1 | N/N: 0.05 |
| S/S: 0.1 | ... | S\S: 0.05 |
| ... |  | ... |

In a MEMM framework, the probability of the optimal tag sequence defined in Eq. 13.35 is efficiently computed with a suitably modified version of the Viterbi


234 CHAPTER 13 • STATISTICAL PARSING

algorithm. However, since Viterbi only finds the single best tag sequence, it doesn't provide exactly what we need here; we need to know the probability of each possible word/tag pair. The probability of any given tag for a word is the sum of the probabilities of all the supertag sequences that contain that tag at that location. Fortunately, we've seen this problem before: a table representing these values can be computed efficiently using a version of the forward-backward algorithm presented in Chapter 9.

The same result can also be achieved through deep learning approaches based on recurrent neural networks (RNNs). Recent efforts have demonstrated considerable success with RNNs as alternatives to HMM-based methods. These approaches differ from traditional classifier-based methods in the following ways:

• The use of vector-based word representations (embeddings) rather than word-based feature functions.
• Input representations that span the entire sentence, as opposed to size-limited sliding windows.
• Avoiding the use of high-level features, such as part-of-speech tags, since errors in tag assignment can propagate to errors in supertags.

As with the forward-backward algorithm, RNN-based methods can provide a probability distribution over the lexical categories for each word in the input.

### 13.7.4 CCG Parsing using the A* Algorithm

The A* algorithm is a heuristic search method that employs an agenda to find an optimal solution. Search states representing partial solutions are added to an agenda based on a cost function, with the least-cost option being selected for further exploration at each iteration. When a state representing a complete solution is first selected from the agenda, it is guaranteed to be optimal and the search terminates.
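The forward-backward computation of per-word tag probabilities mentioned above can be sketched in a few lines. This is a minimal illustration on a plain HMM with made-up transition and emission matrices, not the MEMM-based supertagger the chapter describes; the function name and the tiny tagset are hypothetical.

```python
import numpy as np

def tag_marginals(init, trans, emit, obs):
    """Posterior P(tag = k at position t | observations), computed
    with the forward-backward algorithm on a simple HMM.

    init[i]     P(tag i at t = 0)
    trans[i, j] P(tag j | previous tag i)
    emit[i, o]  P(observation o | tag i)
    obs         list of observation indices
    """
    T, K = len(obs), len(init)
    alpha = np.zeros((T, K))   # forward probabilities
    beta = np.zeros((T, K))    # backward probabilities

    alpha[0] = init * emit[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ trans) * emit[:, obs[t]]

    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = trans @ (emit[:, obs[t + 1]] * beta[t + 1])

    gamma = alpha * beta       # unnormalized tag posteriors
    return gamma / gamma.sum(axis=1, keepdims=True)
```

Each row of the returned table sums to one and plays the role of one column of the supertag table above: a probability distribution over the tags for that word.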
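The agenda-driven control loop just described can be written down generically. The sketch below is a domain-independent A* skeleton, not the CCG parser itself: in a parser the states would be partial parses and the heuristic would be derived from the supertag probabilities. All names here are illustrative.

```python
import heapq
from itertools import count

def a_star(start, is_goal, successors, heuristic):
    """Generic agenda-based A* search.

    successors(state) yields (next_state, step_cost) pairs, and
    heuristic(state) must never overestimate the remaining cost;
    under that condition the first complete solution popped from
    the agenda is guaranteed to be optimal.
    """
    tie = count()                        # breaks cost ties deterministically
    agenda = [(heuristic(start), 0.0, next(tie), start, [start])]
    best_cost = {}                       # cheapest known cost to each state
    while agenda:
        f, g, _, state, path = heapq.heappop(agenda)
        if is_goal(state):
            return path, g               # first complete solution: optimal
        if state in best_cost and best_cost[state] <= g:
            continue                     # already expanded more cheaply
        best_cost[state] = g
        for nxt, step in successors(state):
            heapq.heappush(
                agenda,
                (g + step + heuristic(nxt), g + step, next(tie), nxt, path + [nxt]),
            )
    return None, float("inf")            # agenda exhausted: no solution
```

For example, on a three-node graph with a zero heuristic (which is trivially admissible), the cheaper two-step route beats the direct edge:

```python
graph = {"a": [("b", 1.0), ("c", 4.0)], "b": [("c", 1.0)], "c": []}
path, cost = a_star("a", lambda s: s == "c", lambda s: graph[s], lambda s: 0.0)
# path == ["a", "b", "c"], cost == 2.0
```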