lect13-probparsing.key

Probabilistic Parsing in Practice
Lecture #15
Computational Linguistics
CMPSCI 591N, Spring 2006
Andrew McCallum
(including slides from Michael Collins, Chris Manning, Jason Eisner, Mary Harper)

Andrew McCallum, UMass

Today's Main Points
Training data
How to evaluate parsers
Limitations of PCFGs, enhancements & alternatives
- Lexicalized PCFGs
- Structure sensitivity
- Left-corner parsing
- Faster parsing with beam search
- Dependency parsers
Current state of the art

Treebanks
Pure grammar induction approaches tend not to produce the parse trees that people want.
Solution
- Give some examples of parse trees that we want
- Make a learning tool learn a grammar
Treebank
- A collection of such example parses
- The Penn Treebank is the most widely used

Treebanks
Penn Treebank
- Trees are represented via bracketing
- Fairly flat structures for noun phrases: (NP Arizona real estate loans)
- Tagged with grammatical and semantic functions (-SBJ, -LOC, ...)
- Uses empty nodes (*) to indicate understood subjects and extraction gaps

( ( S ( NP-SBJ The move)
      ( VP followed
           ( NP ( NP a round )
                ( PP of
                     (NP ( NP similar increases )
                         ( PP by ( NP other lenders ) )
                         ( PP against ( NP Arizona real estate loans )))))
           ,
           ( S-ADV ( NP-SBJ * )
                   ( VP reflecting
                        ( NP a continuing decline )
                        ( PP-LOC in (NP that market ))))))
    . )

Treebanks
Many people have argued that it is better to have linguists construct treebanks than grammars, because it is easier
- to work out the correct parse of individual sentences
than
- to try to determine all possible manifestations of a certain rule or grammatical construct

Treebanking Issues
Type of data
- Task dependent (newspaper, journals, novels, technical manuals, dialogs, email)
Size
- The more the better! (Resource-limited)
Parse representation
- Dependency vs. parse tree
- Attributes: what to encode? Words, morphology, syntax, semantics...
- Reference & bookkeeping: date, time, who did what

Organizational Issues
Team
- 1 team leader; bookkeeping/hiring
- 1 guideline person
- 1 linguistic issues person
- 3-5 annotators
- 1-2 technical staff/programming
- 2 checking persons
Double annotation if possible.

Treebanking Plan
The main points (after getting funding)
- Planning
- Basic guidelines development
- Annotation & guidelines refinement
- Consistency checking, guidelines finalization
- Packaging and distribution
Time needed
- on the order of 2 years per 1 million words
- only about 1/3 of the total effort is annotation

Parser Evaluation
Evaluation
Ultimate goal is to build a system for IE, QA, MT
People are rarely interested in syntactic analysis for its own...
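
To make the Treebanks slides concrete, here is a minimal sketch (not taken from the lecture) of how a bracketed tree can be read into a nested structure and how a grammar can then be read off it by relative-frequency counting. The function names (tokenize, parse, leaves, count_rules, rule_probs) are hypothetical, and the input is a fragment of the example tree above. Note one assumption: the slide's bracketing carries no part-of-speech tags, so words appear directly on the right-hand side of rules here, whereas in the distributed treebank files words normally sit under part-of-speech tags.

```python
from collections import Counter


def tokenize(text):
    # Pad parentheses with spaces so split() separates them from labels and words.
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens, pos=0):
    """Parse one bracketed constituent that starts at tokens[pos] == '('.

    Returns (node, next_pos). A node is (label, children); each child is
    either another node or a plain word string.
    """
    assert tokens[pos] == "("
    pos += 1
    label = ""  # the outermost wrapper "( (S ...) )" has no label
    if tokens[pos] not in ("(", ")"):
        label = tokens[pos]
        pos += 1
    children = []
    while tokens[pos] != ")":
        if tokens[pos] == "(":
            child, pos = parse(tokens, pos)
            children.append(child)
        else:
            children.append(tokens[pos])
            pos += 1
    return (label, children), pos + 1  # skip the closing ')'


def leaves(node):
    # Collect the words of the sentence in left-to-right order.
    _, children = node
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


def count_rules(node, counts):
    # Count one rule per constituent: label -> sequence of child labels/words.
    label, children = node
    rhs = tuple(c[0] if isinstance(c, tuple) else c for c in children)
    counts[(label, rhs)] += 1
    for child in children:
        if isinstance(child, tuple):
            count_rules(child, counts)
    return counts


def rule_probs(counts):
    # Relative-frequency estimate: count(A -> rhs) / count(A).
    lhs_totals = Counter()
    for (lhs, _), n in counts.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]] for rule, n in counts.items()}


# Fragment of the example tree from the Penn Treebank slide.
text = "( NP ( NP similar increases ) ( PP by ( NP other lenders ) ) )"
tree, _ = parse(tokenize(text))
probs = rule_probs(count_rules(tree, Counter()))

print(leaves(tree))
for (lhs, rhs), p in sorted(probs.items()):
    print(f"{lhs} -> {' '.join(rhs)}  {p:.3f}")
```

With a single tiny fragment the estimates are of course meaningless; the point is only that the same counting, run over a whole treebank, is one simple way a learning tool can obtain a grammar from example parses.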