D09-1036 - Proceedings of the 2009 Conference on Empirical...


Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 343–351, Singapore, 6-7 August 2009. © 2009 ACL and AFNLP

Recognizing Implicit Discourse Relations in the Penn Discourse Treebank

Ziheng Lin, Min-Yen Kan and Hwee Tou Ng
Department of Computer Science
National University of Singapore
13 Computing Drive, Singapore 117417
{linzihen,kanmy,nght}@comp.nus.edu.sg

Abstract

We present an implicit discourse relation classifier in the Penn Discourse Treebank (PDTB). Our classifier considers the context of the two arguments, word pair information, as well as the arguments' internal constituent and dependency parses. Our results on the PDTB yield a significant 14.1% improvement over the baseline. In our error analysis, we discuss four challenges in recognizing implicit relations in the PDTB.

1 Introduction

In the field of discourse modeling, it is widely agreed that text is not understood in isolation, but in relation to its context. One focus in the study of discourse is to identify and label the relations between textual units (clauses, sentences, or paragraphs). Such research can enable downstream natural language processing (NLP) such as summarization, question answering, and textual entailment. For example, recognizing causal relations can assist in answering why questions. Detecting contrast and restatements is useful for paraphrasing and summarization systems. While different discourse frameworks have been proposed from different perspectives (Mann and Thompson, 1988; Hobbs, 1990; Lascarides and Asher, 1993; Knott and Sanders, 1998; Webber, 2004), most admit these basic types of discourse relationships between textual units.

When there is a discourse connective (e.g., because) between two text spans, it is often easy to recognize the relation between the spans, as most connectives are unambiguous (Miltsakaki et al., 2005; Pitler et al., 2008). On the other hand, it is difficult to recognize the discourse relations when there are no explicit textual cues. We term these cases explicit and implicit relations, respectively. While the recognition of discourse structure has been studied in the context of explicit relations (Marcu, 1998) in the past, little published work has yet attempted to recognize implicit discourse relations between text spans.

Detecting implicit relations is a critical step in forming a discourse understanding of text, as many text spans do not mark their discourse relations with explicit cues. Recently, the Penn Discourse Treebank (PDTB) has been released, which features discourse level annotation on both explicit and implicit relations. It provides a valuable linguistic resource towards understanding discourse relations and a common platform for researchers to develop discourse-centric systems. With the recent release of the second version of this corpus (Prasad et al., 2008), which provides a cleaner and more thorough implicit relation annotation, there is an opportunity to address this area of work.

This note was uploaded on 03/06/2012 for the course CIS 630 taught by Professor Cis630 during the Spring '08 term at UPenn.
