N07-1054 - Proceedings of NAACL HLT 2007, pages 428–435,...


Proceedings of NAACL HLT 2007, pages 428–435, Rochester, NY, April 2007. © 2007 Association for Computational Linguistics

Building and Refining Rhetorical-Semantic Relation Models

Sasha Blair-Goldensohn
Google, Inc.
76 Ninth Avenue
New York, NY
sasha@google.com

Kathleen R. McKeown and Owen C. Rambow
Department of Computer Science
Center for Computational Learning Systems
Columbia University
{kathy,rambow}@cs.columbia.edu

Abstract

We report results of experiments which build and refine models of rhetorical-semantic relations such as Cause and Contrast. We adopt the approach of Marcu and Echihabi (2002), using a small set of patterns to build relation models, and extend their work by refining the training and classification process using parameter optimization, topic segmentation and syntactic parsing. Using human-annotated and automatically-extracted test sets, we find that each of these techniques results in improved relation classification accuracy.

1 Introduction

Relations such as Cause and Contrast, which we call rhetorical-semantic relations (RSRs), may be signaled in text by cue phrases like because or however which join clauses or sentences and explicitly express the relation of constituents which they connect (Example 1). In other cases the relation may be implicitly expressed (2).¹

Example 1 Because of the recent accounting scandals, there have been a spate of executive resignations.

Example 2 The administration was once again beset by scandal. After several key resignations ...

¹ The authors would like to thank the four anonymous reviewers for helpful comments. This work was supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR0011-06-C-0023. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of DARPA.

The first author performed most of the research reported in this paper while at Columbia University.

In this paper, we examine the problem of detecting such relations when they are not explicitly signaled. We draw on and extend the work of Marcu and Echihabi (2002). Our baseline model directly implements Marcu and Echihabi's approach, optimizing a set of basic parameters such as smoothing weights, vocabulary size and stoplisting. We then focus on improving the quality of the automatically-mined training examples, using topic segmentation and syntactic heuristics to filter out training instances which may be wholly or partially invalid. We find that the parameter optimization and segmentation-based filtering techniques achieve significant improvements in classification performance. ...
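
To make the idea of automatically mining training examples with a small set of patterns more concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the two regular expressions, the `PATTERNS` table and the `mine_instance` function are hypothetical, and the paper's actual pattern set is not shown in this excerpt. The sketch also assumes that the cue phrase is removed from each mined instance, so that a model trained on the two remaining spans cannot rely on the explicit signal.

```python
import re

# Hypothetical cue-phrase patterns, one list per relation.
# These are illustrative assumptions, not the patterns used in the paper.
PATTERNS = {
    "Cause": [
        re.compile(r"^Because (?:of )?(?P<a>[^,]+), (?P<b>.+?)\.?$", re.IGNORECASE),
    ],
    "Contrast": [
        re.compile(r"^(?P<a>.+?), but (?P<b>.+?)\.?$", re.IGNORECASE),
    ],
}


def mine_instance(sentence):
    """Return (relation, span_a, span_b) with the cue phrase removed.

    Returns None when no pattern matches, i.e. when the sentence carries
    no explicit cue phrase that this toy pattern set recognizes.
    """
    text = sentence.strip()
    for relation, patterns in PATTERNS.items():
        for pattern in patterns:
            match = pattern.match(text)
            if match:
                return relation, match.group("a").strip(), match.group("b").strip()
    return None


if __name__ == "__main__":
    example_1 = ("Because of the recent accounting scandals, "
                 "there have been a spate of executive resignations.")
    print(mine_instance(example_1))
```

Applied to Example 1, the sketch yields a Cause instance whose two spans no longer contain the word because. The first sentence of Example 2 matches no pattern and is skipped, which is the kind of implicitly expressed case that the trained models are meant to classify.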

This note was uploaded on 03/06/2012 for the course CIS 630 taught by Professor Cis630 during the Spring '08 term at UPenn.
