M92-1013

ALEMBIC MITRE-Bedford: MUC-4 Test Results and Analysis

John Aberdeen, John Burger, Dennis Connolly, Susan Roberts, & Marc Vilain
{aberdeen, john, decon, suzi, mbv}@mitre.org
The MITRE Corporation
202 Burlington Road
Bedford, MA 01730

PRELIMINARIES

This note embodies our analyses of the performance of the ALEMBIC system in the MUC-4 evaluation task. These analyses have provided us with a reasonably good understanding of the principal factors contributing to the system's correct responses and to its errors. This understanding is based in part on interpretations of the performance measures provided by the MUC-4 scoring software; in addition, we performed a number of qualitative and quantitative investigations into linguistic aspects of the messages that underlie the system's performance.

It should be noted, however, that ALEMBIC is still in very early stages of development, and that the analyses we give here should be taken as just presenting a snapshot of the system's performance. In the weeks since the MUC-4 evaluation runs, the system has of course remained under development, and its performance scores have improved steadily. One consequence of our system's relative youth is that it embodies many opportunities for improvement, and even minor implementational tweaks can yield significant performance gains.

OVERALL PERFORMANCE MEASURES

Looking first at our system's overall performance, the following table reproduces the f-measures for our runs on TST3 and TST4.

          TST3    TST4
P&R        9.6   13.75
2P&R      8.57   11.22
P&2R     10.91   17.74

Table 1: Overall f-scores

These are clearly fairly humble scores, but we offer them for consideration with a certain measure of pride. They represent the very first results of a text understanding project that was barely begun six months prior to the evaluation runs, as fielded by a group that had no prior experience with the MUC data extraction task.
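As a rough check on how the three f-scores in Table 1 relate to one another, the sketch below computes the weighted F-measure F = (β² + 1)PR / (β²P + R), with β = 1 for P&R, β = 0.5 for 2P&R and β = 2 for P&2R. That the MUC-4 scoring software uses exactly this formula is an assumption here. The precision values (8 and 10) are the all-templates figures reported in Table 3; the recall values (12 and 22) are not stated anywhere in this note and are back-derived from the P&R scores, so they are assumptions as well. The name f_measure is illustrative only.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted F-measure; beta > 1 favors recall, beta < 1 favors precision."""
    b2 = beta * beta
    denominator = b2 * precision + recall
    if denominator == 0:
        return 0.0
    return (b2 + 1.0) * precision * recall / denominator


# Precision comes from Table 3 (all templates row).
# Recall is an ASSUMPTION: back-derived from the P&R scores, not given in the text.
RUNS = {
    "TST3": {"precision": 8.0, "recall": 12.0},
    "TST4": {"precision": 10.0, "recall": 22.0},
}

# Row label -> beta, following the usual reading of the MUC-4 f-score labels.
WEIGHTINGS = {"P&R": 1.0, "2P&R": 0.5, "P&2R": 2.0}

for run, pr in RUNS.items():
    for label, beta in WEIGHTINGS.items():
        score = f_measure(pr["precision"], pr["recall"], beta)
        print(f"{run} {label}: {score:.2f}")
```

Under these assumptions the computed values agree with Table 1 to two decimal places, which is consistent with recall being somewhat higher than precision on both test sets.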
As one might note, raw f-measures are a fairly coarse performance statistic; despite this, some trends are apparent. First, our system seems better at recall than at precision, an issue which we will address below. Second, the system scored uniformly better on TST4 than on TST3, which is in line with the general observation among MUC-4 participants that TST4 is the easier of the two test sets.

RECALL MEASURES

Turning to a more detailed analysis of our recall measures, the principal determinant of our overall recall performance is rather clear: we only attempted to fill about half of all possible template slots, those for the incident and perpetrator. Although we had slot-filling strategies prepared for the remaining slots, they were never incorporated into the system prior to the evaluation runs; we simply ran out of time to do so.

It is illuminating, however, to consider ALEMBIC's performance on the slots that the system actually did fill. The more interesting of these slots are those that the system filled by meaningfully extracting information from the free text; their slotwise recall scores are shown in Table 2 below. Slots that do not appear in the table were simply not filled at all, or were only filled by default strategies (more on this later).

TST3 inc-date inc-loc inc-type inc-instr-id inc-instr-type perp-ind-id perp-org-id 3 23 10 30 TST4 35 16 53 27 16 22 40 48 40

Table 2: Recall scores on meaningfully filled slots

A quick glance at the table reveals that our scores ranged fairly widely. On string fills, ALEMBIC obtained scores ranging for TST3 from 8 (instrument ID) to 40 (perpetrator organization ID); for set fills the range was 3 (instrument type) to 30 (incident type). Similar patterns held for TST4, but with higher individual slot scores, reflecting the fact that this was the easier of the two test sets.
As an estimate of the average recall for the slots in Table 2, we calculated a restricted overall recall score (based only on these slots) of approximately 20 for TST3 and 34 for TST4. [1]

[1] Not too much should be made of these scores. They admittedly exclude slots that are easy to fill using default values, but they also don't include slots that are hard to fill, i.e., the target slots.

On a slot-by-slot basis, the following qualitative observations apply.

Incident date: We derived this slot from the free text, and only used the dateline as a last recourse in case we failed to identify any date phrases. The date grammar we used for MUC-4 treats date phrases as functors, which were often left unattached due to the fragmentary nature of our parses. This made it harder to actually locate temporal phrases when they did not appear as modifiers of events, resulting in a fair number of invocations of the heuristic fallback strategy of using the dateline.

Incident location: Recall errors for this slot were due in good part to locational modifiers not being attached to events, as well as to a number of infelicities in the locational knowledge representation. Among the more amusing: the lexical item Bogota maps to a number of possible locations, but the one that was picked by default was the Bogota Air Force Base.

Incident type: This was our most accurate set-fill slot. ALEMBIC derives the filler of this slot from the heads of violent events; missing cases are due in part to gaps in the lexicon.

Incident instrument ID: We expected to get better recall scores for this slot. Eyeballing the actual fillers that ALEMBIC produced, part of the problem was grammatical incompleteness. For example, "a powerful dynamite charge" ended up only being parsed as "a powerful dynamite," due to a grammar bug involving noun-noun modification.
Since we attempted to use full noun phrases to fill string slots, we ended up being penalized for cases where we had nearly parsed the complete instrument phrase, but where our fragmentary filler failed to be matched by the scoring program.

Incident instrument type: This slot was only filled when an instrument ID filler was obtained. We never implemented implicit fills for this slot, i.e., fills that could be derived from verbs such as shoot even if no gun is ever mentioned.

Perpetrator individual ID: As mentioned in the system overview, our strategy for filling this slot was heuristic. In case the violent event associated with the template lacked an agentive argument, plausible candidates were looked for elsewhere in the neighboring text. Once again, the fragmentary nature of the parses led to the heuristic fallback strategy being invoked fairly often, with very mixed results.

Perpetrator organization ID: We obtained comparatively high recall scores for this slot. This is a relatively easy slot to fill, however, because likely perpetrator organizations are readily identified.

The remaining incident and perpetrator slots ended up being filled by default values. As a result, although we obtained some reasonable recall scores for these individual slots, these scores are of little real interest.

ISSUES WITH PRECISION

Our precision error rate is largely accounted for by overly eager template generation. As we note in the system description, the version of ALEMBIC fielded at MUC-4 generates a template for every seemingly distinct violent event. Our strategy for distinguishing such events from each other was heavily dependent on our reference resolution module, which turned out to be quite unreliable, and as a result generated multiple (nearly) identical templates for the same event. Consequently, we ended up with overall low precision and high overgeneration scores, as demonstrated by Table 3.
                  TST3    TST4
Precision            8      10
Overgeneration      90      87

Table 3: Overall precision scores (all templates row)

The effects of this template generation strategy on our precision scores were fairly dramatic. We ended up actually making relatively few incorrect fills for those templates that were mapped by the scoring program. Specifically, ignoring spurious templates (as in the matched/missing row) we obtain precision scores of 72 and 75 for TST3 and TST4 respectively. However, because our spurious fills ended up outnumbering our incorrect fills by 20 to 1, our official scores from the all templates row were considerably weaker.

Aside from this principal cause of our precision errors, another significant factor is that ALEMBIC failed to filter out templates that corresponded to military clashes between guerrilla groups and the armed forces. We failed to incorporate such a filter largely because to do so presupposes filling some slots that we were simply leaving blank. Anecdotally, among the worst offenders of this sort was one message for which we generated two legitimate templates and ten templates corresponding to military clashes.

A CLOSER LOOK AT SYNTAX AND REFERENCE

A common thread to both our recall and precision problems is fragmentation of the parses. With respect to recall, fragmentation led to the syntactic arguments of event verbs (and of their nominalizations) being left unattached; this caused the system to fall back frequently on unreliable backup strategies. In addition, the fragmentation confused our reference resolution module, because it introduced far too many top-level noun phrases or event verbs, each of which was potentially a candidate for reference resolution. This caused ALEMBIC to miss co-references needed to fill slots, and it also led to the system's poor ability to distinguish identical events on the basis of reference resolution.
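To make the precision discussion in the preceding section concrete, the following sketch shows how spurious fills can drag down an otherwise respectable precision score. It assumes the commonly cited MUC-style definitions, precision = (correct + 0.5 × partial) / actual and overgeneration = spurious / actual, where actual = correct + partial + incorrect + spurious. The counts are hypothetical: they are chosen only to mirror the matched-only precision of 72 and the 20-to-1 ratio of spurious to incorrect fills mentioned for TST3, and they are not the system's real tallies.

```python
def actual_fills(correct, partial, incorrect, spurious):
    """Total number of fills the system produced."""
    return correct + partial + incorrect + spurious


def precision(correct, partial, incorrect, spurious):
    """Percentage of produced fills that were right (partial fills count half)."""
    total = actual_fills(correct, partial, incorrect, spurious)
    return 100.0 * (correct + 0.5 * partial) / total if total else 0.0


def overgeneration(correct, partial, incorrect, spurious):
    """Percentage of produced fills that were spurious."""
    total = actual_fills(correct, partial, incorrect, spurious)
    return 100.0 * spurious / total if total else 0.0


# HYPOTHETICAL counts, for illustration only.
correct, partial, incorrect = 72, 0, 28
spurious = 20 * incorrect  # spurious fills outnumber incorrect fills 20 to 1

matched_only = precision(correct, partial, incorrect, spurious=0)
all_templates = precision(correct, partial, incorrect, spurious)
overgen = overgeneration(correct, partial, incorrect, spurious)

print(f"matched-only precision:  {matched_only:.1f}")
print(f"all-templates precision: {all_templates:.1f}")
print(f"overgeneration:          {overgen:.1f}")
```

With these made-up counts the all-templates precision falls to roughly 11 and overgeneration rises to roughly 85. That is the same order of magnitude as the official figures in Table 3, though not an exact reproduction, since the real tallies also involve partial credit and other scoring details not modeled here.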
We performed a number of post-hoc analyses to estimate the relative weight on fragmentation of various linguistic factors for which the MUC-4 version of ALEMBIC had incomplete grammatical coverage. These turned out to include a traditional and unsurprising cast of linguistic characters: coordination, PP attachment, noun-noun modification, subgrammars for title, date, location, etc. None of these factors seems particularly dominant, however; they all need to be eventually addressed in some way (linguistically principled or otherwise).

Because fragmentation played such a compromising role with respect to our reference resolution module, we also performed a number of quantitative analyses to clarify the nature of the problem. To begin with, we looked at the set of candidates that were considered when resolving an anaphoric expression. Many of these candidates were spurious; they had b...
