Towards Functional Benchmarking of Information Retrieval Models

D.W. Song1  K.F. Wong1  P.D. Bruza2  C.H. Cheng1

1 Department of Systems Engineering and Engineering Management, Chinese University of Hong Kong, Shatin, N.T., Hong Kong
{dwsong, kfwong, chcheng}@se.cuhk.edu.hk

2 School of Information Systems, Queensland University of Technology, Brisbane, QLD, Australia
bruza@icis.qut.edu.au

Copyright © 1999, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

Abstract

To evaluate the effectiveness of information retrieval (IR) systems, empirical methods (performance benchmarking) are widely used. Although these methods are useful for evaluating the performance of a system, they cannot assess its underlying functionality. Recently, researchers have applied logical approaches to model IR properties so that inductive evaluation of IR can be performed; this approach is known as functional benchmarking, and the aboutness framework has been used for this purpose. Aboutness-based functional benchmarking is promising but as yet ineffective, owing to the lack of a holistic view of the evaluation process. To overcome the shortcomings of the existing aboutness frameworks, we apply the idea of reasoning about function to IR and introduce a new strategy for IR functional benchmarking: the application of a symbolic, axiomatic method to reason about IR functionality. The strategy consists of three parts, namely definition, modeling, and evaluation. To support a unified logical representation of an IR model in the definition part and effective reasoning in the modeling part, this paper proposes a three-dimensional scale that identifies the classes of essential IR functionality: representation, matching function, and transformation. With this scale, the deficiencies of the existing aboutness frameworks can be overcome.

1. Introduction

The evaluation of information retrieval (IR) systems centers on effectiveness. Traditionally, IR systems are evaluated and compared experimentally; the best-known evaluation measures are precision and recall. Experimental retrieval evaluations are conducted in a laboratory environment and are based on test collections consisting of a corpus, a query set, and sets of relevance judgements (one set per query). Although many important results have been obtained, there are some
criticisms concerning the subjectivity of relevance judgements and the limitations of corpus construction. Moreover, the experimental methods cannot explain why an IR system performs as it does. Thus, an objective evaluation approach is necessary: one that is independent of any given IR model and able to predict the underlying functionality of an IR system. In this way, the upper and lower bounds of a system's effectiveness could be approximated. To fill this gap, logic-based inductive evaluation has been proposed.
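For reference, the two experimental measures named above have standard set-theoretic definitions (a sketch in notation of our own choosing, not taken from the preview: R is the set of documents retrieved for a query and Rel is the set of documents judged relevant to it):

\[
  \mathrm{precision} = \frac{|R \cap Rel|}{|R|},
  \qquad
  \mathrm{recall} = \frac{|R \cap Rel|}{|Rel|}
\]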
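To make the symbolic, axiomatic style of reasoning concrete, here is a minimal sketch of the kind of postulate an aboutness framework states. The notation i ⊨ j (read "information carrier i is about information carrier j") and the postulate names are drawn from the broader aboutness literature, not from this preview:

\[
  \textbf{Reflexivity:}\quad i \models i
  \qquad
  \textbf{Transitivity:}\quad i \models j \;\wedge\; j \models k \;\Rightarrow\; i \models k
\]

Whether a given IR model should satisfy a postulate such as Transitivity is contested (it is accepted by some aboutness frameworks and rejected by others), and deciding which postulates a model's matching function honors is exactly the kind of question functional benchmarking is meant to answer.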