p801 - A Personalized Search Engine Based on Web-Snippet...


A Personalized Search Engine Based on Web-Snippet Hierarchical Clustering

Paolo Ferragina
Dipartimento di Informatica, Pisa
ferragina@di.unipi.it

Antonio Gulli
Dipartimento di Informatica, Pisa
gulli@di.unipi.it

ABSTRACT

In this paper we propose a hierarchical clustering engine, called SnakeT, that is able to organize on-the-fly the search results drawn from 16 commodity search engines into a hierarchy of labeled folders. The hierarchy offers a complementary view to the flat-ranked list of results returned by current search engines. Users can navigate through the hierarchy driven by their search needs. This is especially useful for informative, polysemous and poor queries.

SnakeT is the first complete and open-source system in the literature that offers both hierarchical clustering and folder labeling with variable-length sentences. We extensively test SnakeT against all available web-snippet clustering engines, and show that it achieves efficiency and efficacy performance close to the best known engine Vivisimo.com.

Recently, personalized search engines have been introduced with the aim of improving search results by focusing on the users, rather than on their submitted queries. We show how to plug SnakeT on top of any (un-personalized) search engine in order to obtain a form of personalization that is fully adaptive, privacy preserving, scalable, and non intrusive for underlying search engines. SnakeT is available at http://snaket.di.unipi.it/.

Categories and Subject Descriptors
H.3 [Information Storage And Retrieval]: Content Analysis and Indexing, Information Search and Retrieval, Online Information Services; I.5.3 [Text Processing]: Clustering

General Terms
Algorithms, Design, Experimentation, Measurement

Keywords
Web Snippets Clustering, Search Engines, Information Extraction, New Search Applications and Interfaces, Personalized Web Ranking

Copyright is held by the International World Wide Web Conference Committee (IW3C2). Distribution of these papers is limited to classroom use, and personal use by others. WWW 2005, May 10-14, 2005, Chiba, Japan. ACM 1-59593-051-5/05/0005.

1. INTRODUCTION

Web-snippet clustering is an innovative approach to help users in searching the web [24]. It consists of clustering the snippets returned by a (meta-)search engine into a hierarchy of folders which are labeled with variable-length sentences. The labels should capture the theme of the snippets (and thus, of the corresponding web pages) contained into their associated folders. This labeled hierarchy offers a complementary view to the flat-ranked list of results returned by current search engines. Users can exploit it by navigating through the hierarchy of labeled folders, driven by their search needs. This technique is useful for informative [5], polysemous or poor queries....
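
The general idea of grouping snippets into labeled folders can be illustrated with a toy sketch. This is not SnakeT's algorithm, which the text above does not describe; it is a minimal stand-in that assumes a folder label is simply a word or word pair shared by at least two snippets, and that a two-word label nests under each single-word label it contains. All function names, the stopword list, and the sample snippets are hypothetical.

```python
from collections import defaultdict

# Hypothetical, deliberately tiny stopword list for the example.
STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "for", "is", "on"}


def tokenize(text):
    """Lowercase the text, split on non-alphanumeric characters, drop stopwords."""
    words = "".join(c.lower() if c.isalnum() else " " for c in text).split()
    return [w for w in words if w not in STOPWORDS]


def candidate_labels(words, max_len=2):
    """Return every run of 1..max_len adjacent words (after stopword removal)."""
    return {
        " ".join(words[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(words) - n + 1)
    }


def label_folders(snippets, query, min_size=2):
    """Map each label shared by at least min_size snippets to their indices."""
    query_words = set(tokenize(query))
    folders = defaultdict(set)
    for idx, snippet in enumerate(snippets):
        for label in candidate_labels(tokenize(snippet)):
            # Labels containing a query word match almost every result,
            # so they would not discriminate between folders.
            if query_words & set(label.split()):
                continue
            folders[label].add(idx)
    return {
        label: sorted(ids)
        for label, ids in folders.items()
        if len(ids) >= min_size
    }


def build_hierarchy(folders):
    """Nest each multi-word label under the single-word labels it contains."""
    hierarchy = {}
    for parent, ids in folders.items():
        if " " in parent:
            continue  # only single-word labels become top-level folders
        children = {
            label: child_ids
            for label, child_ids in folders.items()
            if " " in label and parent in label.split()
        }
        hierarchy[parent] = {"snippets": ids, "children": children}
    return hierarchy


# Made-up snippets for the polysemous query "jaguar".
snippets = [
    "Jaguar car dealers and new models",
    "Jaguar car reviews and prices",
    "The jaguar is a big cat of the Americas",
    "Big cat conservation: jaguar habitat",
]
folders = label_folders(snippets, query="jaguar")
hierarchy = build_hierarchy(folders)
for label, node in sorted(hierarchy.items()):
    print(label, node["snippets"], sorted(node["children"]))
```

On this input the car-related and animal-related snippets fall into separate folders, which is the kind of disambiguation a labeled hierarchy is meant to offer for polysemous queries. A real engine would need far better label selection and ranking than shared word pairs.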

This note was uploaded on 02/10/2012 for the course CSE 5800 taught by Professor Staff during the Fall '09 term at FIT.
