A Personalized Search Engine Based on Web-Snippet Hierarchical Clustering

Paolo Ferragina
Dipartimento di Informatica, Pisa
[email protected]

Antonio Gulli
Dipartimento di Informatica, Pisa
[email protected]

ABSTRACT

In this paper we propose a hierarchical clustering engine, called SnakeT, that is able to organize on-the-fly the search results drawn from 16 commodity search engines into a hierarchy of labeled folders. The hierarchy offers a complementary view to the flat-ranked list of results returned by current search engines. Users can navigate through the hierarchy driven by their search needs. This is especially useful for informative, polysemous and poor queries.

SnakeT is the first complete and open-source system in the literature that offers both hierarchical clustering and folder labeling with variable-length sentences. We extensively test SnakeT against all available web-snippet clustering engines, and show that it achieves efficiency and efficacy performance close to the best known engine Vivisimo.com.

Recently, personalized search engines have been introduced with the aim of improving search results by focusing on the users, rather than on their submitted queries. We show how to plug SnakeT on top of any (un-personalized) search engine in order to obtain a form of personalization that is fully adaptive, privacy preserving, scalable, and non intrusive for underlying search engines.

SnakeT is available at http://snaket.di.unipi.it/.

Categories and Subject Descriptors
H.3 [Information Storage And Retrieval]: Content Analysis and Indexing, Information Search and Retrieval, Online Information Services; I.5.3 [Text Processing]: Clustering

General Terms
Algorithms, Design, Experimentation, Measurement

Keywords
Web Snippets Clustering, Search Engines, Information Extraction, New Search Applications and Interfaces, Personalized Web Ranking

Copyright is held by the International World Wide Web Conference Committee (IW3C2). Distribution of these papers is limited to classroom use, and personal use by others. WWW 2005, May 10-14, 2005, Chiba, Japan. ACM 1-59593-051-5/05/0005.

1. INTRODUCTION

Web-snippet clustering is an innovative approach to help users in searching the web [24]. It consists of clustering the snippets returned by a (meta-)search engine into a hierarchy of folders which are labeled with variable-length sentences. The labels should capture the “theme” of the snippets (and thus, of the corresponding web pages) contained into their associated folders. This labeled hierarchy offers a complementary view to the flat-ranked list of results returned by current search engines. Users can exploit it by navigating through the hierarchy of labeled folders, driven by their search needs. This technique is useful for informative [5], polysemous or poor queries.
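To make the idea of labeled folders concrete, the following is a minimal sketch and not SnakeT's algorithm, which this excerpt does not describe. It assumes a much simpler scheme: snippets that share a pair of adjacent words (after stop-word removal) are placed in a folder labeled with that pair. The function names, the stop-word list, and the sample snippets are all hypothetical, and the result is a flat, one-level grouping rather than a hierarchy.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "for", "is", "to", "on"}

def tokenize(snippet):
    """Lowercase a snippet, split it into words, and drop stop words."""
    words = re.findall(r"[a-z0-9]+", snippet.lower())
    return [w for w in words if w not in STOP_WORDS]

def label_folders(snippets, min_size=2):
    """Group snippet indices under each word pair shared by >= min_size snippets.

    A pair is two words that are adjacent once stop words are removed.
    This is an illustrative heuristic, not the method used by SnakeT.
    """
    folders = defaultdict(set)
    for idx, snippet in enumerate(snippets):
        words = tokenize(snippet)
        for pair in zip(words, words[1:]):
            folders[" ".join(pair)].add(idx)
    return {label: sorted(ids)
            for label, ids in folders.items()
            if len(ids) >= min_size}

# Invented snippets for the polysemous query "jaguar".
snippets = [
    "Jaguar cars: luxury sedans and sports cars",
    "The jaguar is a big cat native to the Americas",
    "Jaguar cars dealership and service",
    "Big cat conservation: the jaguar in the wild",
]

folders = label_folders(snippets)
for label, ids in folders.items():
    print(label, ids)
```

On these invented snippets the two senses of the query end up in separate folders, which is the kind of navigation aid the paragraph above describes for polysemous queries. A real engine would also need to rank the labels, merge overlapping folders, and nest them into a hierarchy.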