Mining Anchor Text for Query Refinement

Reiner Kraft and Jason Zien, IBM Almaden Research Center
Mark Strohmaier

Problem Motivation

- 23% of search queries are single-term.
- Expanding the query can lead to more accurate searches.
- Previous studies indicate that anchor text is statistically similar to search queries.
- Can this similarity be exploited to improve search queries?
What is anchor text?

<a href="this is the website">This is the anchor text</a>

- Destination pages can have multiple links pointing to them.
- Collections of anchor text can give a view of the destination page.

Naïve approach:

- Find links whose anchor text is similar to the query.
- Return those links' destination pages to the user.
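The naïve approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: "similar" is taken to mean "shares at least one lowercase term with the query", and the class name `AnchorCollector`, the function `naive_lookup`, and the example.com URLs are hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects (href, anchor text) pairs from an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.links = []      # list of (href, anchor_text) pairs
        self._href = None    # href of the <a> element currently open, if any
        self._chunks = []    # text pieces seen inside the open <a> element

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._chunks = []

    def handle_data(self, data):
        if self._href is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace inside the anchor text.
            text = " ".join("".join(self._chunks).split())
            self.links.append((self._href, text))
            self._href = None


def naive_lookup(query, links):
    """Map each destination URL to the anchor texts that share a term with the query.

    Term overlap is an assumed stand-in for "similar"; the slides do not
    define the similarity measure.
    """
    query_terms = set(query.lower().split())
    by_destination = defaultdict(list)
    for href, text in links:
        if query_terms & set(text.lower().split()):
            by_destination[href].append(text)
    return dict(by_destination)


# Hypothetical sample links, for illustration only.
html = (
    '<a href="http://example.com/jaguar-cars">Jaguar cars</a> '
    '<a href="http://example.com/jaguar-cats">jaguar the big cat</a> '
    '<a href="http://example.com/python">Python tutorial</a>'
)
collector = AnchorCollector()
collector.feed(html)
matches = naive_lookup("jaguar", collector.links)
```

Here `matches` groups the matching anchor texts by destination page, which mirrors the idea that a collection of anchor texts gives a view of the page it points to.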
Problems with naïve approach

- High term frequency is not directly related to page quality.
- Repeated terms may lead to unnatural queries.
- IDF is not necessarily relevant.
- Anchor text may appear multiple times.
Methods of Query Refinement

- Weighting the number of occurrences
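The slide text available here does not say how occurrences are weighted, so the following is a hedged sketch of one possible scheme rather than the authors' method. It assumes that candidate refinements are anchor texts containing every query term plus at least one extra term, and that a log-damped count keeps heavily repeated anchor text from dominating. The function name `rank_refinements` and the sample data are hypothetical.

```python
import math
from collections import Counter


def rank_refinements(query, anchor_texts):
    """Rank anchor texts as candidate query refinements.

    The score 1 + ln(count) is an assumed, illustrative weighting,
    not a formula taken from the slides.
    """
    query_terms = set(query.lower().split())
    counts = Counter()
    for text in anchor_texts:
        # Normalize case and whitespace so repeated anchor text is counted together.
        normalized = " ".join(text.lower().split())
        terms = set(normalized.split())
        # Keep texts that contain every query term and add at least one new term.
        if query_terms <= terms and terms - query_terms:
            counts[normalized] += 1
    scored = [(text, 1.0 + math.log(n)) for text, n in counts.items()]
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical anchor texts, for illustration only.
anchors = [
    "Jaguar cars",
    "jaguar cars",
    "Jaguar  cars",
    "jaguar the big cat",
    "Jaguar",
    "Python tutorial",
]
ranked = rank_refinements("jaguar", anchors)
```

With this sample, "jaguar cars" occurs three times after normalization and ranks above "jaguar the big cat", while the bare "Jaguar" is dropped because it adds no new term to the query.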

This note was uploaded on 08/06/2008 for the course CSE 450 taught by Professor Davison during the Spring '08 term at Lehigh University.
