lect16

Link Analysis Ranking
How do search engines decide how to rank your query results?
- Guess why Google ranks the query results the way it does
- How would you do it?
Naïve ranking of query results
- Given query q
- Rank the web pages p in the index based on sim(p,q)
- Scenarios where this is not such a good idea?
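One way to make the naïve scheme concrete is cosine similarity over term counts. This is only a sketch: the slides do not say which sim(p,q) is meant, and the function names, the whitespace tokenization, and the toy pages are assumptions made for illustration.

```python
import math
from collections import Counter


def term_counts(text):
    """Lowercase, whitespace-split bag of words (an assumed tokenization)."""
    return Counter(text.lower().split())


def cosine_sim(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def naive_rank(pages, query):
    """Rank page ids by sim(p, q), highest first."""
    q = term_counts(query)
    scores = {pid: cosine_sim(term_counts(text), q) for pid, text in pages.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy index. One page simply repeats the query terms,
# another is relevant to the topic but never uses them.
pages = {
    "honda.example": "honda cars motorcycles and power equipment",
    "spam.example": "automobile manufacturer automobile manufacturer automobile manufacturer",
    "news.example": "an automobile manufacturer announced a recall",
}
ranking = naive_rank(pages, "automobile manufacturer")
```

In this toy index the page that only repeats the query terms comes first and the page that never mentions them comes last, which is one scenario where ranking purely on sim(p,q) goes wrong.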
Why Link Analysis?
- First generation search engines
  - view documents as flat text files
  - could not cope with size, spamming, user needs
  - Example: Honda website, keywords: automobile manufacturer
- Second generation search engines
  - Ranking becomes critical
  - use of Web specific data: Link Analysis
  - shift from relevance to authoritativeness
  - a success story for network analysis
Link Analysis: Intuition
- A link from page p to page q denotes endorsement
  - page p considers page q an authority on a subject
  - mine the web graph of recommendations
  - assign an authority value to every page
Link Analysis Ranking Algorithms
- Start with a collection of web pages
- Extract the underlying hyperlink graph
- Run the LAR algorithm on the graph
- Output: an authority weight for each node
Algorithm input
- Query dependent: rank a small subset of pages related to a specific query
  - HITS (Kleinberg 98) was proposed as query dependent
- Query independent: rank the whole Web
  - PageRank (Brin and Page 98) was proposed as query independent
Query-dependent LAR
- Given a query q, find a subset of web pages S that are related to q
- Rank the pages in S based on some ranking criterion
Query-dependent input
[Figure: the Root Set, the pages linking into it (IN), and the pages it links out to (OUT); together they form the Base Set.]
Properties of a good seed set S
- S is relatively small.
- S is rich in relevant pages.
- S contains most (or many) of the strongest authorities.
How to construct a good seed set S
- For query q, first collect the t highest-ranked pages for q from a text-based search engine to form set Γ
- S = Γ
- Add to S all the pages pointing to Γ
- Add to S all the pages that pages from Γ point to
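The construction above can be sketched in a few lines. The sketch assumes the link structure is already available as in-memory dictionaries; `build_base_set`, `out_links`, and `in_links` are hypothetical names, and the set `root` stands in for the t pages a text-based search engine would return.

```python
def build_base_set(root, out_links, in_links):
    """Expand the root set (Gamma) with its in-neighbors and out-neighbors.

    out_links / in_links map a page to the set of pages it links to /
    is linked from (hypothetical stand-ins for a crawl index).
    """
    base = set(root)                      # S = Gamma
    for p in root:
        base |= in_links.get(p, set())    # pages pointing to Gamma
        base |= out_links.get(p, set())   # pages that Gamma points to
    return base


# Toy link structure, made up for illustration.
out_links = {"a": {"c"}, "b": {"c", "d"}, "e": {"a"}, "f": {"g"}}

# Invert the out-link map to obtain the in-link map.
in_links = {}
for src, targets in out_links.items():
    for dst in targets:
        in_links.setdefault(dst, set()).add(src)

root = {"a", "b"}   # stand-in for the t highest-ranked pages for q
base = build_base_set(root, out_links, in_links)
```

Here `base` contains a, b, c, d, and e. Pages f and g are left out because they neither link to the root set nor are linked from it.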
Link Filtering
- Navigational links: serve the purpose of moving within a site (or to related sites)
  - www.espn.com → www.espn.com/nba
  - www.yahoo.com → www.yahoo.it
  - www.espn.com → www.msn.com
- Filter out navigational links
  - same domain name
  - same IP address
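A minimal sketch of the two filters, assuming "same domain name" means an exact hostname match and that IP addresses come from a lookup table. The names `host_of`, `is_navigational`, `filter_links`, and `ip_of` are hypothetical, and the addresses are made up.

```python
from urllib.parse import urlparse


def host_of(url):
    """Hostname of a URL; adds a scheme if the URL is written without one."""
    if "://" not in url:
        url = "http://" + url
    return (urlparse(url).hostname or "").lower()


def is_navigational(src, dst, ip_of=None):
    """True if the link src -> dst stays on the same host or the same IP.

    ip_of is an optional, hypothetical host -> IP table; a real system
    would fill it from DNS lookups.
    """
    a, b = host_of(src), host_of(dst)
    if a == b:
        return True
    if ip_of is not None and a in ip_of and ip_of.get(a) == ip_of.get(b):
        return True
    return False


def filter_links(links, ip_of=None):
    """Keep only the links that are not navigational."""
    return [(s, d) for s, d in links if not is_navigational(s, d, ip_of)]


links = [
    ("www.espn.com", "www.espn.com/nba"),   # same host
    ("www.yahoo.com", "www.yahoo.it"),      # different hosts
    ("www.espn.com", "www.msn.com"),        # different hosts
]
# Made-up addresses, for illustration only.
ip_of = {"www.yahoo.com": "192.0.2.1", "www.yahoo.it": "192.0.2.1"}
kept = filter_links(links, ip_of)
kept_without_ips = filter_links(links)
```

An exact hostname match catches only links within one host. Sibling hosts such as www.yahoo.com and www.yahoo.it are caught here only because the toy table gives them the same address; comparing registered domains instead would need a public suffix list.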
How do we rank the pages in seed set S?
- In degree?
  - Intuition
  - Problems
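As a baseline, in-degree ranking can be sketched as follows. The function name and the toy graph are assumptions; links whose source or target lies outside the set, as well as self-links, are ignored here as a design choice.

```python
from collections import Counter


def rank_by_in_degree(pages, links):
    """Order pages by the number of links received from other pages in the set."""
    pages = set(pages)
    in_degree = Counter({p: 0 for p in pages})
    for src, dst in links:
        if src in pages and dst in pages and src != dst:
            in_degree[dst] += 1
    # Sort by descending in-degree, breaking ties by page id for determinism.
    return sorted(pages, key=lambda p: (-in_degree[p], p))


pages = ["a", "b", "c", "d"]
links = [("a", "c"), ("b", "c"), ("d", "c"), ("a", "b"), ("c", "b"), ("x", "a")]
ranking = rank_by_in_degree(pages, links)
```

Every incoming link counts the same in this scheme, whichever page it comes from.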
Hubs and Authorities [K98]
- Authority is not necessarily transferred directly between authorities
- Pages have double identity
  - hub identity
  - authority identity
- Good hubs point to good authorities
- Good authorities are pointed to by good hubs
[Figure: hubs linking to authorities.]
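The two mutually reinforcing statements above suggest an iterative computation. The sketch below assumes the standard HITS update rules: a page's authority score is the sum of the hub scores of the pages pointing to it, its hub score is the sum of the authority scores of the pages it points to, and both vectors are rescaled to unit length each round. The function name, the fixed iteration count, and the toy graph are illustrative choices.

```python
import math


def hits(links, iterations=50):
    """Return (hub, authority) score dicts for a list of (src, dst) links."""
    nodes = {n for edge in links for n in edge}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: sum the hub scores of pages pointing to the page.
        auth = {n: 0.0 for n in nodes}
        for src, dst in links:
            auth[dst] += hub[src]
        # Hub update: sum the new authority scores of the pages pointed to.
        hub = {n: 0.0 for n in nodes}
        for src, dst in links:
            hub[src] += auth[dst]
        # Normalize both vectors to unit Euclidean length.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for n in scores:
                scores[n] /= norm
    return hub, auth


# Toy graph: three hub-like pages pointing to two authority-like pages.
links = [("h1", "a1"), ("h1", "a2"), ("h2", "a1"), ("h2", "a2"), ("h3", "a2")]
hub, auth = hits(links)
```

In this toy graph a2 receives the highest authority score because all three hub-like pages point to it, and h1 and h2 receive higher hub scores than h3 because they point to both authorities.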