Introduction to Information Retrieval, Sec. 21.3

High-level scheme
  - Extract from the web a base set of pages that could be good hubs or authorities.
  - From these, identify a small set of top hub and authority pages → iterative algorithm.

Base set
  - Given a text query (say, browser), use a text index to get all pages containing browser.
      - Call this the root set of pages.
  - Add in any page that either
      - points to a page in the root set, or
      - is pointed to by a page in the root set.
  - Call this the base set.

Visualization
  [Figure: root set and base set]

Distilling hubs and authorities
  - Compute, for each page x in the base set, a hub score h(x) and an authority score a(x).
  - Initialize: for all x, h(x) ← 1; a(x) ← 1.
  - Iteratively update all h(x), a(x). (Key step.)
  - After iterations,
      - output pages with highest h() scores as top hubs;
      - highest a() s...
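
The scheme above can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own code. The excerpt does not show the update rule, so the sketch assumes the standard HITS updates: h(x) becomes the sum of a(y) over pages y that x points to, and a(x) becomes the sum of h(y) over pages y that point to x. The function names, the `links` edge-list format, the fixed iteration count, and the rescaling step are all assumptions made for illustration.

```python
from collections import defaultdict


def build_base_set(root_set, links):
    """Expand a root set with every page that links to it or is linked from it.

    `links` is a hypothetical iterable of (source, target) page pairs.
    """
    root = set(root_set)
    base = set(root)
    for src, dst in links:
        if dst in root:  # src points to a page in the root set
            base.add(src)
        if src in root:  # dst is pointed to by a page in the root set
            base.add(dst)
    return base


def _scale(scores):
    """Rescale so the largest score is 1 (assumed normalization)."""
    top = max(scores.values(), default=0.0)
    return {x: (s / top if top > 0 else 0.0) for x, s in scores.items()}


def hits(base_set, links, iterations=5):
    """Run the standard HITS updates on the links inside the base set."""
    out_links = defaultdict(set)
    in_links = defaultdict(set)
    for src, dst in links:
        if src in base_set and dst in base_set:
            out_links[src].add(dst)
            in_links[dst].add(src)

    # Initialize: for all x, h(x) <- 1 and a(x) <- 1.
    hub = {x: 1.0 for x in base_set}
    auth = {x: 1.0 for x in base_set}

    for _ in range(iterations):
        # Both updates read the scores from the previous round.
        new_hub = {x: sum(auth[y] for y in out_links[x]) for x in base_set}
        new_auth = {x: sum(hub[y] for y in in_links[x]) for x in base_set}
        # Only the relative ordering matters, so any rescaling would do.
        hub = _scale(new_hub)
        auth = _scale(new_auth)
    return hub, auth
```

Sorting the returned dictionaries by score in descending order gives the top hubs and top authorities. The root set itself would come from a text index lookup for the query, which is outside the scope of this sketch.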
This document was uploaded on 02/26/2014.

