edge starting at a node:
  If the node at the end of the edge already appears on the list of “visited” nodes or is already in the Queue, do nothing with that edge.
  Otherwise, append the node at the end of the edge to the end of the Queue.

Define the following three terms:
  Freshness: The reciprocal of the time elapsed between successive visits to a page.
  Coverage: The percentage of pages a particular engine indexes.
  Coherence: The overall extent to which a copy of the web corresponds to the web itself.

How does the choice of seeds affect coverage?
  Pages that cannot be reached by following links from the chosen seeds are never visited, so with certain seeds not all pages are covered.

If a search engine increases its coverage, what is likely to happen to the freshness of the pages?
  Freshness decreases. With a fixed crawling capacity, doubling the coverage halves the freshness.

According to Web Dragons, what are the four parts of the web? Describe each.
  New Archipelago, Milgram’s Continent, Corporate Continent, Terra Incognita

Consider the following directed graph:

  Page         P1    P2    P3    P4    P5    P6    P7
  Probability  0.2   0.15  0.05  0.2   0.15  0.1   0.15

Answer the following questions based on the information on the previous page.

What is the probability of a random surfer ending on P6 after the first iteration, if it never jumps to a random page?
  0.2*(1/3) + 0.05*(1/3) + 0.1*(1/4) + 0.15*(1/4) ≈ 0.1458

Now assume that a random surfer has a 10% chance to jump to a page (with the target page chosen at random). What is the probability of a random surfer ending on P6 after the first iteration?
  (0.2*(1/3) + 0.05*(1/3) + 0.1*(1/4) + 0.15*(1/4))*0.9 + 0.1*(1/7) ≈ 0.1455

Using P3 as the seed, in what order would the pages be visited when using a breadth-first search?
  P3, P1, P4, P6, P2, P7, P5

Ranking

The following...
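The queue procedure and the random-surfer arithmetic from the crawling questions above can be checked with a short Python sketch. This is only an illustration under stated assumptions: the function names `bfs_order` and `arrival_probability` are hypothetical, and `toy_graph` is a made-up four-node graph, because the quiz's own directed graph is not reproduced in this text. The (probability, out-link count) pairs for P6 are taken directly from the answer expression above.

```python
from collections import deque


def bfs_order(graph, seed):
    """Return the nodes in breadth-first visiting order, starting at seed."""
    visited = []
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        visited.append(node)
        for target in graph.get(node, []):
            # Do nothing if the target is already visited or already queued.
            if target in visited or target in queue:
                continue
            # Otherwise append it to the end of the queue.
            queue.append(target)
    return visited


def arrival_probability(contributions, jump=0.0, n_pages=1):
    """Probability of landing on a page after one random-surfer step.

    contributions: (probability of a linking page, its number of out-links).
    jump: chance of jumping to a page chosen uniformly from n_pages pages.
    """
    via_links = sum(p / out_links for p, out_links in contributions)
    return (1 - jump) * via_links + jump / n_pages


# Hypothetical graph (NOT the quiz's graph): node -> list of link targets.
toy_graph = {"A": ["B", "C"], "B": ["D"], "C": ["D", "A"], "D": []}
order = bfs_order(toy_graph, "A")

# Terms copied from the P6 answer: 0.2/3 + 0.05/3 + 0.1/4 + 0.15/4.
p6_contributions = [(0.2, 3), (0.05, 3), (0.1, 4), (0.15, 4)]
p6_no_jump = arrival_probability(p6_contributions)
p6_with_jump = arrival_probability(p6_contributions, jump=0.1, n_pages=7)

print(order)
print(round(p6_no_jump, 4), round(p6_with_jump, 4))
```

Checking membership in both `visited` and `queue` mirrors the two-part rule in the procedure. For a large crawl, a set of seen nodes would be the faster choice, at the cost of departing slightly from the wording of the rule.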
