10-pagerank

2/7/2011 Jure Leskovec, Stanford C246: Mining Massive Datasets

...re within the group)

A group of pages is a spider trap if there are no links from within the group to the outside of the group.
The random surfer gets trapped.
Eventually spider traps absorb all importance.

2/7/2011 Jure Leskovec, Stanford C246: Mining Massive Datasets

Power iteration:
Set $r_i = 1$
$r_i = \sum_j M_{ij} \cdot r_j$
And iterate

Example (Y! = Yahoo, A = Amazon, MS = M’soft; MS links only to itself, so it is a spider trap):

\[
M = \begin{array}{c|ccc}
 & \text{Y!} & \text{A} & \text{MS} \\ \hline
\text{Y!} & 1/2 & 1/2 & 0 \\
\text{A} & 1/2 & 0 & 0 \\
\text{MS} & 0 & 1/2 & 1
\end{array}
\]

\[
\begin{pmatrix} y \\ a \\ m \end{pmatrix} =
\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix},
\begin{pmatrix} 1 \\ 1/2 \\ 3/2 \end{pmatrix},
\begin{pmatrix} 3/4 \\ 1/2 \\ 7/4 \end{pmatrix},
\begin{pmatrix} 5/8 \\ 3/8 \\ 2 \end{pmatrix},
\ldots,
\begin{pmatrix} 0 \\ 0 \\ 3 \end{pmatrix}
\]

The Google solution for spider traps
At each time step, the random surfer has two options:
With probability $\beta$, follow a link at random
With probability $1 - \beta$, jump to some page uniformly at random
Common values for $\beta$ are in the range 0.8 to 0.9.
The surfer will teleport out of a spider trap within a few time steps.

[Figure: link graph of Yahoo, Amazon and M’soft. Each link weight is scaled by 0.8 (for example 0.8*1/2), and every page also receives a teleport weight of 0.2*1/3 from each page.]

For the y column:

\[
0.8 \begin{pmatrix} 1/2 \\ 1/2 \\ 0 \end{pmatrix}
+ 0.2 \begin{pmatrix} 1/3 \\ 1/3 \\ 1/3 \end{pmatrix}
\]

For the whole matrix (rows and columns ordered y, a, m):

\[
0.8 \begin{pmatrix} 1/2 & 1/2 & 0 \\ 1/2 & 0 & 0 \\ 0 & 1/2 & 1 \end{pmatrix}
+ 0.2 \begin{pmatrix} 1/3 & 1/3 & 1/3 \\ 1/3 & 1/3 & 1/3 \\ 1/3 & 1/3 & 1/3 \end{pmatrix}
= \begin{pmatrix} 7/15 & 7/15 & 1/15 \\ 7/15 & 1/15 & 1/15 \\ 1/15 & 7/15 & 13/15 \end{pmatrix}
\]

Power iteration with this matrix:

\[
\begin{pmatrix} y \\ a \\ m \end{pmatrix} =
\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix},
\begin{pmatrix} 1.00 \\ 0.60 \\ 1.40 \end{pmatrix},
\begin{pmatrix} 0.84 \\ 0.60 \\ 1.56 \end{pmatrix},
\begin{pmatrix} 0.776 \\ 0.536 \\ 1.688 \end{pmatrix},
\ldots,
\begin{pmatrix} 7/11 \\ 5/11 \\ 21/11 \end{pmatrix}
\]

Some pages are “dead ends” (have no out-links).
Such pages cause importance to leak out.
A...
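
The spider-trap example and the teleport fix above can be checked with a few lines of Python. This is only a minimal sketch, not code from the lecture: the helper names `teleport_matrix`, `step` and `power_iterate` are made up for illustration, and exact fractions are used (an assumption of convenience) so the results can be compared directly with the values on the slides.

```python
from fractions import Fraction as F

# Column-stochastic link matrix from the example, rows/columns ordered (y, a, m):
# Yahoo links to itself and Amazon, Amazon links to Yahoo and M'soft,
# M'soft links only to itself (the spider trap).
M = [
    [F(1, 2), F(1, 2), F(0)],
    [F(1, 2), F(0),    F(0)],
    [F(0),    F(1, 2), F(1)],
]


def teleport_matrix(M, beta):
    """Hypothetical helper: beta*M plus a uniform (1 - beta)/n in every entry."""
    n = len(M)
    jump = (1 - beta) / n
    return [[beta * M[i][j] + jump for j in range(n)] for i in range(n)]


def step(A, r):
    """One power-iteration update: r_i <- sum_j A[i][j] * r[j]."""
    return [sum(a_ij * r_j for a_ij, r_j in zip(row, r)) for row in A]


def power_iterate(A, steps):
    """Start from r_i = 1 and apply `steps` updates."""
    r = [F(1)] * len(A)
    for _ in range(steps):
        r = step(A, r)
    return r


# Without teleportation, importance drains toward the trap.
trap_r3 = power_iterate(M, 3)

# With beta = 0.8, the first iterates match 1.00/0.60/1.40, 0.84/0.60/1.56, ...
A = teleport_matrix(M, F(4, 5))
tele_r3 = power_iterate(A, 3)
tele_r50 = [float(x) for x in power_iterate(A, 50)]

print("no teleport, 3 steps:", [str(x) for x in trap_r3])
print("beta=0.8, 3 steps:   ", [float(x) for x in tele_r3])
print("beta=0.8, 50 steps:  ", tele_r50)
```

After three steps without teleportation the vector is (5/8, 3/8, 2), as in the example. With beta = 0.8 the iterates approach (7/11, 5/11, 21/11), so M’soft still ranks highest but no longer absorbs everything.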

This document was uploaded on 02/26/2014 for the course CS 246 at Stanford.
