PageRank-2

PageRank-2 - Link Analysis Algorithms CS345 Data Mining...

CS345 Data Mining
Link Analysis Algorithms: Page Rank
Anand Rajaraman, Jeffrey D. Ullman

Link Analysis Algorithms
- Page Rank
- Hubs and Authorities
- Topic-Specific Page Rank
- Spam Detection Algorithms
- Other interesting topics we won’t cover
  - Detecting duplicates and mirrors
  - Mining for communities
  - Classification
  - Spectral clustering

Ranking web pages
- Web pages are not equally “important”
  - www.joe-schmoe.com vs. www.stanford.edu
- Inlinks as votes
  - www.stanford.edu has 23,400 inlinks
  - www.joe-schmoe.com has 1 inlink
- Are all inlinks equal?
  - Recursive question!

Simple recursive formulation
- Each link’s vote is proportional to the importance of its source page
- If page P with importance x has n outlinks, each link gets x/n votes

Simple “flow” model
The web in 1839
[Figure: three-page web graph. Yahoo (y) sends y/2 to itself and y/2 to Amazon; Amazon (a) sends a/2 to Yahoo and a/2 to M’soft; M’soft (m) sends m to Amazon.]

  y = y/2 + a/2
  a = y/2 + m
  m = a/2

Solving the flow equations
- 3 equations, 3 unknowns, no constants
  - No unique solution
  - All solutions equivalent modulo scale factor
- Additional constraint forces uniqueness
  - y + a + m = 1
  - y = 2/5, a = 2/5, m = 1/5
- Gaussian elimination works for small examples, but we need a better method for large graphs
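As a check on the small example, here is a minimal sketch that solves the three-page flow equations exactly. It assumes the normalization y + a + m = 1 replaces the redundant third flow equation (m = a/2), and the helper name `solve_linear` is made up for this illustration; it is not part of the original slides.

```python
from fractions import Fraction


def solve_linear(coeffs, rhs):
    """Solve coeffs * x = rhs by Gauss-Jordan elimination with exact fractions.

    Hypothetical helper for illustration; assumes the system is square and
    nonsingular (a singular system raises StopIteration at the pivot search).
    """
    n = len(coeffs)
    # Augmented matrix [coeffs | rhs] with exact rational entries.
    aug = [[Fraction(v) for v in row] + [Fraction(b)]
           for row, b in zip(coeffs, rhs)]
    for col in range(n):
        # Pick a row at or below the diagonal with a nonzero pivot.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so the pivot equals 1.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


# Unknowns ordered (y, a, m).
# y = y/2 + a/2   ->   y/2 - a/2       = 0
# a = y/2 + m     ->  -y/2 + a   - m   = 0
# normalization   ->   y   + a   + m   = 1
A = [
    [Fraction(1, 2), Fraction(-1, 2), 0],
    [Fraction(-1, 2), 1, -1],
    [1, 1, 1],
]
b = [0, 0, 1]

y, a, m = solve_linear(A, b)
print(y, a, m)
```

The dropped equation m = a/2 is still satisfied by the result, which is what "no unique solution without the extra constraint" implies: the three flow equations are linearly dependent.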
Matrix formulation
- Matrix M has one row and one column for each web page
- Suppose page j has n outlinks
  - If j → i, then $M_{ij} = 1/n$
  - Else $M_{ij} = 0$
- M is a column stochastic matrix
  - Columns sum to 1
- Suppose r is a vector with one entry per web page
  - $r_i$ is the importance score of page i
  - Call it the rank vector

Example
Suppose page j links to 3 pages, including i
[Figure: in the product M r = r, column j of M holds the entry 1/3 in row i.]

Eigenvector formulation
- The flow equations can be written r = Mr
- So the rank vector is an eigenvector of the stochastic web matrix
  - In fact, its first or principal eigenvector, with corresponding eigenvalue 1

Example
[Figure: the Yahoo / M’soft / Amazon graph with its matrix M.]

        y     a     m
  y    1/2   1/2    0
  a    1/2    0     1
  m     0    1/2    0

  y = y/2 + a/2
  a = y/2 + m
  m = a/2

r = Mr:

\[
\begin{pmatrix} y \\ a \\ m \end{pmatrix}
=
\begin{pmatrix} 1/2 & 1/2 & 0 \\ 1/2 & 0 & 1 \\ 0 & 1/2 & 0 \end{pmatrix}
\begin{pmatrix} y \\ a \\ m \end{pmatrix}
\]

Power Iteration method
- Simple iterative scheme (aka relaxation)
- Suppose there are N web pages
- Initialize: $r^0 = [1/N, \ldots, 1/N]^T$
- Iterate: $r^{k+1} = M r^k$
- Stop when |r^{k+1} - r …
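The matrix construction and the power iteration can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the slides' own code: the source text breaks off before the stopping rule is complete, so the sketch assumes the loop stops when the L1 distance between successive rank vectors falls below a tolerance `tol`. The names `build_matrix`, `power_iterate`, `tol`, and `max_iter` are hypothetical.

```python
def build_matrix(links, pages):
    """Build the column-stochastic matrix M as a list of rows.

    M[i][j] = 1/n if page j links to page i, where n is the number of
    outlinks of page j; otherwise M[i][j] = 0. Assumes every page has at
    least one outlink and lists each target once (otherwise a column
    would not sum to 1).
    """
    index = {page: k for k, page in enumerate(pages)}
    size = len(pages)
    M = [[0.0] * size for _ in range(size)]
    for src, targets in links.items():
        j = index[src]
        for dst in targets:
            M[index[dst]][j] = 1.0 / len(targets)
    return M


def power_iterate(M, tol=1e-10, max_iter=1000):
    """Iterate r <- M r starting from the uniform vector [1/N, ..., 1/N].

    Assumed stopping rule (not confirmed by the source): stop when the
    L1 norm of the change between successive vectors is below tol.
    """
    n = len(M)
    r = [1.0 / n] * n
    for _ in range(max_iter):
        r_next = [sum(M[i][j] * r[j] for j in range(n)) for i in range(n)]
        if sum(abs(x - y) for x, y in zip(r_next, r)) < tol:
            return r_next
        r = r_next
    return r


# The three-page example: Yahoo (y), Amazon (a), M'soft (m).
pages = ["y", "a", "m"]
links = {"y": ["y", "a"], "a": ["y", "m"], "m": ["a"]}

M = build_matrix(links, pages)
rank = power_iterate(M)
print(dict(zip(pages, rank)))
```

Because M is column stochastic, each multiplication preserves the sum of the entries of r, so starting from a vector that sums to 1 makes a separate normalization step unnecessary in this example.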

This document was uploaded on 03/04/2012.
