PageRank-4 - CS345 Data Mining Link Analysis Algorithms...

CS345 Data Mining
Link Analysis Algorithms: Page Rank
Anand Rajaraman, Jeffrey D. Ullman

Link Analysis Algorithms
• Page Rank
• Hubs and Authorities
• Topic-Specific Page Rank
• Spam Detection Algorithms
• Other interesting topics we won’t cover
    - Detecting duplicates and mirrors
    - Mining for communities

Ranking web pages
• Web pages are not equally “important”
    - www.joe-schmoe.com vs. www.stanford.edu
• Inlinks as votes
    - www.stanford.edu has 23,400 inlinks
    - www.joe-schmoe.com has 1 inlink
• Are all inlinks equal?
    - Recursive question!

Simple recursive formulation
• Each link’s vote is proportional to the importance of its source page
• If page P with importance x has n outlinks, each link gets x/n votes
• Page P’s own importance is the sum of the votes on its inlinks
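
Simple “flow” model
The web in 1839: three pages, Yahoo (y), Amazon (a), and M’soft (m).
[Figure: Yahoo links to itself and to Amazon, each link carrying y/2; Amazon links to Yahoo and to M’soft, each link carrying a/2; M’soft links to Amazon, carrying m.]
Flow equations:
    y = y/2 + a/2
    a = y/2 + m
    m = a/2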
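
The flow equations can be read as one round of vote passing. The sketch below is only an illustration: the dictionary layout and the names `OUTLINKS` and `flow_step` are assumptions, not part of the slides. It applies the x/n rule to the three-page graph, starting from equal importance 1/3 per page.

```python
from fractions import Fraction

# Outlinks of the three-page example: Yahoo (y), Amazon (a), M'soft (m).
OUTLINKS = {
    "y": ["y", "a"],  # Yahoo links to itself and to Amazon
    "a": ["y", "m"],  # Amazon links to Yahoo and to M'soft
    "m": ["a"],       # M'soft links only to Amazon
}

def flow_step(importance):
    """Pass each page's importance along its outlinks in equal shares."""
    received = {page: Fraction(0) for page in OUTLINKS}
    for page, targets in OUTLINKS.items():
        share = importance[page] / len(targets)  # each link gets x/n votes
        for target in targets:
            received[target] += share
    return received

# Start with equal importance and pass the votes once.
start = {page: Fraction(1, 3) for page in OUTLINKS}
after = flow_step(start)
for page, value in after.items():
    print(page, value)
```

After one step the pages hold 1/3, 1/2, and 1/6 respectively. The total is still 1, since every page hands out exactly the importance it holds.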

Solving the flow equations
• 3 equations, 3 unknowns, no constants
    - No unique solution
    - All solutions equivalent modulo scale factor
• Additional constraint forces uniqueness
    - y + a + m = 1
    - y = 2/5, a = 2/5, m = 1/5
• Gaussian elimination method works for small examples, but we need a better method for large graphs
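
For a graph this small, the elimination can be carried out directly. The following is a minimal sketch, assuming the redundant third flow equation is replaced by the constraint y + a + m = 1; the helper `solve_linear` is a hypothetical name written for this illustration, and it uses exact fractions to avoid rounding.

```python
from fractions import Fraction

def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with exact fractions.

    Assumes the system has a unique solution; a singular system makes the
    pivot search below raise StopIteration.
    """
    n = len(matrix)
    # Augmented matrix [matrix | rhs], copied so the inputs stay untouched.
    aug = [[Fraction(v) for v in row] + [Fraction(b)]
           for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Pick a row with a nonzero entry in this column as the pivot row.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

half = Fraction(1, 2)
coefficients = [
    [half, -half, 0],  # y - (y/2 + a/2) = 0
    [-half, 1, -1],    # a - (y/2 + m)   = 0
    [1, 1, 1],         # y + a + m       = 1 (replaces m = a/2)
]
y, a, m = solve_linear(coefficients, [0, 0, 1])
print(y, a, m)
```

The result matches the values on the slide: y = 2/5, a = 2/5, m = 1/5. Elimination of this kind takes on the order of N³ operations for N pages, which is why it does not scale to large graphs.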

Matrix formulation
• Matrix M has one row and one column for each web page
• Suppose page j has n outlinks
    - If j → i, then $M_{ij} = 1/n$
    - Else $M_{ij} = 0$
• M is a column stochastic matrix
    - Columns sum to 1
• Suppose r is a vector with one entry per web page
    - $r_i$ is the importance score of page i
    - Call it the rank vector
    - $|r| = 1$
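
The definition of M translates into a few lines of code. This is a sketch under stated assumptions: every page has at least one outlink, no link is listed twice, and the names `build_web_matrix`, `pages`, and `outlinks` are chosen here for illustration only.

```python
from fractions import Fraction

def build_web_matrix(pages, outlinks):
    """Return M as a list of rows, with M[i][j] = 1/n if page j links to page i.

    n is the number of outlinks of page j; all other entries are 0.
    """
    index = {page: k for k, page in enumerate(pages)}
    size = len(pages)
    matrix = [[Fraction(0)] * size for _ in range(size)]
    for source, targets in outlinks.items():
        j = index[source]
        for target in targets:
            matrix[index[target]][j] = Fraction(1, len(targets))
    return matrix

# The three-page example: Yahoo (y), Amazon (a), M'soft (m).
pages = ["y", "a", "m"]
outlinks = {"y": ["y", "a"], "a": ["y", "m"], "m": ["a"]}
M = build_web_matrix(pages, outlinks)

# Column-stochastic check: the entries of every column sum to 1.
column_sums = [sum(row[j] for row in M) for j in range(len(pages))]
print(column_sums)
```

Each column belongs to a source page and spreads that page's single unit of vote over its outlinks, so the column sums come out as 1 whenever the assumptions above hold.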

Example
Suppose page j links to 3 pages, including i.
[Figure: in the product r = Mr, column j of M has the entry 1/3 in row i, so $M_{ij} = 1/3$.]

Eigenvector formulation
• The flow equations can be written r = Mr
• So the rank vector is an eigenvector of the stochastic web matrix
    - In fact, its first or principal eigenvector, with corresponding eigenvalue 1

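Example
[Figure: the Yahoo, Amazon, M’soft graph.]
Web matrix M (column = source page, row = destination page):
        y     a     m
    y   1/2   1/2   0
    a   1/2   0     1
    m   0     1/2   0
Flow equations:
    y = y/2 + a/2
    a = y/2 + m
    m = a/2
In matrix form, r = Mr:
$$\begin{bmatrix} y \\ a \\ m \end{bmatrix} = \begin{bmatrix} 1/2 & 1/2 & 0 \\ 1/2 & 0 & 1 \\ 0 & 1/2 & 0 \end{bmatrix} \begin{bmatrix} y \\ a \\ m \end{bmatrix}$$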
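
The fixed-point property r = Mr can be checked by hand or with a short script. The sketch below multiplies the example matrix by the normalized solution y = 2/5, a = 2/5, m = 1/5; the helper name `mat_vec` is an assumption made for this illustration.

```python
from fractions import Fraction

def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of entries)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, vector)) for row in matrix]

half = Fraction(1, 2)
M = [
    [half, half, 0],  # y = y/2 + a/2
    [half, 0, 1],     # a = y/2 + m
    [0, half, 0],     # m = a/2
]
r = [Fraction(2, 5), Fraction(2, 5), Fraction(1, 5)]

# r is unchanged by M, so it is an eigenvector with eigenvalue 1.
is_fixed_point = mat_vec(M, r) == r

# Any rescaling of r is also a solution, which is why a normalization is needed.
scaled = [3 * x for x in r]
scaled_is_fixed_point = mat_vec(M, scaled) == scaled
print(is_fixed_point, scaled_is_fixed_point)
```

Both checks hold, which illustrates the two earlier points: the rank vector is an eigenvector with eigenvalue 1, and solutions are unique only up to a scale factor.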

Power Iteration method
• Simple iterative scheme (aka relaxation)
• Suppose there are N web pages
• Initialize: $r_0 = [1/N, \ldots, 1/N]^T$
• Iterate: $r_{k+1} = M r_k$
• Stop when $|r_{k+1} - r_k|_1 < \varepsilon$
    - $|x|_1 = \sum_{1 \le i \le N} |x_i|$ is the L1 norm
    - Can use any other vector norm, e.g., Euclidean
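
The scheme above can be sketched in plain Python as follows. The function name `power_iteration`, the default tolerance, and the iteration cap are assumptions added for this illustration; the slides specify only the initialization, the update, and the stopping test.

```python
def power_iteration(matrix, epsilon=1e-10, max_iterations=1000):
    """Iterate r <- M r from the uniform vector until the L1 change is below epsilon.

    Returns the final vector and the number of iterations performed.
    If the cap is reached first, the last iterate is returned as is.
    """
    n = len(matrix)
    r = [1.0 / n] * n  # r_0 = [1/N, ..., 1/N]^T
    for iteration in range(1, max_iterations + 1):
        r_next = [sum(matrix[i][j] * r[j] for j in range(n)) for i in range(n)]
        # L1 norm of the difference between successive iterates.
        change = sum(abs(new - old) for new, old in zip(r_next, r))
        r = r_next
        if change < epsilon:
            break
    return r, iteration

# Web matrix of the Yahoo / Amazon / M'soft example.
M = [
    [0.5, 0.5, 0.0],
    [0.5, 0.0, 1.0],
    [0.0, 0.5, 0.0],
]
rank, steps = power_iteration(M)
print(rank, steps)
```

On this example the iterates approach y = 2/5, a = 2/5, m = 1/5, the same values obtained by solving the flow equations directly. Each iteration costs one matrix-vector product, which is what makes the method attractive for large graphs.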