TopicSpecificPageRank

CS345 Data Mining Page Rank Variants

Review Page Rank
  • Web graph encoded by matrix $M$
      ◦ $N \times N$ matrix ($N$ = number of web pages)
      ◦ $M_{ij} = 1/|O(j)|$ iff there is a link from $j$ to $i$
      ◦ $M_{ij} = 0$ otherwise
      ◦ $O(j)$ = set of pages node $j$ links to
  • Define matrix $A$ as follows
      ◦ $A_{ij} = \beta M_{ij} + (1-\beta)/N$, where $0 < \beta < 1$
      ◦ $1-\beta$ is the “tax” discussed in prior lecture
  • Page rank $r$ is the principal eigenvector of $A$ (eigenvalue 1)
      ◦ $Ar = r$
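
The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference code: the three-page graph, the function names `build_A` and `page_rank`, and the choice $\beta = 0.8$ are all assumptions made for the example. It also assumes every page has at least one out-link, so that each column of $M$ sums to 1. The vector $r$ is found by repeatedly applying $A$ (power iteration), one common way to approximate the principal eigenvector.

```python
def build_A(links, beta):
    """Build A_ij = beta * M_ij + (1 - beta) / N.

    links[j] is the list O(j) of pages that page j links to
    (hypothetical input format; assumes no page has an empty list).
    """
    n = len(links)
    # Every entry starts with the uniform teleport term (1 - beta) / N.
    A = [[(1.0 - beta) / n for _ in range(n)] for _ in range(n)]
    for j, out in enumerate(links):
        for i in out:
            A[i][j] += beta / len(out)  # beta * M_ij with M_ij = 1/|O(j)|
    return A


def page_rank(A, iterations=100):
    """Approximate r with Ar = r by power iteration from the uniform vector."""
    n = len(A)
    r = [1.0 / n] * n
    for _ in range(iterations):
        r = [sum(A[i][j] * r[j] for j in range(n)) for i in range(n)]
    return r


# Toy graph (an assumption for illustration): 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
links = [[1, 2], [2], [0]]
A = build_A(links, beta=0.8)
r = page_rank(A)
print(r)
```

Because each column of $A$ sums to 1, the entries of `r` keep summing to 1 at every iteration, so no renormalization step is needed in this sketch.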

Random walk interpretation
  • At time 0, pick a page on the web uniformly at random to start the walk
  • Suppose at time $t$, we are at page $j$
  • At time $t+1$
      ◦ With probability $\beta$, pick a page uniformly at random from $O(j)$ and walk to it
      ◦ With probability $1-\beta$, pick a page on the web uniformly at random and teleport into it
  • Page rank of page $p$ = “steady state” probability that at any given time, the random walker is at page $p$
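
The walk described above is easy to simulate. The following is a rough sketch under stated assumptions: the toy graph, the name `simulate_walk`, the step count, and the fixed seed are illustrative choices, and every page is assumed to have at least one out-link. The fraction of time steps spent on each page serves as an estimate of the steady-state probability.

```python
import random


def simulate_walk(links, beta, steps, seed=0):
    """Estimate page rank as the fraction of time one walker spends on each page.

    links[j] is the list O(j) of pages that page j links to (hypothetical format).
    """
    rng = random.Random(seed)
    n = len(links)
    visits = [0] * n
    page = rng.randrange(n)  # time 0: start at a uniformly random page
    for _ in range(steps):
        visits[page] += 1
        if rng.random() < beta:
            page = rng.choice(links[page])  # follow a random out-link from O(j)
        else:
            page = rng.randrange(n)  # teleport to a uniformly random page
    return [v / steps for v in visits]


# Toy graph (an assumption for illustration): 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
links = [[1, 2], [2], [0]]
freq = simulate_walk(links, beta=0.8, steps=200_000)
print(freq)
```

The estimate is noisy; more steps bring the visit frequencies closer to the solution of $Ar = r$.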

Many random walkers
  • Alternative, equivalent model
  • Imagine a large number $M$ of independent, identical random walkers ($M \gg N$)
  • At any point in time, let $M(p)$ be the number of random walkers at page $p$
  • The page rank of $p$ is the fraction of random walkers that are expected to be at page $p$, i.e., $E[M(p)]/M$
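
As a rough illustration of this model, the sketch below releases many independent walkers and takes a snapshot of where they are after a fixed number of steps. The toy graph, the name `snapshot_fractions`, the walker count, and the number of steps are assumptions chosen for the example; the number of walkers is called `num_walkers` in the code to avoid confusion with the link matrix. Every page is assumed to have at least one out-link.

```python
import random


def snapshot_fractions(links, beta, num_walkers, steps, seed=0):
    """Return the fraction of walkers found at each page after `steps` moves."""
    rng = random.Random(seed)
    n = len(links)
    counts = [0] * n  # counts[p] plays the role of M(p)
    for _ in range(num_walkers):
        page = rng.randrange(n)  # each walker starts at a uniformly random page
        for _ in range(steps):
            if rng.random() < beta:
                page = rng.choice(links[page])  # walk along a random out-link
            else:
                page = rng.randrange(n)  # teleport
        counts[page] += 1
    return [c / num_walkers for c in counts]


# Toy graph (an assumption for illustration): 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
links = [[1, 2], [2], [0]]
fractions = snapshot_fractions(links, beta=0.8, num_walkers=10_000, steps=30)
print(fractions)
```

Here the average is taken across walkers at one instant rather than across time for one walker; both estimates approach the same rank vector.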

Problems with page rank
  • Measures generic popularity of a page
      ◦ Biased against topic-specific authorities
      ◦ Ambiguous queries, e.g., jaguar
      ◦ This lecture
  • Link spam
      ◦ Creating artificial link topographies in order to boost page rank
      ◦ Next lecture

Topic-Specific Page Rank
  • Instead of generic popularity, can we measure popularity within a topic?
      ◦ E.g., computer science, health
  • Bias the random walk
      ◦ When the random walker teleports, he picks a page from a set $S$ of web pages
      ◦ $S$ contains only pages that are relevant to the topic
      ◦ E.g., Open Directory (DMOZ) pages for a given topic (www.dmoz.org)
  • Corresponding to each teleport set $S$, we get a different rank vector $r_S$

Matrix formulation
  • $A_{ij} = \beta M_{ij} + (1-\beta)/|S|$ if $i \in S$
  • $A_{ij} = \beta M_{ij}$ otherwise
  • Show that $A$ is stochastic
  • We have weighted all pages in the teleport set $S$ equally
      ◦ Could also assign different weights to them
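
One way to see that $A$ is stochastic, assuming every page has at least one out-link: each column of $M$ sums to 1, and each column of $A$ receives the extra term $(1-\beta)/|S|$ in exactly $|S|$ rows, so the column sum is $\beta + (1-\beta) = 1$. The sketch below checks this numerically and computes $r_S$ by power iteration. The toy graph, the function names, $\beta = 0.8$, and the teleport set $S = \{1\}$ are assumptions for illustration only.

```python
def build_topic_A(links, teleport_set, beta):
    """A_ij = beta * M_ij + (1 - beta)/|S| if i in S, else beta * M_ij.

    links[j] is the list O(j) of pages that page j links to
    (hypothetical format; assumes no page has an empty list).
    """
    n = len(links)
    S = set(teleport_set)
    A = [[0.0] * n for _ in range(n)]
    for j, out in enumerate(links):
        for i in out:
            A[i][j] += beta / len(out)  # beta * M_ij
    for i in S:
        for j in range(n):
            A[i][j] += (1.0 - beta) / len(S)  # teleport mass goes only to rows in S
    return A


def column_sums(A):
    n = len(A)
    return [sum(A[i][j] for i in range(n)) for j in range(n)]


def power_iterate(A, iterations=100):
    n = len(A)
    r = [1.0 / n] * n
    for _ in range(iterations):
        r = [sum(A[i][j] * r[j] for j in range(n)) for i in range(n)]
    return r


# Toy graph (an assumption for illustration): 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
links = [[1, 2], [2], [0]]
A_topic = build_topic_A(links, teleport_set={1}, beta=0.8)
sums = column_sums(A_topic)  # each entry should be 1 up to rounding
r_topic = power_iterate(A_topic)

# Taking S to be all pages recovers the generic page rank matrix.
r_generic = power_iterate(build_topic_A(links, teleport_set={0, 1, 2}, beta=0.8))
print(sums, r_topic, r_generic)
```

In this toy graph, page 1 scores higher under $S = \{1\}$ than under uniform teleportation, which is the intended bias toward the topic set. Unequal weights over $S$ could be handled by replacing the constant $(1-\beta)/|S|$ with $(1-\beta)$ times a per-page weight, provided the weights sum to 1.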