10-pagerank



Jure Leskovec, Stanford CS246: Mining Massive Datasets (2/7/2011)

Out-/In-Degree Distribution
Normalized count $p_k$: the fraction of nodes with $k$ out-/in-links.
Plot: histogram of $p_k$ vs. $k$.

Plot the same data on log-log axes
Normalized count $p_k$, with $p_k = \beta k^{-\alpha}$.
Taking logarithms gives a straight line: $\log p_k = \log \beta - \alpha \log k$.
[Broder et al., '00]

Random network: the degree distribution is Binomial, i.e., all nodes have similar degree.
Power-law network: degrees follow a power law, i.e., they are heavily skewed.

Web pages are not equally "important": www.joe-schmoe.com vs. www.stanford.edu
Since there is large diversity in the connectivity of the web graph, we can rank the pages by the link structure.

We will cover the following link analysis approaches to computing the importance of nodes in a graph:
PageRank
Hubs and Authorities (HITS)
Topic-Specific (Personalized) PageRank
Spam Detection Algorithms

First try: a page is more important if it has more links.
In-coming links? Out-going links?
Think of in-links as votes:
www.stanford.edu has 23,400 in-links
www.joe-schmoe.com has 1 in-link
Are all in-links equal?
Links from important pages count more.
Recursive question!

Each link's...
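
The degree-distribution slides can be made concrete with a small sketch. The Python below is an illustration only, not code from the lecture: the function names `in_degree_distribution` and `fit_power_law` and the toy edge list are assumptions introduced here. It counts in-links per node (the "votes" of the first-try ranking), normalizes the counts into $p_k$, and estimates $\alpha$ and $\beta$ by fitting a straight line to $(\log k, \log p_k)$. A plain least-squares fit on log-log axes is used because it mirrors the slide's equation; it is not claimed to be a robust estimator.

```python
import math
from collections import Counter

def in_degree_distribution(edges):
    """Return {k: p_k}: the fraction of nodes with exactly k in-links.

    `edges` is a list of (source, target) pairs; this input format is an
    assumption made for the example.
    """
    nodes = {u for u, _ in edges} | {v for _, v in edges}
    in_deg = Counter(v for _, v in edges)          # in-links act as votes
    counts = Counter(in_deg.get(n, 0) for n in nodes)
    total = len(nodes)
    return {k: c / total for k, c in sorted(counts.items())}

def fit_power_law(pk):
    """Fit log p_k = log(beta) - alpha * log(k) by ordinary least squares.

    Points with k == 0 or p_k == 0 are skipped because their logarithm is
    undefined. Needs at least two distinct k values. Returns (alpha, beta).
    """
    pts = [(math.log(k), math.log(p)) for k, p in pk.items() if k > 0 and p > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)

# Toy graph (hypothetical): three pages link to "c", and "c" links to "a".
toy_edges = [("a", "c"), ("b", "c"), ("d", "c"), ("c", "a")]
toy_pk = in_degree_distribution(toy_edges)

# Synthetic distribution that follows p_k = 0.5 * k^(-2) exactly.
alpha, beta = fit_power_law({1: 0.5, 2: 0.125, 4: 0.03125})
```

On the toy graph, half of the nodes have no in-links, while one node collects three, which is the kind of skew the power-law slide describes. On real web data the fit would be noisy in the tail, so the recovered $\alpha$ should be treated as a rough estimate.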

This document was uploaded on 02/26/2014 for the course CS 246 at Stanford.
