CS224W - Social and Information Network Analysis
Fall 2010
Assignment 2
Due 11:59pm November 4, 2010

General Instructions

You are required to write the names of your collaborators for this assignment on your solution report. You are also required to submit the source code used to obtain your solutions along with your report. Your write-ups are to be original, and all external references must be duly cited. The name beside each question indicates the TA handling that question. It would help to plan your visits to our office hours accordingly.

Submission Instructions

We expect you to be able to access your Dropbox folder at http://coursework.stanford.edu . All files you want us to see are to be archived (or zipped) together into a single file and placed into this Dropbox folder. This file should be named in the format: <SUNetId> <Last-Name> <First-Name> HW1, where these are your last and first names respectively. Within this file, we require at least the following:

a. Your solution report. You can submit it as a .pdf or .doc. This report should be named: <SUNetId> <Last-Name> <First-Name> HW1 Ans (.doc or .pdf)
b. Any source code that you use towards obtaining the results stated in your solution report. Also mention any tools you use towards your solution(s).

Questions

Question 1. Power Laws and Preferential Attachment (30 points - Jen)

Part 1: Empirical Power Laws

Generate a dataset of 100,000 values following a power-law distribution: $h(x) \propto x^{-\alpha}$ with exponent $\alpha = 2.5$. Refer to the paper "Power-law distributions in empirical data" by Clauset, Shalizi and Newman for how to generate random numbers from a power-law distribution (note: reading this paper is very helpful in answering this question!). Since the probability density diverges as $x \to 0$, you will need to bound your distribution, so let $x_{\min} = 1$.
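One common way to generate such values is inverse-transform sampling: draw $r$ uniformly from $[0, 1)$ and set $x = x_{\min} (1 - r)^{-1/(\alpha - 1)}$. The sketch below is only an illustration under that assumption; the function name `sample_power_law` and its parameters are hypothetical and not part of the assignment.

```python
import random


def sample_power_law(n, alpha=2.5, x_min=1.0, seed=None):
    """Draw n values from a continuous power law p(x) ~ x^(-alpha), x >= x_min.

    Hypothetical helper: uses inverse-transform sampling on uniform draws.
    """
    rng = random.Random(seed)
    exponent = -1.0 / (alpha - 1.0)
    # rng.random() lies in [0, 1), so (1 - r) lies in (0, 1] and is never zero.
    return [x_min * (1.0 - rng.random()) ** exponent for _ in range(n)]


# 100,000 values with alpha = 2.5 and x_min = 1, as in the problem statement.
samples = sample_power_law(100_000, alpha=2.5, x_min=1.0, seed=0)
```

A quick sanity check is that every sample is at least `x_min` and that the sample median lands close to $x_{\min} \cdot 2^{1/(\alpha - 1)}$, the median implied by this sampling formula.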
(a) Plot on a log-log scale $P(X = x)$, the probability density function (pdf). Your plot can be a normalized histogram of the data you generated. To check that you generated your data correctly, you can optionally include the actual probability density function as a line on the same plot.

(b) Plot on a log-log scale $P(X < x)$, the cumulative distribution function (cdf). You can plot this as a normalized cumulative histogram of your generated data. As above, you can optionally include the actual cumulative distribution function as a line on the same plot.

(c) Make a plot of $P(X \geq x)$. This is called the complementary cumulative distribution function (ccdf). What do you notice about the ccdf in relation to the other two graphs? Show mathematically that if a distribution is a power law and the exponent of the pdf is $\alpha$, then the exponent of the ccdf is equal to $\alpha - 1$....
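For a pdf proportional to $x^{-\alpha}$ with $\alpha > 1$, integrating from $x$ to infinity gives a ccdf proportional to $x^{-(\alpha - 1)}$, so the ccdf should look like a straight line of slope about $-(\alpha - 1)$ on log-log axes. The sketch below checks this numerically on stand-in data. The helper names `empirical_ccdf` and `loglog_slope` are hypothetical, and the least-squares fit is only a rough check, not the fitting procedure recommended by Clauset, Shalizi and Newman (who advocate maximum likelihood).

```python
import math
import random


def empirical_ccdf(data):
    """Return the sorted values and the estimate of P(X >= x) at each one.

    Assumes continuous data with no repeated values.
    """
    xs = sorted(data)
    n = len(xs)
    # The i-th smallest value (0-based) has n - i samples at or above it.
    return xs, [(n - i) / n for i in range(n)]


def loglog_slope(xs, ps):
    """Least-squares slope of log(ps) against log(xs)."""
    lx = [math.log(x) for x in xs]
    lp = [math.log(p) for p in ps]
    mean_x = sum(lx) / len(lx)
    mean_p = sum(lp) / len(lp)
    num = sum((a - mean_x) * (b - mean_p) for a, b in zip(lx, lp))
    den = sum((a - mean_x) ** 2 for a in lx)
    return num / den


# Stand-in data: power-law samples with alpha = 2.5, x_min = 1 (illustrative only).
rng = random.Random(0)
alpha, x_min = 2.5, 1.0
data = [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(100_000)]

xs, ccdf = empirical_ccdf(data)
# Drop the 100 largest points, where the empirical ccdf is noisiest.
slope = loglog_slope(xs[:-100], ccdf[:-100])
```

Under these assumptions `slope` should come out near $-(\alpha - 1) = -1.5$; the exact value depends on the random seed and on how much of the tail is excluded.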