19_4 - Small-World File-Sharing Communities Adriana...

Small-World File-Sharing Communities

Adriana Iamnitchi, Matei Ripeanu, Ian Foster
Department of Computer Science
The University of Chicago
Chicago, IL 60637
{anda, matei, foster}@cs.uchicago.edu

Abstract

Web caches, content distribution networks, peer-to-peer file sharing networks, distributed file systems, and data grids all have in common that they involve a community of users who generate requests for shared data. In each case, overall system performance can be improved significantly if we can first identify and then exploit interesting structure within a community's access patterns. To this end, we propose a novel perspective on file sharing that considers the relationships that form among users based on the files in which they are interested. We propose a new structure that captures common user interests in data, the data-sharing graph, and justify its utility with studies on three data-distribution systems: a high-energy physics collaboration, the Web, and the Kazaa peer-to-peer network. We find small-world patterns in the data-sharing graphs of all three communities. We analyze these graphs and propose some probable causes for these emergent small-world patterns. The significance of small-world patterns is twofold: it provides a rigorous support to intuition and, perhaps most importantly, it suggests ways to design mechanisms that exploit these naturally emerging patterns.

I. INTRODUCTION

Large-scale, Internet-connected distributed systems are notoriously difficult to manage. In a resource-sharing environment such as a peer-to-peer system that connects hundreds of thousands of computers in an ad-hoc network, intermittent resource participation, large and variable scale, and high failure rates are challenges that often impose performance tradeoffs.
Thus, existing P2P file-location mechanisms favor specific requirements: in Gnutella, the emphasis is on accommodating highly volatile peers and on fast file retrieval, with no guarantees that files will always be located. In Freenet [1], the emphasis is on ensuring anonymity. In contrast, distributed hash tables such as CAN [2], Chord [3], Pastry [4], and Tapestry [5] guarantee that files will always be located, but do not support wildcard searches.

One way to optimize these tradeoffs is to understand user behavior. In this paper we analyze user behavior in three file-sharing communities in an attempt to get inspiration for designing efficient mechanisms for large-scale, dynamic, self-organizing resource-sharing communities. We look at these communities in a novel way: we study the relationships that form among users based on the data in which they are interested. We capture and quantify these relationships by modeling the community as a data-sharing graph. To this end, we propose a new structure that captures common user interests in data (Section III) and justify its utility with studies on three data-distribution systems (Section IV): a high-energy physics collaboration, the Web, and the Kazaa peer-to-peer network. We find small-world patterns in the data-sharing graphs of all three communities (Section V).
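
To make the idea of a data-sharing graph concrete, here is a minimal Python sketch. It is an illustration only, not the authors' implementation: the preview does not give the formal definition from Section III, so the edge rule used here (two users are linked when they requested at least `min_shared` common files) is an assumption, as are the function names and the toy request log. The two metrics computed, clustering coefficient and average path length, are the quantities commonly used to characterize small-world graphs (high clustering combined with short paths).

```python
from collections import defaultdict, deque
from itertools import combinations


def build_data_sharing_graph(requests, min_shared=1):
    """Build an undirected graph over users from (user, file) request pairs.

    Assumption (not from the source): two users are connected when they
    requested at least `min_shared` files in common.
    """
    files_by_user = defaultdict(set)
    for user, file_id in requests:
        files_by_user[user].add(file_id)

    graph = {user: set() for user in files_by_user}
    for u, v in combinations(sorted(files_by_user), 2):
        if len(files_by_user[u] & files_by_user[v]) >= min_shared:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def clustering_coefficient(graph, node):
    """Fraction of pairs of a node's neighbors that are themselves linked."""
    neighbors = sorted(graph[node])
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbors, 2) if b in graph[a])
    return 2.0 * links / (k * (k - 1))


def average_path_length(graph):
    """Mean shortest-path length over all connected ordered pairs (BFS)."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for nxt in graph[current]:
                if nxt not in dist:
                    dist[nxt] = dist[current] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Hypothetical request log: (user, file) pairs.
requests = [
    ("alice", "f1"), ("alice", "f2"),
    ("bob", "f2"), ("bob", "f3"),
    ("carol", "f2"), ("carol", "f3"),
    ("dave", "f3"), ("dave", "f4"),
]

graph = build_data_sharing_graph(requests, min_shared=1)
avg_clustering = sum(clustering_coefficient(graph, u) for u in graph) / len(graph)
print(graph)
print(avg_clustering, average_path_length(graph))
```

Raising `min_shared` makes the graph sparser, since only users with stronger overlap in interests stay connected. The pairwise comparison is quadratic in the number of users, which is fine for a sketch but would need an inverted file-to-users index for logs of realistic size.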

This note was uploaded on 12/08/2011 for the course CS 525 taught by Professor Gupta during the Spring '08 term at University of Illinois, Urbana Champaign.
