p2p-search

Search, Pollution, and Poisoning in P2P File-Sharing Networks
John Chuang
School of Information Management and Systems
University of California at Berkeley
[email protected]
http://p2pecon.berkeley.edu/
Guest Lecture for IS290: Search Engines, October 3, 2005

P2P Search and WWW Search
- Similarities
  - Scale and scope: Kazaa alone has 4 petabytes of data shared by 3 million peers (2004)
- Differences
  - Highly dynamic: peers come and go
  - Short session durations → high content volatility

P2P File-Sharing Networks
- 1st generation: centralized index
  - e.g., Napster
- 2nd generation: decentralized indices
  - e.g., Gnutella v0.4, Freenet
- 3rd generation: hierarchical
  - e.g., FastTrack (KaZaA, Grokster, Morpheus), eDonkey2000, Gnutella v0.6
- 4th generation?: structured topologies
  - e.g., Overnet using Kademlia DHT
- Note: BitTorrent has no built-in search mechanism; various darknet proposals for small-scale “F2F” networks

Napster
- Maintains a centralized index that maps files to machines
- How to find a file
  - Query the index system → return a list of peers that store the requested file
  - Transfer the file directly from peer(s)
- Advantage:
  - Simplicity: easy to implement sophisticated search engines on top of the index system
- Disadvantage:
  - Single point of failure
[Diagram: machines m1 to m6 hold files A to F; the central index lists m1: A, m2: B, m3: C, m4: D, m5: E, m6: F. A query "E?" to the index returns m5, and E is then transferred from m5.]
Slide adapted from Ion Stoica, Nicolas Christin
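The lookup flow on this slide can be sketched in a few lines of Python. This is only an illustration of the centralized-index idea, not Napster's actual protocol: the class name `CentralIndex` and its methods are hypothetical, and the machine and file names are taken from the slide's diagram.

```python
from collections import defaultdict


class CentralIndex:
    """Toy centralized index mapping file names to the peers that share them.

    Hypothetical illustration only; not the real Napster wire protocol.
    """

    def __init__(self):
        self._peers_by_file = defaultdict(set)

    def register(self, peer, files):
        # A peer announces the files it shares when it joins.
        for name in files:
            self._peers_by_file[name].add(peer)

    def unregister(self, peer):
        # A departing peer is removed from every entry.
        for peers in self._peers_by_file.values():
            peers.discard(peer)

    def lookup(self, name):
        # Return the peers that store the requested file (possibly none).
        return sorted(self._peers_by_file.get(name, set()))


index = CentralIndex()
for peer, shared in [("m1", ["A"]), ("m2", ["B"]), ("m3", ["C"]),
                     ("m4", ["D"]), ("m5", ["E"]), ("m6", ["F"])]:
    index.register(peer, shared)

holders = index.lookup("E")  # the file itself is then fetched directly from a holder
```

The single point of failure is visible in the sketch: every query goes through the one `index` object, so if it is unavailable no file can be located even though all peers still hold their data.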
Gnutella (v0.4)
- Flood the request
- How to find a file:
  - Send request to all neighbors
  - Neighbors recursively propagate the request
  - Eventually a machine that has the file receives the request, and it sends back the answer
- Advantages:
  - Totally decentralized, highly robust
- Disadvantages:
  - The entire network can be swamped with a request
  - Can be alleviated using TTLs, but can then fail to locate files (and still high resource usage)
[Diagram: machines m1 to m6 hold files A to F; the query "E?" is flooded hop by hop until it reaches the machine holding E, which returns E.]
Assume: m1’s neighbors are m2 and m3; m3’s neighbors are m4 and m5;…
Slide adapted from Ion Stoica, Nicolas Christin
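The TTL trade-off described above can be made concrete with a small simulation. The sketch below is a simplified model of flooding, not the Gnutella v0.4 message format: `flood_search` is a hypothetical helper, the topology uses the neighbor relations stated on the slide, and the m5 to m6 link is an added assumption so that every machine is reachable.

```python
from collections import deque


def flood_search(neighbors, files, start, wanted, ttl):
    """Flood a query from `start`; return (machines holding the file, messages sent).

    Each machine forwards the query to every neighbor except the one it
    came from, and the TTL drops by one per hop.
    """
    seen = {start}
    queue = deque([(start, None, ttl)])
    hits, messages = [], 0
    while queue:
        node, sender, remaining = queue.popleft()
        if wanted in files.get(node, ()):
            hits.append(node)
        if remaining == 0:
            continue  # TTL exhausted: the query dies here
        for nxt in neighbors.get(node, ()):
            if nxt == sender:
                continue
            messages += 1          # every forwarded copy costs a message
            if nxt not in seen:    # duplicate copies are dropped on arrival
                seen.add(nxt)
                queue.append((nxt, node, remaining - 1))
    return hits, messages


# Neighbor lists from the slide; the m5-m6 link is an assumption.
neighbors = {
    "m1": ["m2", "m3"], "m2": ["m1"], "m3": ["m1", "m4", "m5"],
    "m4": ["m3"], "m5": ["m3", "m6"], "m6": ["m5"],
}
files = {"m1": {"A"}, "m2": {"B"}, "m3": {"C"},
         "m4": {"D"}, "m5": {"E"}, "m6": {"F"}}

too_short = flood_search(neighbors, files, "m1", "E", ttl=1)
far_enough = flood_search(neighbors, files, "m1", "E", ttl=2)
```

With a TTL of 1 the query never reaches m5, so the file is not found even though it exists; with a TTL of 2 it is found, at the cost of more messages. On a larger, densely connected network the message count grows much faster than in this six-machine example.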

Hierarchical networks
- Use two-level hierarchy
  - Some nodes are elected as “super nodes” or “ultra-peers”
  - Each ultra-peer serves as centralized index for a portion of the network
  - If an ultra-peer does not know where to find an item, query is forwarded to other ultra-peers
- Advantage:
  - Reduce the amount of network traffic compared to “naïve” flooding
- Disadvantage:
  - Ultra-peers vulnerable to attacks
  - Potential convergence problems when ultra-peers leave abruptly
- Used in FastTrack (KaZaA, Grokster, Morpheus), eDonkey2000, Gnutella v0.6
[Diagram: machines m1 to m4, files A to F, and a forwarded query "F?"]
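A minimal sketch of the two-level lookup follows. It is an assumption-laden toy, not the FastTrack, eDonkey2000, or Gnutella v0.6 protocol: the `UltraPeer` class, the ultra-peer names `u1` and `u2`, and the assignment of leaves to ultra-peers are all hypothetical, and forwarding is done one ultra-peer at a time for simplicity.

```python
class UltraPeer:
    """Toy ultra-peer: a central index for its own leaves only (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.index = {}    # file name -> set of leaf peers sharing it
        self.others = []   # other ultra-peers this one can forward to

    def attach_leaf(self, leaf, files):
        # A leaf uploads its file list to its ultra-peer.
        for name in files:
            self.index.setdefault(name, set()).add(leaf)

    def local_lookup(self, name):
        return sorted(self.index.get(name, ()))

    def query(self, name):
        """Return (leaves holding the file, number of ultra-peers asked)."""
        hits = self.local_lookup(name)
        if hits:
            return hits, 0
        forwarded = 0
        for other in self.others:
            forwarded += 1
            hits = other.local_lookup(name)
            if hits:
                break
        return hits, forwarded


# Assumed layout: u1 indexes m1 to m3, u2 indexes m4 to m6.
u1, u2 = UltraPeer("u1"), UltraPeer("u2")
u1.others, u2.others = [u2], [u1]
for up, leaf, shared in [(u1, "m1", ["A"]), (u1, "m2", ["B"]), (u1, "m3", ["C"]),
                         (u2, "m4", ["D"]), (u2, "m5", ["E"]), (u2, "m6", ["F"])]:
    up.attach_leaf(leaf, shared)

local_hit = u1.query("A")   # answered from u1's own index
remote_hit = u1.query("F")  # u1 misses, so the query is forwarded to u2
```

Only ultra-peers exchange query traffic here, which is where the saving over naïve flooding comes from. The sketch also hints at the weakness noted on the slide: if `u2` disappears, the index entries for all of its leaves disappear with it until those leaves attach elsewhere.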

This note was uploaded on 09/03/2011 for the course SIMS 141 taught by Professor Staff during the Spring '11 term at Berkeley.