p2p-search

p2p-search - Search, Pollution, and Poisoning in P2P...

Search, Pollution, and Poisoning in P2P File-Sharing Networks
John Chuang
School of Information Management and Systems
University of California at Berkeley
chuang@sims.berkeley.edu
http://p2pecon.berkeley.edu/
Guest Lecture for IS290: Search Engines, October 3, 2005

P2P Search and WWW Search
- Similarities
  - Scale and scope: Kazaa alone has 4 petabytes of data shared by 3 million peers (2004)
- Differences
  - Highly dynamic: peers come and go
  - Short session durations → high content volatility
P2P File-Sharing Networks
- 1st generation: centralized index
  - e.g., Napster
- 2nd generation: decentralized indices
  - e.g., Gnutella v0.4, Freenet
- 3rd generation: hierarchical
  - e.g., FastTrack (KaZaA, Grokster, Morpheus), eDonkey2000, Gnutella v0.6
- 4th generation?: structured topologies
  - e.g., Overnet using Kademlia DHT
- Note: BitTorrent has no built-in search mechanism; various darknet proposals for small-scale “F2F” networks

Napster
- Maintains a centralized index that maps files to machines
- How to find a file
  - Query the index system → return a list of peers that store the requested file
  - Transfer the file directly from peer(s)
- Advantage:
  - Simplicity: easy to implement sophisticated search engines on top of the index system
- Disadvantage:
  - Single point of failure
[Figure: machines m1 to m6 store files A to F (m1: A, m2: B, m3: C, m4: D, m5: E, m6: F). A peer sends the query "E?" to the index, receives "m5", then sends "E?" to m5 and receives E.]
Slide adapted from Ion Stoica, Nicolas Christin
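The lookup path on the Napster slide can be sketched in a few lines of Python. This is a minimal illustration, not Napster's actual protocol: the class name `CentralIndex` and its methods are hypothetical, and the peer and file names are taken from the slide's figure.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical stand-in for a Napster-style central index server."""

    def __init__(self):
        # file name -> set of peers that currently share it
        self._peers_by_file = defaultdict(set)

    def register(self, peer, files):
        """A peer joins and announces the files it shares."""
        for name in files:
            self._peers_by_file[name].add(peer)

    def unregister(self, peer):
        """A peer leaves; drop it from every entry."""
        for peers in self._peers_by_file.values():
            peers.discard(peer)

    def lookup(self, name):
        """Return the peers that store the file; the transfer itself is peer to peer."""
        return sorted(self._peers_by_file.get(name, ()))


# Layout from the slide's figure: m1 holds A, m2 holds B, ..., m6 holds F.
index = CentralIndex()
for peer, name in zip(["m1", "m2", "m3", "m4", "m5", "m6"], "ABCDEF"):
    index.register(peer, [name])

hits = index.lookup("E")   # the "E?" query from the figure
```

Every query and every join or leave goes through the one `index` object, which is why search is simple to build on top of it and also why it is a single point of failure.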
Gnutella (v0.4)
- Flood the request
- How to find a file:
  - Send request to all neighbors
  - Neighbors recursively propagate the request
  - Eventually a machine that has the file receives the request, and it sends back the answer
- Advantages:
  - Totally decentralized, highly robust
- Disadvantages:
  - The entire network can be swamped with a request
  - Can be alleviated using TTLs, but can then fail to locate files (and still high resource usage)
[Figure: machines m1 to m6 and files A to F; the query "E?" is flooded from neighbor to neighbor until it reaches the machine holding E, which sends E back.]
Assume: m1’s neighbors are m2 and m3; m3’s neighbors are m4 and m5; …
Slide adapted from Ion Stoica, Nicolas Christin
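The flooding search and its TTL trade-off can be illustrated with a small simulation. This is a sketch under stated assumptions, not the Gnutella wire protocol: the function `flood` is hypothetical, only the two neighbor relations stated on the slide are used (the rest of the topology is elided there and is left out), links are assumed to be symmetric, and the file placement (m1 holds A through m5 holds E) is assumed to match the Napster figure.

```python
from collections import deque

# Neighbor relations from the slide (m1: m2, m3; m3: m4, m5), assumed symmetric.
NEIGHBORS = {
    "m1": ["m2", "m3"],
    "m2": ["m1"],
    "m3": ["m1", "m4", "m5"],
    "m4": ["m3"],
    "m5": ["m3"],
}
# Assumed file placement: m1 holds A, ..., m5 holds E.
FILES = {"m1": {"A"}, "m2": {"B"}, "m3": {"C"}, "m4": {"D"}, "m5": {"E"}}


def flood(start, wanted, ttl):
    """Flood a query from `start`; return (machines holding the file, messages sent)."""
    seen = {start}
    frontier = deque([(start, ttl)])
    hits, messages = [], 0
    while frontier:
        node, remaining = frontier.popleft()
        if wanted in FILES[node]:
            hits.append(node)
        if remaining == 0:
            continue                      # TTL exhausted: do not forward further
        for neighbor in NEIGHBORS[node]:
            messages += 1                 # every forwarded copy costs a message
            if neighbor not in seen:      # a repeated query is dropped on arrival
                seen.add(neighbor)
                frontier.append((neighbor, remaining - 1))
    return hits, messages


found_far = flood("m1", "E", ttl=2)    # (['m5'], 6)
found_near = flood("m1", "E", ttl=1)   # ([], 2)
```

With a TTL of 2 the query reaches m5 at the cost of six messages in a five-node network; with a TTL of 1 it costs only two messages but never reaches the file, which is the failure mode the slide describes.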

Hierarchical networks
- Use two-level hierarchy
  - Some nodes are elected as “super nodes” or “ultra-peers”
  - Each ultra-peer serves as centralized index for a portion of the network
  - If an ultra-peer does not know where to find an item, query is forwarded to other ultra-peers
- Advantage:
  - Reduce the amount of network traffic compared to “naïve” flooding
- Disadvantage:
  - Ultra-peers vulnerable to attacks
  - Potential convergence problems when ultra-peers leave abruptly
- Used in FastTrack (KaZaA, Grokster, Morpheus), eDonkey2000, Gnutella v0.6
[Figure: machines m1 to m4 and files A to F; the query "F?" is shown being forwarded.]
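The two-level lookup can be sketched as follows. This is a simplified illustration, not the FastTrack, eDonkey2000, or Gnutella v0.6 protocol: the class `UltraPeer`, the node names (`u1`, `u2`, `leaf1` to `leaf4`), the file placement, and the single hop of forwarding between ultra-peers are all assumptions made for the example.

```python
class UltraPeer:
    """Hypothetical ultra-peer: a small central index for its own leaf peers."""

    def __init__(self, name):
        self.name = name
        self.index = {}     # file name -> set of attached leaves sharing it
        self.others = []    # other ultra-peers that queries can be forwarded to

    def attach(self, leaf, files):
        """A leaf peer connects and uploads its file list."""
        for name in files:
            self.index.setdefault(name, set()).add(leaf)

    def query(self, wanted):
        """Return (leaves holding the file, number of ultra-peer forwards used)."""
        if wanted in self.index:
            return sorted(self.index[wanted]), 0      # answered from the local index
        forwards = 0
        for other in self.others:                     # unknown item: ask other ultra-peers
            forwards += 1
            if wanted in other.index:
                return sorted(other.index[wanted]), forwards
        return [], forwards


# Assumed layout: two ultra-peers, each indexing two leaves.
u1, u2 = UltraPeer("u1"), UltraPeer("u2")
u1.others, u2.others = [u2], [u1]
u1.attach("leaf1", ["A", "B"])
u1.attach("leaf2", ["C"])
u2.attach("leaf3", ["D", "E"])
u2.attach("leaf4", ["F"])

local_hit = u1.query("A")    # answered by u1 itself
remote_hit = u1.query("F")   # forwarded once to u2
```

Only ultra-peers exchange query traffic here; leaves are contacted solely for the file transfer, which is where the traffic saving over naïve flooding comes from.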