PeerReview: Practical Accountability for Distributed Systems

Andreas Haeberlen, Petr Kuznetsov, and Peter Druschel
Max Planck Institute for Software Systems, Rice University

ABSTRACT

We describe PeerReview, a system that provides accountability in distributed systems. PeerReview ensures that Byzantine faults whose effects are observed by a correct node are eventually detected and irrefutably linked to a faulty node. At the same time, PeerReview ensures that a correct node can always defend itself against false accusations. These guarantees are particularly important for systems that span multiple administrative domains, which may not trust each other.

PeerReview works by maintaining a secure record of the messages sent and received by each node. The record is used to automatically detect when a node's behavior deviates from that of a given reference implementation, thus exposing faulty nodes. PeerReview is widely applicable: it only requires that a correct node's actions are deterministic, that nodes can sign messages, and that each node is periodically checked by a correct node. We demonstrate that PeerReview is practical by applying it to three different types of distributed systems: a network filesystem, a peer-to-peer system, and an overlay multicast system.

Categories and Subject Descriptors

C.2.4 [Computer Systems Organization]: Computer-Communication Networks - Distributed Systems; D.4.5 [Software]: Operating Systems - Reliability

General Terms

Algorithms, Design, Reliability, Security

Keywords

Accountability, fault detection, distributed systems, Byzantine faults

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SOSP'07, October 14-17, 2007, Stevenson, Washington, USA.
Copyright 2007 ACM 978-1-59593-591-5/07/0010 ...$5.00.

1. INTRODUCTION

Nodes in distributed systems can fail for many reasons: a node can suffer a hardware or software failure; an attacker can compromise a node; or a node's operator can deliberately tamper with its software. Moreover, faulty nodes are not uncommon. At large scale, it is increasingly likely that some nodes are accidentally misconfigured or have been compromised as a result of unpatched security vulnerabilities.

In systems that span multiple administrative domains, the lack of central administration tends to aggravate these problems. Moreover, multiple trust domains pose the additional threat of deliberate manipulation by node operators with different interests. Examples of systems with multiple administrative domains are network services such as DNS, NTP, NNTP and SMTP, federated information systems, compu-...