FAWN: A Fast Array of Wimpy Nodes

David G. Andersen¹, Jason Franklin¹, Michael Kaminsky², Amar Phanishayee¹, Lawrence Tan¹, Vijay Vasudevan¹
¹Carnegie Mellon University, ²Intel Labs

ABSTRACT

This paper presents a new cluster architecture for low-power data-intensive computing. FAWN couples low-power embedded CPUs to small amounts of local flash storage, and balances computation and I/O capabilities to enable efficient, massively parallel access to data.

The key contributions of this paper are the principles of the FAWN architecture and the design and implementation of FAWN-KV: a consistent, replicated, highly available, and high-performance key-value storage system built on a FAWN prototype. Our design centers around purely log-structured datastores that provide the basis for high performance on flash storage, as well as for replication and consistency obtained using chain replication on a consistent hashing ring. Our evaluation demonstrates that FAWN clusters can handle roughly 350 key-value queries per Joule of energy, two orders of magnitude more than a disk-based system.

Categories and Subject Descriptors

D.4.7 [Operating Systems]: Organization and Design - Distributed Systems; D.4.2 [Operating Systems]: Storage Management; D.4.5 [Operating Systems]: Reliability - Fault-tolerance; D.4.8 [Operating Systems]: Performance - Measurements

Keywords

Design, Energy Efficiency, Performance, Measurement, Cluster Computing, Flash

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SOSP’09, October 11-14, 2009, Big Sky, MT, USA.
Copyright 2009 ACM 978-1-60558-752-3/09/10 ...$10.00

1 Introduction

Large-scale data-intensive applications, such as high-performance key-value storage systems, are growing in both size and importance; they are now critical parts of major Internet services such as Amazon (Dynamo [10]), LinkedIn (Voldemort), and Facebook (memcached).

The workloads these systems support share several characteristics: they are I/O, not computation, intensive, requiring random access over large datasets; they are massively parallel, with thousands of concurrent, mostly-independent operations; their high load requires large clusters to support them; and the size of objects stored is typically small, e.g., 1 KB values for thumbnail images, 100s of bytes for wall posts, Twitter messages, etc....