Adaptive Data Placement for Wide-Area Sensing Services

Suman Nath* Microsoft Research [email protected]
Phillip B. Gibbons Intel Research Pittsburgh [email protected]
Srinivasan Seshan Carnegie Mellon University [email protected]

Abstract

Wide-area sensing services enable users to query data collected from multitudes of widely distributed sensors. In this paper, we consider the novel distributed database workload characteristics of these services, and present IDP, an online, adaptive data placement and replication system tailored to this workload. Given a hierarchical database, IDP automatically partitions it among a set of networked hosts, and replicates portions of it. IDP makes decisions based on measurements of access locality within the database, read and write load for individual objects within the database, proximity between queriers and potential replicas, and total load on hosts participating in the database. Our evaluation of IDP under real and synthetic workloads, including flash crowds of queriers, demonstrates that in comparison with previously-studied replica placement techniques, IDP reduces average response times for user queries by up to a factor of 3 and reduces network traffic for queries, updates, and data movements by up to an order of magnitude.

1 Introduction

Emerging wide-area sensing services [18, 26, 27] promise to instrument our world in great detail and produce vast amounts of data. For example, scientists already use such services to make observations of natural phenomena over large geographic regions [2, 5]; retailers, such as Walmart, plan to monitor their inventory using RFID tags; and network operators (ISPs) monitor their traffic using a number of software sensors. A key challenge that these services face is managing their data and making it easily queriable by users.
An effective means for addressing this challenge is to store the vast quantity of data in a wide-area distributed database, which efficiently handles both updates from geographically dispersed sensors and queries from users anywhere in the world.

Like traditional distributed databases, sensing service databases must carefully replicate and place data in order to ensure efficient operation. Replication is necessary for avoiding hot spots and failures within the system, while careful data placement is required for minimizing network traffic and data access latency. Although replication and data placement have been extensively studied in the context of many wide-area systems [10, 12, 24, 32, 36, 37, 40, 41], existing designs are ill-suited to the unique workload properties of sensing services. For example, unlike traditional distributed databases, a sensing service database typically has a hierarchical organization and a write-dominated workload. Moreover, the workload is expected to be highly dynamic and to exhibit...

* Work done while the first author was a PhD student at CMU and an intern at Intel Research Pittsburgh.
This note was uploaded on 11/12/2011 for the course CE 726 taught by Professor Staf during the Spring '11 term at SUNY Buffalo.