

USENIX Association, 7th USENIX Conference on File and Storage Technologies, page 225

Cumulus: Filesystem Backup to the Cloud

Michael Vrable, Stefan Savage, and Geoffrey M. Voelker
Department of Computer Science and Engineering
University of California, San Diego

Abstract

In this paper we describe Cumulus, a system for efficiently implementing filesystem backups over the Internet. Cumulus is specifically designed under a thin cloud assumption: that the remote datacenter storing the backups does not provide any special backup services, but only provides a least-common-denominator storage interface (i.e., get and put of complete files). Cumulus aggregates data from small files for remote storage, and uses LFS-inspired segment cleaning to maintain storage efficiency. Cumulus also efficiently represents incremental changes, including edits to large files. While Cumulus can use virtually any storage service, we show that its efficiency is comparable to integrated approaches.

1 Introduction

It has become increasingly popular to talk of “cloud computing” as the next infrastructure for hosting data and deploying software and services. Not surprisingly, there are a wide range of different architectures that fall under the umbrella of this vague-sounding term, ranging from highly integrated and focused (e.g., Software As A Service offerings such as Salesforce.com) to decomposed and abstract (e.g., utility computing such as Amazon’s EC2/S3). Towards the former end of the spectrum, complex logic is bundled together with abstract resources at a datacenter to provide a highly specific service, potentially offering greater performance and efficiency through integration, but also reducing flexibility and increasing the cost to switch providers. At the other end of the spectrum, datacenter-based infrastructure providers offer minimal interfaces to very abstract resources (e.g., “store file”), making portability and provider switching easy, but potentially incurring additional overheads from the lack of server-side application integration.

In this paper, we explore this thin-cloud vs. thick-cloud trade-off in the context of a very simple application: filesystem backup. Backup is a particularly attractive application for outsourcing to the cloud because it is relatively simple, the growth of disk capacity relative to tape capacity has created an efficiency and cost inflection point, and the cloud offers easy off-site storage, always a key concern for backup. For end users there are few backup solutions that are both trivial and reliable (especially against disasters such as fire or flood), and ubiquitous broadband now provides sufficient bandwidth resources to offload the application. For small to mid-sized businesses, backup is rarely part of critical business processes and yet is sufficiently complex to “get right” that it can consume significant IT resources. Finally, larger enterprises benefit from backing up to the cloud to provide a business continuity hedge against site disasters.
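To make the thin-cloud assumption from the abstract concrete, the following is a minimal Python sketch of a store that supports only get and put of complete files, together with a client that packs small files into larger segments before uploading. It is an illustration under stated assumptions, not Cumulus's actual format or code: the names `InMemoryStore`, `backup`, `restore`, the `segment-NNNN` naming scheme, and the `segment_size` parameter are all hypothetical, and segment cleaning and incremental snapshots are omitted.

```python
from typing import Dict, List, Tuple

# (segment name, byte offset within the segment, length in bytes)
Location = Tuple[str, int, int]


class InMemoryStore:
    """Hypothetical stand-in for a thin-cloud service: whole-file put/get only."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, name: str, data: bytes) -> None:
        self._objects[name] = data

    def get(self, name: str) -> bytes:
        return self._objects[name]

    def object_count(self) -> int:
        return len(self._objects)


def backup(store: InMemoryStore, files: Dict[str, bytes],
           segment_size: int) -> Dict[str, Location]:
    """Pack files into segments and upload each segment with a single put.

    Returns a client-side index from file path to its location in a segment.
    """
    index: Dict[str, Location] = {}
    buffer = bytearray()
    pending: List[Tuple[str, int, int]] = []  # files in the open segment
    seq = 0

    def flush() -> None:
        nonlocal buffer, pending, seq
        if not pending:
            return
        name = f"segment-{seq:04d}"  # assumed naming scheme
        store.put(name, bytes(buffer))
        for path, offset, length in pending:
            index[path] = (name, offset, length)
        buffer = bytearray()
        pending = []
        seq += 1

    for path, data in sorted(files.items()):
        # Start a new segment if this file would overflow the current one.
        if pending and len(buffer) + len(data) > segment_size:
            flush()
        pending.append((path, len(buffer), len(data)))
        buffer.extend(data)
    flush()
    return index


def restore(store: InMemoryStore, index: Dict[str, Location], path: str) -> bytes:
    """Fetch the whole segment (the only read the store offers) and slice it."""
    segment, offset, length = index[path]
    return store.get(segment)[offset:offset + length]
```

In this sketch all bookkeeping (the index) lives on the client, so the store needs no backup-specific logic; restoring one small file costs a full segment download, which is the kind of overhead a thin-cloud design has to weigh against fewer, larger uploads.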