sigops.mon.LGN.revised - Reliable On-Demand Management...


Reliable On-Demand Management Operations for Large-scale Distributed Applications *

Jin Liang, Indranil Gupta and Klara Nahrstedt
Department of Computer Science
University of Illinois at Urbana-Champaign
{jinliang, indy, klara}@cs.uiuc.edu

ABSTRACT

This paper argues for attention to, and proposes a novel direction to solving, instant monitoring and management tasks for large-scale distributed applications running across hundreds of hosts. We present the MON (Management Overlay Networks) approach¹, which uses a novel concept called on-demand overlays, in order to support instant commands such as queries and software pushes. On-demand overlays are built on-the-fly and probabilistically, by leveraging weakly-consistent gossip-style membership information underneath. Thus, they are lightweight in terms of memory, computation, and bandwidth. We augment on-demand overlays with several notions of application-specified reliability, and show how MON detects and adheres to these. MON is available atop PlanetLab, and we present experimental results. We conclude with a series of promising open problems in this direction.

Categories and Subject Descriptors

C.2.4 [Computer Systems Organization]: Communication.Distributed Systems

Keywords

Monitoring, Instant Commands, On-demand Overlays, Reliability

1. INTRODUCTION

Several wide-area and large-scale distributed computing systems have emerged in the recent few years, e.g., utility Grids [7, 30], experimental Grids [1], and lately PlanetLab [21]. More importantly, these large scale distributed infrastructures have become increasingly popular for running distributed applications such as content distribution [2, 5, 18, 34], application-level DNS [19, 22], cooperative caches [8, 13], publish-subscribe systems [15], and large-scale experiments, e.g., [32, 33].

While today many management tools are available for the management of computing infrastructures themselves, e.g., [7, 14, 26, 35, 37, 38] and they are very useful to the infrastructures system administrators, there is a scarcity of significant tools that application developers and managers can use for managing their applications on such systems [10, 20]. Cluster-management tools allow querying of resource variables associated directly with the infrastructure, and...

* This research was supported in part by NSF CAREER grant CNS-0448246 and in part by NSF ITR grant CMS-0427089.

¹ This paper is an extended version of our workshop paper [12], and includes additional contributions on augmenting on-demand overlays with reliability.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Copyright 200X ACM X-XXXXX-XX-X/XX/XX ...$5.00.
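
The abstract describes on-demand overlays as built on the fly and probabilistically from weakly-consistent gossip-style membership information. The excerpt does not give the MON construction algorithm, so the following is only a minimal sketch of the general idea under stated assumptions: each host holds a random partial view of the membership (a stand-in for what a gossip protocol might provide), and a tree is grown from the querying host by letting every joining host invite a few random members of its own view. All names here (`make_views`, `build_on_demand_tree`, `view_size`, `fanout`) are hypothetical and do not come from the paper.

```python
import random
from collections import deque


def make_views(nodes, view_size, rng):
    """Assign each node a random partial membership view.

    Assumption: this stands in for the weakly-consistent view a
    gossip-style membership protocol would maintain.
    """
    views = {}
    for n in nodes:
        others = [m for m in nodes if m != n]
        views[n] = rng.sample(others, min(view_size, len(others)))
    return views


def build_on_demand_tree(root, views, fanout, rng):
    """Grow a tree on the fly, starting from `root`.

    Each node that joins invites up to `fanout` random members of its
    own view; a node accepts only the first invitation it receives.
    Returns a dict mapping each reached node to its parent.
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        candidates = views[node]
        invited = rng.sample(candidates, min(fanout, len(candidates)))
        for child in invited:
            if child not in parent:  # nodes already in the tree decline
                parent[child] = node
                queue.append(child)
    return parent


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    nodes = [f"host{i}" for i in range(200)]
    views = make_views(nodes, view_size=20, rng=rng)
    tree = build_on_demand_tree("host0", views, fanout=5, rng=rng)
    print(f"reached {len(tree)} of {len(nodes)} hosts")
```

Because invitations are random and views are partial, a tree built this way may not reach every host. That gap between hosts reached and hosts alive is the kind of property the abstract's application-specified reliability notions would have to address; how MON does so is not stated in this excerpt.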

This note was uploaded on 12/08/2011 for the course CS 525 taught by Professor Gupta during the Spring '08 term at University of Illinois, Urbana Champaign.
