DataGrids - A Taxonomy of Data Grids for Distributed Data...

A Taxonomy of Data Grids for Distributed Data Sharing, Management, and Processing

SRIKUMAR VENUGOPAL, RAJKUMAR BUYYA, AND KOTAGIRI RAMAMOHANARAO
University of Melbourne, Australia

Data Grids have been adopted as the next generation platform by many scientific communities that need to share, access, transport, process, and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this article, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks, and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication and resource allocation, and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems not only to validate the taxonomy but also to identify areas for future exploration.

Categories and Subject Descriptors: H.3.4 [Information Storage and Retrieval]: Systems and Software - Distributed systems; C.2.4 [Computer-Communication Networks]: Distributed Systems - Client/server; distributed applications; J.2 [Physical Sciences and Engineering]; J.3 [Life and Medical Sciences]

General Terms: Design, Management

Additional Key Words and Phrases: Grid computing, data-intensive applications, virtual organizations, replica management

1. INTRODUCTION

The next generation of scientific applications in domains as diverse as high energy physics, molecular modeling, and earth sciences involves the production of large datasets from simulations or from large-scale experiments. Analysis of these datasets and their dissemination among researchers located over a wide geographic area require high-capacity resources such as supercomputers, high-bandwidth networks, and mass storage systems.
Collectively, these large-scale applications have come to be known as part of

This work is partially supported through the Australian Research Council (ARC) Discovery Project grant and Storage Technology Corporation sponsorship of Grid Fellowship.

Author's address: R. Buyya, Grid Computing and Distributed Systems Laboratory, Department of Computer Science and Software Engineering, University of Melbourne, VIC 3010, Australia; email: rbuyya@unimelb.edu.au.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or direct commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation....