10 - SecureMR A Service Integrity Assurance Framework for...

SecureMR: A Service Integrity Assurance Framework for MapReduce

Wei Wei, Juan Du, Ting Yu, Xiaohui Gu
Department of Computer Science, North Carolina State University
Raleigh, North Carolina, United States
{wwei5, jdu}@ncsu.edu, {gu, yu}@csc.ncsu.edu

Abstract

MapReduce has become increasingly popular as a powerful parallel data processing model. To deploy MapReduce as a data processing service over open systems such as service-oriented architecture, cloud computing, and volunteer computing, we must provide the security mechanisms necessary to protect the integrity of MapReduce data processing services. In this paper, we present SecureMR, a practical service integrity assurance framework for MapReduce. SecureMR consists of five security components, which provide a set of practical security mechanisms that not only ensure MapReduce service integrity and prevent replay and Denial of Service (DoS) attacks, but also preserve the simplicity, applicability, and scalability of MapReduce. We have implemented a prototype of SecureMR based on Hadoop, an open source MapReduce implementation. Our analytical study and experimental results show that SecureMR can ensure data processing service integrity while imposing low performance overhead.

I. INTRODUCTION

MapReduce is a parallel data processing model proposed by Google to simplify parallel data processing on large clusters [1]. Recently, many organizations have adopted the MapReduce model and developed their own implementations, such as Google MapReduce [1] and Yahoo's Hadoop [2], as well as thousands of MapReduce applications. Moreover, many academic researchers have adopted MapReduce for data processing in different research areas, such as high-end computing [3], data-intensive scientific analysis [4], large-scale semantic annotation [5], and machine learning [6].

Current data processing systems using MapReduce run mainly on clusters that belong to a single administration domain. As open systems such as Service-Oriented Architecture (SOA) [7], [8], Cloud Computing [9], and Volunteer Computing [10], [11] increasingly emerge as promising platforms for cross-domain resource and service integration, MapReduce deployed over open systems will become an attractive solution for large-scale, cost-effective data processing services. As a forerunner in this area, Amazon deploys MapReduce as a web service using Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (Amazon S3). It provides a public data processing service that lets researchers, data analysts, and developers process vast amounts of data efficiently and cost-effectively [12]. However, in open systems, besides communication security threats such as eavesdropping attacks, replay attacks, and Denial of Service (DoS) attacks, MapReduce faces a data processing service integrity issue, since service providers in open systems may come from different administration domains that are not always trustworthy.

This note was uploaded on 01/27/2012 for the course CS 600 taught by Professor Smith,r during the Spring '08 term at Alabama.
