roy - Airavat: Security and Privacy for MapReduce Indrajit...


Airavat: Security and Privacy for MapReduce

Indrajit Roy, Srinath T.V. Setty, Ann Kilzer, Vitaly Shmatikov, Emmett Witchel
The University of Texas at Austin
{indrajit, srinath, akilzer, shmat, witchel}@cs.utexas.edu

Abstract

We present Airavat, a MapReduce-based system which provides strong security and privacy guarantees for distributed computations on sensitive data. Airavat is a novel integration of mandatory access control and differential privacy. Data providers control the security policy for their sensitive data, including a mathematical bound on potential privacy violations. Users without security expertise can perform computations on the data, but Airavat confines these computations, preventing information leakage beyond the data provider's policy.

Our prototype implementation demonstrates the flexibility of Airavat on several case studies. The prototype is efficient, with run times on Amazon's cloud computing infrastructure within 32% of a MapReduce system with no security.

1 Introduction

Cloud computing involves large-scale, distributed computations on data from multiple sources. The promise of cloud computing is based in part on its envisioned ubiquity: Internet users will contribute their individual data and obtain useful services from the cloud. For example, targeted advertisements can be created by mining a user's clickstream, while health-care applications of the future may use an individual's DNA sequence to tailor drugs and personalized medical treatments. Cloud computing will fulfill this vision only if it supports flexible computations while guaranteeing security and privacy for the input data. To balance the competing goals of a permissive programming model and the need to prevent information leaks, the untrusted code should be confined [30].

Contributors of data to cloud-based computations face several threats to their privacy. For example, consider a medical patient who is deciding whether to participate in a large health-care study. First, she may be concerned that a careless or malicious application operating on her data as part of the study may expose it, for instance by writing it into a world-readable file which will then be indexed by a search engine. Second, she may be concerned that even if all computations are done correctly and securely, the result itself, e.g., aggregate health-care statistics computed as part of the study, may leak sensitive information about her personal medical record.

Traditional approaches to data privacy are based on syntactic anonymization, i.e., removal of "personally identifiable information" such as names, addresses, and Social Security numbers. Unfortunately, anonymization does not provide meaningful privacy guarantees....
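The abstract's "mathematical bound on potential privacy violations" is presumably expressed through the differential privacy it mentions. As a rough illustration only, the sketch below shows the textbook Laplace mechanism applied to a count query. It is not Airavat's implementation, which this preview does not show. The names `laplace_noise` and `noisy_count`, the sample records, and the chosen `epsilon` are all hypothetical.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample zero-mean Laplace noise with the given scale.

    The difference of two independent exponential samples with rate
    1/scale follows a Laplace distribution with that scale.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a count of matching records perturbed with Laplace noise.

    Assumption: adding or removing one record changes the count by at
    most 1, so the sensitivity is 1 and the noise scale is 1/epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical health-study records; not data from the paper.
    study = [{"age": 30}, {"age": 70}, {"age": 65}, {"age": 41}]
    rng = random.Random()
    print(noisy_count(study, lambda r: r["age"] >= 65, epsilon=0.5, rng=rng))
```

A smaller `epsilon` means more noise and a tighter privacy bound; a larger one gives more accurate answers with a weaker bound.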

This note was uploaded on 12/08/2011 for the course CS 525 taught by Professor Gupta during the Spring '08 term at University of Illinois, Urbana Champaign.

