Anonymizing Bipartite Graph Data using Safe Groupings

Graham Cormode, Divesh Srivastava
AT&T Labs–Research, Florham Park, NJ
{graham,divesh}@research.att.com

Ting Yu, Qing Zhang*
North Carolina State University, Raleigh, NC
{tyu,qzhang4}@ncsu.edu

ABSTRACT

Private data often comes in the form of associations between entities, such as customers and products bought from a pharmacy, which are naturally represented in the form of a large, sparse bipartite graph. As with tabular data, it is desirable to be able to publish anonymized versions of such data, to allow others to perform ad hoc analysis of aggregate graph properties. However, existing tabular anonymization techniques do not give useful or meaningful results when applied to graphs: small changes or masking of the edge structure can radically change aggregate graph properties. We introduce a new family of anonymizations, for bipartite graph data, called (k, ℓ)-groupings. These groupings preserve the underlying graph structure perfectly, and instead anonymize the mapping from entities to nodes of the graph. We identify a class of “safe” (k, ℓ)-groupings that have provable guarantees to resist a variety of attacks, and show how to find such safe groupings. We perform experiments on real bipartite graph data to study the utility of the anonymized version, and the impact of publishing alternate groupings of the same graph data. Our experiments demonstrate that (k, ℓ)-groupings offer strong tradeoffs between privacy and utility.

1. INTRODUCTION

Private data often arises in the form of associations between entities. An example is the products bought by customers at a pharmacy. The set of products being sold and their properties is public knowledge, and it may be no secret which customers visit a particular pharmacy. However, the association between a particular individual and a particular medication is often considered sensitive, since it is indicative of a disease or health issue that they have.
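As a rough illustration of the idea summarized in the abstract, the following Python sketch publishes the exact edge structure over opaque node identifiers and reveals only which group of entities corresponds to which group of nodes. It is not the authors' algorithm. The function names `partition` and `anonymize`, the node-identifier scheme, and the reading of (k, ℓ) as minimum group sizes for the two sides of the graph are assumptions made for this example. The sketch does not check the “safe” condition, which is not defined in this excerpt.

```python
import random


def partition(entities, min_size, rng):
    """Randomly split entities into groups of at least min_size members.

    Hypothetical helper for illustration only.
    """
    if len(entities) < min_size:
        raise ValueError("not enough entities to form a single group")
    shuffled = list(entities)
    rng.shuffle(shuffled)
    n_groups = len(shuffled) // min_size
    groups = [[] for _ in range(n_groups)]
    # Round-robin assignment keeps every group at size >= min_size.
    for i, entity in enumerate(shuffled):
        groups[i % n_groups].append(entity)
    return groups


def anonymize(edges, k, l, seed=0):
    """Return (graph, groups) for a list of (left, right) association pairs.

    graph  : the same edges, expressed over opaque node ids
    groups : (entity labels, node ids) per group, with the one-to-one
             mapping inside each group withheld
    Assumption: left groups have size >= k, right groups have size >= l.
    """
    rng = random.Random(seed)
    left = sorted({a for a, _ in edges})
    right = sorted({b for _, b in edges})
    node_of = {}  # secret mapping entity -> node id, never published
    groups = []
    for side, entities, size in (("L", left, k), ("R", right, l)):
        for g, members in enumerate(partition(entities, size, rng)):
            ids = [f"{side}{g}_{i}" for i in range(len(members))]
            # members is in shuffled order, so this pairing is random
            for entity, node in zip(members, ids):
                node_of[entity] = node
            # Publish both sides sorted, so order reveals nothing.
            groups.append((sorted(members), sorted(ids)))
    graph = sorted((node_of[a], node_of[b]) for a, b in edges)
    return graph, groups


if __name__ == "__main__":
    purchases = [
        ("alice", "aspirin"), ("bob", "insulin"), ("carol", "aspirin"),
        ("dave", "statin"), ("alice", "zinc"),
    ]
    graph, groups = anonymize(purchases, k=2, l=2)
    print(graph)
    print(groups)
```

Because only labels are hidden, aggregate properties such as the number of edges and the degree distribution are unchanged in the published graph, while an observer learns only that each entity is one of the nodes in its group.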
A large example of association data is the Netflix prize data set, released in 2006, which was anonymized based on an unspecified heuristic method [2]. Another example is that of authors and papers: for a conference such as SIGMOD, reviewers learn information about submitted papers (title, area, abstract), and could (in future) also see detailed information about authors who have submitted papers, in order to verify conflicts of interest. But, since SIGMOD is a double-blind conference, the association between authors and papers should not be revealed to reviewers.

The most natural way to model such data is as a graph structure: nodes represent entities, and edges indicate an association between

* Yu and Zhang were partially sponsored by the NSF through grants IIS-0430166 and CNS-0747247.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear,