Inference Problem Privacy Preserving Data Mining Lecture 20

CSCE 522 - Farkas, Lecture 19

Statistical Databases
Goal: provide aggregate information about groups of individuals
    E.g., average grade point of students
Security risk: disclosure of specific information about a particular individual
    E.g., grade point of student John Smith
Meta-data:
    Working knowledge about the attributes
    Supplementary knowledge (not stored in the database)
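To make the security risk concrete, here is a minimal sketch in Python. It is not from the lecture: the records, the names other than John Smith, and the helper `average_gpa` are all hypothetical. It shows an interface that returns only aggregates, and how supplementary knowledge about the attributes can narrow a query to a single person.

```python
from statistics import mean

# Hypothetical student records; every value here is invented for illustration.
STUDENTS = [
    {"name": "John Smith", "major": "CSCE", "year": 2001, "gpa": 3.8},
    {"name": "Ann Lee", "major": "CSCE", "year": 2000, "gpa": 3.2},
    {"name": "Bob Ray", "major": "MATH", "year": 2000, "gpa": 3.0},
    {"name": "Eve Cole", "major": "MATH", "year": 2000, "gpa": 3.5},
]


def average_gpa(predicate):
    """Return (query-set size, mean GPA) for the matching records.

    Only aggregates are returned; names are never exposed.
    """
    matches = [s["gpa"] for s in STUDENTS if predicate(s)]
    return len(matches), (mean(matches) if matches else None)


# Intended use: a statistic about a group of individuals.
group_size, group_avg = average_gpa(lambda s: s["major"] == "MATH")

# Security risk: the user knows (supplementary knowledge) that John Smith
# is the only CSCE student from 2001, so the "average" is his own GPA.
single_size, single_avg = average_gpa(
    lambda s: s["major"] == "CSCE" and s["year"] == 2001
)

print(group_size, group_avg)
print(single_size, single_avg)
```

The second query is still an aggregate in form, but its query set has size one, so it discloses an individual value.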
Types of Statistics
Macro-statistics: collections of related statistics presented in 2-dimensional tables

    Sex\Year    1997    1998    Sum
    Female         4       1      5
    Male           6      13     19
    Sum           10      14     24

Micro-statistics: individual data records used for statistics after identifying information is removed

    Sex    Course      GPA    Year
    F      CSCE 590    3.5    2000
    M      CSCE 590    3.0    2000
    F      CSCE 790    4.0    2001
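The relationship between the two kinds of statistics can be sketched in Python: a macro-statistic is a cross-tabulation of micro-statistics. The example below uses the three de-identified records from the slide. The function name `cross_tabulate` and the dictionary layout are assumptions made for illustration. The resulting table counts these three records by sex and year, so it is not the 1997/1998 table shown above, which comes from different data.

```python
from collections import Counter

# Micro-statistics: the three de-identified records from the slide.
MICRO = [
    {"sex": "F", "course": "CSCE 590", "gpa": 3.5, "year": 2000},
    {"sex": "M", "course": "CSCE 590", "gpa": 3.0, "year": 2000},
    {"sex": "F", "course": "CSCE 790", "gpa": 4.0, "year": 2001},
]


def cross_tabulate(records, row_key, col_key):
    """Count records per (row, column) cell and add marginal sums."""
    counts = Counter((r[row_key], r[col_key]) for r in records)
    rows = sorted({r[row_key] for r in records})
    cols = sorted({r[col_key] for r in records})

    # Inner cells; a Counter returns 0 for combinations that never occur.
    table = {row: {col: counts[(row, col)] for col in cols} for row in rows}

    # Row sums, column sums, and the grand total.
    for row in rows:
        table[row]["Sum"] = sum(table[row][col] for col in cols)
    table["Sum"] = {col: sum(table[row][col] for row in rows) for col in cols}
    table["Sum"]["Sum"] = len(records)
    return table


macro = cross_tabulate(MICRO, "sex", "year")
for label, cells in macro.items():
    print(label, cells)
```

With so few records, several cells have a count of 1, which is exactly the situation in which a published macro-statistic can point back to one individual.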

Statistical Compromise
Exact compromise: find the exact value of an attribute of an individual (e.g., John Smith’s GPA is 3.8)
Partial compromise: find an estimate of an
