
A Privacy Measure for Data Disclosure to Publish Micro Data using (N,T) - Closeness

Bibliographic Details
Published in: International Journal of Computer Applications, 2012-01, Vol. 51(6), pp. 22-28
Main Authors: Sunitha, A; Subba Reddy, K Venkata; Vijayakumar, B
Format: Article
Language: English
Description
Summary: Closeness is described as a privacy measure and its advantages are illustrated through examples and experiments on a real dataset. In this paper, closeness can be verified by supplying different values for N and T. Government agencies and other organizations often need to publish micro data, e.g., medical data or census data, for research and other purposes. Typically, such data are stored in a table, and each record (row) corresponds to one individual. A common anonymization approach for publishing micro data is generalization, which replaces quasi-identifier values with values that are less specific but semantically consistent. As a result, more records share the same set of quasi-identifier values. An equivalence class of an anonymized table is defined as a set of records that have the same values for the quasi-identifiers. To effectively limit disclosure, the disclosure risk of an anonymized table must be measured. To this end, k-anonymity is introduced as the property that each record is indistinguishable from at least k-1 other records with respect to the quasi-identifiers, i.e., k-anonymity requires that each equivalence class contains at least k records. While k-anonymity protects against identity disclosure, it is insufficient to prevent attribute disclosure. To address this limitation of k-anonymity, a new notion of privacy, called l-diversity, is introduced, which requires that the distribution of a sensitive attribute in each equivalence class has at least l "well-represented" values. One problem with l-diversity is that it is limited in its assumption of adversarial knowledge, an assumption that generalizes the specific background-knowledge and homogeneity attacks used to motivate l-diversity. The k-anonymity privacy requirement for publishing micro data requires that each equivalence class contains at least k records, but k-anonymity cannot prevent attribute disclosure. The notion of l-diversity has been proposed to address this; l-diversity requires that each equivalence class has at least l well-represented values for each sensitive attribute. L-diversity has a number of limitations; in particular, it is neither necessary nor sufficient to prevent attribute disclosure. Due to these limitations, a new notion of privacy called "closeness" is proposed. First, the base model t-closeness is presented, which requires that the distribution of a sensitive attribute in any equivalence class is close to the distribution of the attribute in the overall table.
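
As a rough illustration of the notions summarized above, the sketch below checks k-anonymity, l-diversity, and t-closeness on a tiny, hypothetical generalized table. It is a simplification under stated assumptions, not the paper's (N,T)-closeness procedure: it treats "well-represented" as "distinct" for l-diversity and substitutes total variation distance for the Earth Mover's Distance commonly used to measure closeness; all function names and sample data are invented for illustration.

from collections import Counter, defaultdict

def equivalence_classes(records, quasi_identifiers):
    # Group records that share the same (generalized) quasi-identifier values.
    classes = defaultdict(list)
    for rec in records:
        key = tuple(rec[q] for q in quasi_identifiers)
        classes[key].append(rec)
    return list(classes.values())

def distribution(records, sensitive):
    # Empirical distribution of the sensitive attribute over a set of records.
    counts = Counter(rec[sensitive] for rec in records)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}

def satisfies_k_anonymity(classes, k):
    # Every equivalence class must contain at least k records.
    return all(len(c) >= k for c in classes)

def satisfies_l_diversity(classes, sensitive, l):
    # Simplified reading of l-diversity: every equivalence class must contain
    # at least l distinct sensitive values.
    return all(len({rec[sensitive] for rec in c}) >= l for c in classes)

def satisfies_t_closeness(classes, records, sensitive, t):
    # Every class distribution must be within distance t of the overall table
    # distribution; total variation distance stands in here for the
    # Earth Mover's Distance used in the t-closeness literature.
    overall = distribution(records, sensitive)
    for c in classes:
        local = distribution(c, sensitive)
        dist = 0.5 * sum(abs(local.get(v, 0.0) - overall.get(v, 0.0))
                         for v in set(local) | set(overall))
        if dist > t:
            return False
    return True

# Hypothetical generalized table: zip and age are quasi-identifiers,
# disease is the sensitive attribute.
records = [
    {"zip": "476**", "age": "2*", "disease": "flu"},
    {"zip": "476**", "age": "2*", "disease": "cancer"},
    {"zip": "479**", "age": "3*", "disease": "flu"},
    {"zip": "479**", "age": "3*", "disease": "flu"},
]
classes = equivalence_classes(records, ["zip", "age"])
print(satisfies_k_anonymity(classes, 2))                         # True
print(satisfies_l_diversity(classes, "disease", 2))              # False: second class has one value
print(satisfies_t_closeness(classes, records, "disease", 0.3))   # True: both classes within 0.25
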
ISSN: 0975-8887
DOI: 10.5120/8047-1379