Privacy preserving data publishing of categorical data through k-anonymity and feature selection

Bibliographic Details
Published in: Healthcare technology letters 2016-03, Vol. 3 (1), p. 16-21
Main Authors: Aristodimou, Aristos; Antoniades, Athos; Pattichis, Constantinos S
Format: Article
Language: English
Description
Summary: In healthcare, there is a vast amount of patient data that could lead to important discoveries if combined. Because of legal and ethical issues, such data cannot be shared, and the information it holds is therefore underused. A new area of research, privacy preserving data publishing (PPDP), aims to share data in a way that preserves privacy while keeping information loss to a minimum. In this Letter, a new anonymisation algorithm for PPDP is proposed, based on k-anonymity through pattern-based multidimensional suppression (kPB-MS). The algorithm uses feature selection to reduce the data dimensionality and then combines attribute and record suppression to obtain k-anonymity. Five datasets from different areas of life sciences [RETINOPATHY, single photon emission computed tomography imaging, gene sequencing and drug discovery (two datasets)] were anonymised with kPB-MS. The anonymised datasets were evaluated using four different classifiers, and in 74% of the test cases they produced similar or better accuracies than the full datasets.
ISSN: 2053-3713
DOI: 10.1049/htl.2015.0050
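
The summary mentions combining attribute suppression and record suppression to reach k-anonymity on categorical data. The sketch below illustrates only those two generic operations and a k-anonymity check; it is not the kPB-MS algorithm from the Letter, it omits the feature selection and pattern-based steps entirely, and the function names and toy records are hypothetical.

```python
from collections import Counter


def is_k_anonymous(records, k):
    """Return True if every distinct value pattern occurs at least k times."""
    counts = Counter(records)
    return all(c >= k for c in counts.values())


def suppress_attributes(records, drop):
    """Attribute suppression: remove the columns whose indices are in `drop`."""
    return [tuple(v for i, v in enumerate(r) if i not in drop) for r in records]


def suppress_records(records, k):
    """Record suppression: discard records whose pattern occurs fewer than k times."""
    counts = Counter(records)
    return [r for r in records if counts[r] >= k]


# Hypothetical toy data: (age_band, sex, region). Not taken from the Letter.
data = [
    ("30-39", "F", "north"),
    ("30-39", "F", "south"),
    ("30-39", "F", "north"),
    ("40-49", "M", "north"),
    ("40-49", "M", "north"),
    ("50-59", "F", "south"),
]

# Drop the region column first, then remove any record that is still too rare.
reduced = suppress_attributes(data, drop={2})
anonymised = suppress_records(reduced, k=2)
```

In this toy case, dropping one attribute first means fewer records have to be discarded afterwards, which is the general trade-off between the two kinds of suppression. Which attributes to drop is chosen by hand here; the Letter's method makes that choice differently.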