Loading…

Privacy preserving rare itemset mining

In recent years, rare pattern mining has shown great vitality in some real-world fields, such as disease diagnosis, criminal behavior analysis, anomaly detection in networks, and so on. When data organizations publish or share information publicly, shared data can be at risk of leakage as data minin...

Full description

Saved in:
Bibliographic Details
Published in:Information sciences 2024-03, Vol.662, p.120262, Article 120262
Main Authors: Gui, Yijie, Gan, Wensheng, Wu, Yongdong, Yu, Philip S.
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:In recent years, rare pattern mining has shown great vitality in some real-world fields, such as disease diagnosis, criminal behavior analysis, anomaly detection in networks, and so on. When data organizations publish or share information publicly, shared data can be at risk of leakage as data mining techniques may discover sensitive knowledge and information. To keep competitors from obtaining hidden information after processing the database, privacy-preserving data mining (PPDM) has been proposed and studied widely. However, most of the techniques in PPDM are applied to frequent pattern mining and cannot deal with the privacy protection problems in rare pattern mining, such as network vulnerability detection and abnormal medical data. To address this limitation, we introduce a privacy-preserving technique for rare pattern mining. In this paper, two novel algorithms named Longest Transaction-Minimum Item Number (LT-MIN) and Longest Transaction-Maximum Item Number (LT-MAX) are proposed to hide sensitive rare itemsets and return the sanitized database. These two algorithms succeed in hiding target itemsets while minimizing the side effects on the original database. What's more, they employ a projection mechanism to reduce the time spent scanning the database. Besides using the traditional evaluation criteria in PPDM, we also propose two additional similarity measures to evaluate the performance from the perspective of the itemsets and the structural integrity of the database. The experimental results indicate that the proposed algorithms can hide sensitive rare itemsets successfully and efficiently, and the evaluation methods used can become the evaluation criteria for privacy-preserving rare itemset mining (PPRIM).
ISSN:0020-0255
1872-6291
DOI:10.1016/j.ins.2024.120262