Differential Privacy Preservation in Adaptive K-Nets Clustering

Bibliographic Details
Main Authors: Liu, Xiaohong; Cai, Hanbo; Li, De; Li, Xianxian; Wang, Jinyan
Format: Conference Proceeding
Language: English
Description
Summary: K-Nets is a deterministic clustering algorithm based on network structure. It can automatically detect symmetric structure in the data and can handle clusters of different sizes and shapes, or a specified number of clusters. However, K-Nets has the following shortcomings: (1) the clustering result is sensitive to the manually supplied parameter K, which affects accuracy; (2) the algorithm considers only the average distance to the K nearest neighbors, which may lead to incorrectly chosen center points in datasets with large density differences or with identical score values; (3) it does not consider privacy leakage during the clustering process. To solve these problems, we propose a differential privacy protection method for adaptive K-Nets clustering, called ADP-K-Nets. First, to reduce the influence of the parameters on the result, the natural eigenvalues are obtained adaptively from the characteristics of natural neighbors and used as parameter values to find data points. Second, we define a new method for calculating the score, which addresses the incorrect selection of cluster centers when there are large density differences or score conflicts during calculation. Third, Laplace noise is added when calculating the local density of every data point to protect data privacy. Experimental results show that our method maintains clustering performance compared with some existing algorithms.
ISSN: 2324-9013
DOI: 10.1109/TrustCom53373.2021.00068
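
The summary mentions adding Laplace noise when computing each point's local density. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes that local density is the inverse of the mean distance to the k nearest neighbors and that the sensitivity is supplied by the caller; the record does not specify either. All function and parameter names (`knn_mean_distance`, `laplace_noise`, `noisy_local_density`, `sensitivity`) are hypothetical.

```python
import math
import random


def knn_mean_distance(points, i, k):
    """Mean Euclidean distance from points[i] to its k nearest neighbors."""
    dists = sorted(
        math.dist(points[i], p) for j, p in enumerate(points) if j != i
    )
    return sum(dists[:k]) / k


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two
    independent exponential samples with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_local_density(points, k, epsilon, sensitivity=1.0, seed=None):
    """Return one perturbed density value per point.

    Assumption: density = 1 / (mean k-NN distance). The paper's actual
    density and score definitions are not given in this record.
    Assumption: `sensitivity` is chosen by the caller; the noise scale
    follows the standard Laplace mechanism, sensitivity / epsilon.
    """
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    densities = []
    for i in range(len(points)):
        # Small constant guards against division by zero for duplicates.
        density = 1.0 / (knn_mean_distance(points, i, k) + 1e-12)
        densities.append(density + laplace_noise(scale, rng))
    return densities
```

A smaller `epsilon` gives a larger noise scale and therefore stronger perturbation of the densities. Whether this yields a formal differential privacy guarantee depends on a correct sensitivity bound for the chosen density function, which this sketch does not derive.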