
A Differential Privacy Support Vector Machine Classifier Based on Dual Variable Perturbation


Bibliographic Details
Published in: IEEE Access, 2019, Vol. 7, pp. 98238-98251
Main Authors: Zhang, Yaling; Hao, Zhifeng; Wang, Shangping
Format: Article
Language: English
Description
Summary: Data mining can extract latent, valuable information from massive data sets, and the support vector machine (SVM) is one of the most widely used and most efficient methods for data mining classification. However, training data often contain sensitive attributes, and the traditional SVM training method can reveal individuals' private information. To address the low prediction accuracy and poor versatility of existing privacy-preserving SVM classifiers, this paper proposes a new SVM training method with differential privacy protection. The algorithm first solves the dual problem of the SVM using the sequential minimal optimization (SMO) method and records, for each support vector, the difference E_i between the estimated value and the true value. It then computes the ratio of each support vector's E_i to the sum of E_i over all support vectors. Next, according to this ratio, it adds a different level of Laplace random noise to the dual variable α_i of each support vector to be released. By the principles of differential privacy, the algorithm satisfies ε-differential privacy and can therefore be used to protect individual privacy effectively. Experimental results on real datasets show that the proposed algorithm can be used for classification prediction under a reasonable privacy budget.
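
The perturbation step described in the summary can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the summary does not say how the ratio maps to a noise level, so the rule used here (Laplace scale proportional to the ratio, averaging to sensitivity/ε) is an assumption, as are the use of |E_i| to keep ratios non-negative, the default sensitivity of 1.0, and the hypothetical function names. The sketch makes no claim to reproduce the paper's ε-differential privacy guarantee.

```python
import random


def error_ratios(errors):
    """Share of each support vector's |E_i| in the total over all support vectors.

    Using absolute values is an assumption; the summary only says "ratio".
    """
    if not errors:
        return []
    magnitudes = [abs(e) for e in errors]
    total = sum(magnitudes)
    if total == 0:
        # All errors are zero: fall back to equal shares (assumption).
        return [1.0 / len(errors)] * len(errors)
    return [m / total for m in magnitudes]


def laplace_sample(scale, rng):
    """Draw from Laplace(0, scale) as the difference of two exponential draws."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_dual_variables(alphas, errors, epsilon, sensitivity=1.0, seed=None):
    """Add ratio-dependent Laplace noise to each dual variable alpha_i.

    Assumed rule (not taken from the paper): the noise scale of support
    vector i is (sensitivity / epsilon) * ratio_i * n, so the scales
    average to the usual Laplace-mechanism scale sensitivity / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = random.Random(seed)
    base_scale = sensitivity / epsilon
    ratios = error_ratios(errors)
    n = len(alphas)
    return [
        a + laplace_sample(base_scale * r * n, rng)
        for a, r in zip(alphas, ratios)
    ]


# Hypothetical dual variables and errors for three support vectors.
noisy_alphas = perturb_dual_variables(
    alphas=[0.5, 1.0, 0.2], errors=[0.1, -0.3, 0.6], epsilon=1.0, seed=0
)
```

In a full pipeline the `alphas` and `errors` would come from an SMO solver, and the noisy dual variables would be released in place of the originals; a smaller `epsilon` gives larger noise and stronger privacy at the cost of accuracy.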
ISSN:2169-3536
DOI:10.1109/ACCESS.2019.2929680