Loading…
Collecting Preference Rankings Under Local Differential Privacy
With the deep penetration of the Internet and mobile devices, preference rankings are being collected on a massive scale by diverse data collectors for various business demands. However, users' preference rankings in many applications are highly sensitive. Without proper privacy protection mech...
Saved in:
Published in: | IEEE transactions on knowledge and data engineering 2023-07, Vol.35 (7), p.6752-6766 |
---|---|
Main Authors: | , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | With the deep penetration of the Internet and mobile devices, preference rankings are being collected on a massive scale by diverse data collectors for various business demands. However, users' preference rankings in many applications are highly sensitive. Without proper privacy protection mechanisms, it either puts individual privacy in jeopardy or hampers business opportunities due to users' unwillingness to share their true rankings. In this paper, we initiate the study of collecting preference rankings under local differential privacy. The key technical challenge comes from the fact that the number of possible rankings could be large in practical settings, leading to excessive injected noise. To solve this problem, we present a novel approach SAFARI, whose main idea is to collect a set of distributions over small domains which are carefully chosen based on the riffle independent (RI) model to approximate the overall distribution of users' rankings, and then generate a synthetic ranking dataset from the obtained distributions. By working on small domains instead of a large domain, SAFARI can significantly reduce the magnitude of added noise. In SAFARI, we design two transformation rules, namely Rule I and Rule II, to instruct users to transform their data to provide the information about the distributions of the small domains. In particular, we propose a method called LADE to precisely estimate the required distributions used for the structure learning of RI model. We also propose a new LDP method called SAFA for frequency estimation over multiple attributes that have small domains. We formally prove that SAFARI guarantees \varepsilon É› -local differential privacy. Extensive experiments on real datasets confirm the effectiveness of SAFARI. |
---|---|
ISSN: | 1041-4347 1558-2191 |
DOI: | 10.1109/TKDE.2022.3186907 |