Loading…
Falcon: A Privacy-Preserving and Interpretable Vertical Federated Learning System
Federated learning (FL) enables multiple data owners to collaboratively train machine learning (ML) models without disclosing their raw data. In the vertical federated learning (VFL) setting, the collaborating parties have data from the same set of users but with disjoint attributes. After construct...
Saved in:
Published in: | Proceedings of the VLDB Endowment 2023-06, Vol.16 (10), p.2471-2484 |
---|---|
Main Authors: | , , , , , , , |
Format: | Article |
Language: | English |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Federated learning (FL) enables multiple data owners to collaboratively train machine learning (ML) models without disclosing their raw data. In the vertical federated learning (VFL) setting, the collaborating parties have data from the same set of users but with disjoint attributes. After constructing the VFL models, the parties deploy the models in production systems to infer prediction requests. In practice, the prediction output itself may not be convincing for party users to make the decisions, especially in high-stakes applications. Model interpretability is therefore essential to provide meaningful insights and better comprehension on the prediction output.
In this paper, we propose Falcon, a novel privacy-preserving and interpretable VFL system. First, Falcon supports VFL training and prediction with strong and efficient privacy protection for a wide range of ML models, including linear regression, logistic regression, and multi-layer perceptron. The protection is achieved by a hybrid strategy of threshold partially homomorphic encryption (PHE) and additive secret sharing scheme (SSS), ensuring no intermediate information disclosure. Second, Falcon facilitates understanding of VFL model predictions by a flexible and privacy-preserving interpretability framework, which enables the implementation of state-of-the-art interpretable methods in a decentralized setting. Third, Falcon supports efficient data parallelism of VFL tasks and optimizes the parallelism factors to reduce the overall execution time. Falcon is fully implemented, and on which, we conduct extensive experiments using six real-world and multiple synthetic datasets. The results demonstrate that Falcon achieves comparable accuracy to non-private algorithms and outperforms three secure baselines in terms of efficiency. |
---|---|
ISSN: | 2150-8097 2150-8097 |
DOI: | 10.14778/3603581.3603588 |