GFD-SSL: generative federated knowledge distillation-based semi-supervised learning
Published in: International Journal of Machine Learning and Cybernetics, 2024-12, Vol. 15 (12), pp. 5509–5529
Main Authors: , ,
Format: Article
Language: English
Summary: Federated semi-supervised learning (Fed-SSL) algorithms have been developed to address the challenges of decentralized data access, data confidentiality, and costly data labeling in distributed environments. Most existing Fed-SSL algorithms are based on federated averaging, which maintains an identical model on every machine and repeatedly replaces the local models during learning. These algorithms incur significant communication overhead because they transfer the parameters of the local models. In contrast, knowledge distillation-based Fed-SSL algorithms reduce communication costs by transferring only the outputs of local models on shared data. However, they assume that all local data on the machines are labeled and that a large shared set of unlabeled data is available for training; neither assumption always holds in real-world applications. This paper presents a knowledge distillation-based Fed-SSL algorithm that makes no assumptions about how data are distributed among machines and that artificially generates the shared data required for learning. The approach employs a semi-supervised GAN on each machine and proceeds in two stages. In the first stage, each machine trains its local model independently. In the second stage, at each step every machine generates artificial data and propagates it to the other machines; each machine then trains its discriminator on these data, using the average output of all machines on them as the target. The algorithm is evaluated for accuracy and communication cost on several data sets under different distributions. The evaluations reveal that, on average, it is 15% more accurate than state-of-the-art methods, especially on non-IID data, and in most cases it also requires less data communication among machines than existing approaches.
ISSN: 1868-8071, 1868-808X
DOI: | 10.1007/s13042-024-02256-7 |
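
The second stage described in the summary above can be illustrated with a minimal sketch. The following Python/PyTorch code is a hypothetical reconstruction based only on the abstract: each machine holds a semi-supervised GAN, generates an artificial batch, broadcasts it, and then distills the average of all machines' discriminator outputs on the shared batch into its own discriminator. The toy architectures, the KL-divergence distillation loss, and all hyperparameters are assumptions for illustration; the paper's exact models and losses may differ.

```python
# Hypothetical sketch of the second (distillation) stage of GFD-SSL,
# assuming each client runs a semi-supervised GAN: a generator G_k and a
# discriminator/classifier D_k with NUM_CLASSES real-class logits plus one
# extra "fake" logit. All shapes and losses are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLIENTS, LATENT_DIM, NUM_CLASSES, BATCH = 3, 64, 10, 32

def make_generator():
    # Toy generator: latent vector -> flat 784-feature sample.
    return nn.Sequential(nn.Linear(LATENT_DIM, 128), nn.ReLU(), nn.Linear(128, 784))

def make_discriminator():
    # Semi-supervised GAN head: NUM_CLASSES class logits + 1 fake logit.
    return nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, NUM_CLASSES + 1))

generators = [make_generator() for _ in range(NUM_CLIENTS)]
discriminators = [make_discriminator() for _ in range(NUM_CLIENTS)]
optimizers = [torch.optim.Adam(d.parameters(), lr=1e-3) for d in discriminators]

def distillation_round():
    # 1) Each client samples artificial data from its own generator and
    #    broadcasts it; only generated samples and soft outputs travel
    #    across the network, never raw data or model parameters.
    shared = torch.cat([g(torch.randn(BATCH, LATENT_DIM)).detach() for g in generators])

    # 2) Each client evaluates its discriminator on the pooled shared
    #    batch; the averaged soft predictions act as the ensemble teacher.
    with torch.no_grad():
        teacher = torch.stack(
            [F.softmax(d(shared), dim=1) for d in discriminators]
        ).mean(dim=0)

    # 3) Each client distills the ensemble into its local discriminator by
    #    minimizing KL(teacher || student) on the shared batch (one
    #    plausible choice of distillation loss, assumed here).
    for d, opt in zip(discriminators, optimizers):
        opt.zero_grad()
        student_log_probs = F.log_softmax(d(shared), dim=1)
        loss = F.kl_div(student_log_probs, teacher, reduction="batchmean")
        loss.backward()
        opt.step()
    return loss.item()

# A few second-stage steps, run after the independent local pre-training
# of stage one (omitted here).
for step in range(5):
    print(f"round {step}: distillation loss {distillation_round():.4f}")
```

Note how this matches the communication claim in the abstract: per round, each machine sends one generated batch and one matrix of soft outputs, which is typically far smaller than exchanging full model parameters as in federated averaging.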