Exploring the distributed learning on federated learning and cluster computing via convolutional neural networks

Bibliographic Details
Published in:	Neural computing & applications 2024-02, Vol.36 (5), p.2141-2153
Main Authors:	Chang, Jia-Wei; Hung, Jason C.; Chu, Ting-Hong
Format:	Article
Language:	English
Subjects:	Artificial Intelligence; Artificial neural networks; Clusters; Computation; Computational Biology/Bioinformatics; Computational Science and Engineering; Computer Science; Data Mining and Knowledge Discovery; Deep learning; Federated learning; Image Processing and Computer Vision; Neural networks; Nodes; Probability and Statistics in Computer Science; S.I.: Machine Learning and Big Data Analytics for IoT Security and Privacy (SPIoT 2022); Special Issue on Machine Learning and Big Data Analytics for IoT Security and Privacy (SPIoT 2022)

Description
Summary:	Distributed learning has led to the development of federated learning and cluster computing; however, the two methods are very different. Therefore, this study uses a deep learning approach to investigate the distinction between federated learning and cluster computing. Specifically, the LeNet convolutional neural network model is used. Three frameworks were tested: Spark on Hadoop with four nodes, PySyft with four nodes, and native PyTorch with a single node. The results show that Spark on Hadoop can accelerate performance and facilitate applications that have large memory requirements. In addition, PySyft can protect data privacy but is slower than Spark on Hadoop and native PyTorch. The three frameworks achieved comparable accuracy on IID data, while PySyft had the lowest accuracy on non-IID data. Therefore, if excluding sensitive data does not significantly affect training results, cluster computing (Spark on Hadoop) is recommended. However, federated learning (PySyft) is recommended when sensitive data is required for training or improves training results, and training time is not a constraint.
ISSN:	0941-0643, 1433-3058
DOI:	10.1007/s00521-023-09160-1
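
The summary contrasts IID and non-IID data across four nodes. As a rough illustration of that distinction only, the sketch below splits a toy label list both ways. The record does not describe how the study partitioned its data, so the function names, the toy dataset, and the sort-and-shard scheme are all assumptions, not the authors' method.

```python
import random
from collections import Counter


def partition_iid(labels, num_nodes, seed=0):
    """Shuffle sample indices and deal them round-robin, so each node
    receives roughly the same mix of labels (hypothetical helper)."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    return [indices[i::num_nodes] for i in range(num_nodes)]


def partition_non_iid(labels, num_nodes):
    """Sort indices by label and cut contiguous shards, so each node
    sees only a few classes. This is one common way to simulate label
    skew; it is an assumption, not the scheme used in the study."""
    indices = sorted(range(len(labels)), key=lambda i: labels[i])
    shard = len(indices) // num_nodes
    parts = [indices[n * shard:(n + 1) * shard] for n in range(num_nodes)]
    # Any leftover samples go to the last node.
    parts[-1].extend(indices[num_nodes * shard:])
    return parts


def label_histogram(labels, part):
    """Count how many samples of each class a node holds."""
    return Counter(labels[i] for i in part)


# Toy dataset (assumed): 10 classes with 40 samples each.
toy_labels = [c for c in range(10) for _ in range(40)]

iid_parts = partition_iid(toy_labels, num_nodes=4)
skewed_parts = partition_non_iid(toy_labels, num_nodes=4)

for node, (iid, skewed) in enumerate(zip(iid_parts, skewed_parts)):
    print(node,
          "IID classes:", len(label_histogram(toy_labels, iid)),
          "non-IID classes:", len(label_histogram(toy_labels, skewed)))
```

In the skewed split each node holds only three of the ten classes, so a model trained locally on one node sees a biased sample. That is the kind of setting in which, per the summary, PySyft had the lowest accuracy.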