Accelerated distributed expectation-maximization algorithms for the parameter estimation in multivariate Gaussian mixture models

Bibliographic Details
Published in: Applied Mathematical Modelling, 2025-01, Vol. 137, p. 115709, Article 115709
Main Authors: Guo, Guangbao; Wang, Qian; Allison, James; Qian, Guoqi
Format: Article
Language: English
Description
Summary: The rapid growth of big-data modeling calls for effective and efficient methods for estimating the parameters involved. Although several accelerated Expectation-Maximization (EM) algorithms have been developed, two major concerns remain: reducing computational cost and improving model estimation accuracy. We propose three distributed-like algorithms for multivariate Gaussian mixture models that speed up computation and improve estimation accuracy. The first is the distributed EM algorithm, which speeds up the calculation of classic algorithms and improves their estimation accuracy by averaging the one-step estimators obtained from distributed operators. The second is the distributed online EM algorithm, a distributed stochastic approximation procedure that performs online updates as online data are read. The third is the distributed monotonically over-relaxed EM algorithm, which uses an over-relaxation factor and a distributing strategy to improve the estimation accuracy of multivariate Gaussian mixture models. We investigate the stability, sensitivity, convergence, and robustness of these algorithms in a numerical study, and we apply them to three real data sets for validation.
• Multivariate Gaussian mixture models are proposed to fit the Magic, HTRU2 and Skin segmentation data sets.
• Distributed online EM algorithms are applied to analyze these real data sets.
• A distributed monotonically over-relaxed EM method is developed for real data modeling and analysis.
ISSN: 0307-904X
DOI: 10.1016/j.apm.2024.115709
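
The summary describes the first algorithm as averaging one-step estimators obtained from distributed operators. The following is a minimal NumPy sketch of that general idea, not the authors' implementation: the record does not give the exact form of the one-step estimator, so the code assumes each block runs one standard EM update from the shared current parameters and that the block results are combined by a plain unweighted average. All function names (`log_gauss`, `em_step`, `distributed_em`) and the parameters `n_blocks`, `n_iter` and `reg` are hypothetical.

```python
import numpy as np


def log_gauss(X, mean, cov):
    """Log density of N(mean, cov) evaluated at each row of X."""
    d = X.shape[1]
    L = np.linalg.cholesky(cov)                # cov = L @ L.T
    z = np.linalg.solve(L, (X - mean).T)       # whitened residuals, shape (d, n)
    log_det = 2.0 * np.log(np.diag(L)).sum()
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + (z ** 2).sum(axis=0))


def em_step(X, weights, means, covs, reg=1e-6):
    """One standard EM update for a Gaussian mixture on a single data block."""
    n, d = X.shape
    K = len(weights)
    # E-step: responsibilities, computed in log space for numerical stability.
    log_r = np.stack(
        [np.log(weights[k]) + log_gauss(X, means[k], covs[k]) for k in range(K)],
        axis=1,
    )
    log_r -= log_r.max(axis=1, keepdims=True)
    r = np.exp(log_r)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: closed-form updates of weights, means and covariances.
    nk = r.sum(axis=0) + 1e-12
    new_weights = nk / n
    new_means = (r.T @ X) / nk[:, None]
    new_covs = np.empty((K, d, d))
    for k in range(K):
        diff = X - new_means[k]
        # reg is a small ridge term (an assumption) that keeps covs invertible.
        new_covs[k] = (r[:, k, None] * diff).T @ diff / nk[k] + reg * np.eye(d)
    return new_weights, new_means, new_covs


def distributed_em(X, weights, means, covs, n_blocks=4, n_iter=50):
    """Assumed scheme: one EM step per block, then average the block estimators."""
    blocks = np.array_split(X, n_blocks)
    for _ in range(n_iter):
        # Each block starts from the same parameters, so component labels agree.
        results = [em_step(B, weights, means, covs) for B in blocks]
        weights = np.mean([res[0] for res in results], axis=0)
        weights = weights / weights.sum()
        means = np.mean([res[1] for res in results], axis=0)
        covs = np.mean([res[2] for res in results], axis=0)
    return weights, means, covs
```

In this sketch the blocks are processed in a simple loop; in a real distributed setting each `em_step` call would run on a separate worker, and only the small parameter arrays would be exchanged. Rows of `X` are assumed to be shuffled beforehand so that every block is representative of the full data set.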