Stagewise Training With Exponentially Growing Training Sets

Bibliographic Details
Published in: IEEE Transactions on Neural Networks and Learning Systems, 2024-05, Vol. PP, p. 1-11
Main Authors: Gu, Bin, AlQuabeh, Hilal, de Vazelhes, William, Huo, Zhouyuan, Huang, Heng
Format: Article
Language: English
Description
Summary: In the world of big data, training large-scale machine learning problems has gained considerable attention. Numerous innovative optimization strategies have been presented in recent years to accelerate the large-scale training process. However, the possibility of further accelerating the training process of various optimization algorithms remains an unresolved question. To begin addressing this difficult problem, we exploit the finding that, when the training data are independent and identically distributed, the learning problem on a smaller dataset does not differ significantly from the original one. Building on this, we propose a stagewise training technique that grows the size of the training set exponentially while solving a nonsmooth subproblem at each stage. We demonstrate that our stagewise training via exponentially growing the size of the training sets (STEGS) is compatible with a large number of proximal gradient descent and gradient hard thresholding (GHT) techniques. Interestingly, we demonstrate that STEGS can greatly reduce the overall complexity while maintaining statistical accuracy, or even surpassing the intrinsic error introduced by GHT approaches. In addition, we analyze the effect of the training-data growth rate on the overall complexity. Experimental results from applying l_{2,1}- and l_0-norm regularization to a variety of large-scale real-world datasets not only corroborate our theories but also demonstrate the benefits of our STEGS framework.
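
The abstract outlines the core scheme: solve a regularized (nonsmooth) subproblem on a small data subset, then grow the subset exponentially and warm-start the next stage from the previous stage's solution. The sketch below is a rough illustration only, not the authors' implementation: it applies the idea to an l_1-regularized least-squares problem (a stand-in for the paper's l_{2,1}/l_0 settings) solved by proximal gradient descent. The objective, the initial subset size n0, the growth factor, and the inner iteration count are illustrative assumptions, and taking the first m rows assumes the data are already shuffled (i.i.d.).

import numpy as np

def soft_threshold(z, tau):
    # Proximal operator of tau * ||.||_1 (soft thresholding).
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def stagewise_prox_gd(X, y, lam=0.1, n0=128, growth=2.0, inner_iters=100):
    # Stagewise proximal gradient descent on (1/2m)||X_m w - y_m||^2 + lam*||w||_1,
    # where the active subset of size m grows exponentially each stage and each
    # stage is warm-started from the previous stage's solution.
    n, d = X.shape
    w = np.zeros(d)
    m = min(n0, n)
    while True:
        Xs, ys = X[:m], y[:m]                 # assumes rows are pre-shuffled (i.i.d.)
        L = np.linalg.norm(Xs, 2) ** 2 / m    # Lipschitz constant of the smooth part
        eta = 1.0 / L
        for _ in range(inner_iters):
            grad = Xs.T @ (Xs @ w - ys) / m
            w = soft_threshold(w - eta * grad, eta * lam)
        if m == n:
            return w
        m = min(max(int(growth * m), m + 1), n)   # exponential growth, capped at n

# Example: sparse recovery on synthetic data.
rng = np.random.default_rng(0)
X = rng.standard_normal((4096, 64))
w_true = np.zeros(64); w_true[:5] = 1.0
y = X @ w_true + 0.01 * rng.standard_normal(4096)
w_hat = stagewise_prox_gd(X, y, lam=0.05)

The intended payoff, per the abstract, is that early stages run cheap iterations on small subsets, so the total gradient cost can be far lower than running proximal gradient on the full set from scratch while matching statistical accuracy.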
ISSN: 2162-237X
2162-2388
DOI: 10.1109/TNNLS.2024.3402108