ps-CALR: Periodic-Shift Cosine Annealing Learning Rate for Deep Neural Networks
Published in: IEEE Access, 2023, Vol. 11, pp. 139171-139186
Format: Article
Language: English
Summary: There are continued efforts to build on the performance of deep learning (DL) models in various fields of application. Developing new DL models continues to open unprecedented opportunities in diverse application areas despite the enormous resources required. Generally, the learning mechanism of a DL model depends on a cost function (CF), also called a loss function (LF); DL models require varied hyperparameter settings, and in particular parameters that help the model continually minimize the cost function until faster convergence, with better generalization over the data in the loss landscape, is achieved. The learning rate (LR) update seeks to find the optimal solution for DL models through relative cost function minimization. Selecting an appropriate LR is therefore essential to the performance of DL models. Despite its demonstrated fast model convergence, the existing cosine annealing LR schedule lacks complete loss-landscape exploration of the flat minima, which limits the generalization it can achieve. To address this, the paper proposes a periodic-shift cosine annealing learning rate with warm-up epochs (ps-CALR) to perturb the LR update. Six publicly available datasets were used to benchmark the proposed LR method, experimenting with custom DL models (multilayer perceptrons and convolutional neural networks) and pre-trained DL models. The proposed ps-CALR enhances model generalization and convergence, yielding notably better performance than a fixed LR and the existing cosine annealing method.
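The record does not give the exact ps-CALR formula, so the following is only a minimal sketch of the general idea it describes: standard cosine annealing (lr = lr_min + ½(lr_max − lr_min)(1 + cos(πt/T))) preceded by a linear warm-up, with a hypothetical phase-shift parameter `shift` added to perturb the annealing period. The function name, parameter names, and default values are illustrative assumptions, not the authors' implementation.

```python
import math

def ps_calr(epoch, total_epochs, lr_max=0.1, lr_min=0.0,
            warmup_epochs=5, shift=0.25):
    """Sketch of a period-shifted cosine annealing LR with warm-up.

    NOTE: the exact ps-CALR schedule is not given in this record; this
    combines textbook cosine annealing with a linear warm-up and a
    hypothetical phase offset `shift` (as a fraction of a half-period).
    """
    if epoch < warmup_epochs:
        # Linear warm-up: ramp the LR from lr_max/warmup_epochs to lr_max.
        return lr_max * (epoch + 1) / warmup_epochs
    # Fractional progress t in [0, 1] through the annealing phase.
    t = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    # Shift the cosine phase to perturb the usual monotone decay.
    phase = math.pi * (t + shift)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(phase))
```

In a training loop this would be called once per epoch to set the optimizer's learning rate; because (1 + cos(·))/2 stays in [0, 1], the schedule remains bounded between lr_min and lr_max for any shift.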
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2023.3340719