TanhSoft: Dynamic Trainable Activation Functions for Faster Learning and Better Performance
Published in: IEEE Access, 2021, Vol. 9, pp. 120613-120623
Main Authors: , , ,
Format: Article
Language: English
Summary: Deep learning, at its core, contains functions that are the composition of a linear transformation with a nonlinear function known as the activation function. In the past few years, there has been increasing interest in the construction of novel activation functions that result in better learning. In this work, we propose three novel activation functions with learnable parameters, namely TanhSoft-1, TanhSoft-2, and TanhSoft-3, which are shown to outperform several well-known activation functions. For instance, replacing ReLU with TanhSoft-1, TanhSoft-2, and TanhSoft-3 improves top-1 classification accuracy by 6.06%, 5.75%, and 5.38% respectively on VGG-16 (with batch normalization) and by 3.02%, 3.25%, and 2.93% respectively on PreActResNet-34, both on the CIFAR-100 dataset, and by 1.76%, 1.93%, and 1.82% respectively on WideResNet 28-10 on the Tiny ImageNet dataset. TanhSoft-1, TanhSoft-2, and TanhSoft-3 also outperform ReLU in mean average precision (mAP) by 0.7%, 0.8%, and 0.6% respectively on the object detection task with the SSD300 model on the Pascal VOC dataset.
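The record does not reproduce the paper's formulas, but the core idea of an activation function with learnable parameters is easy to illustrate. Below is a minimal PyTorch sketch of a drop-in trainable activation: the specific functional form tanh(a*x) * softplus(x), the learnable scalar `a`, and the class name `TanhSoftLike` are illustrative assumptions, not the paper's exact TanhSoft-1, TanhSoft-2, or TanhSoft-3 definitions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TanhSoftLike(nn.Module):
    """Trainable activation in the spirit of the TanhSoft family.

    Sketch only: the form f(x) = tanh(a * x) * softplus(x), with a
    learnable scalar `a`, is an assumption for illustration; the paper
    defines the actual TanhSoft-1/2/3 parametrizations.
    """

    def __init__(self, init_a: float = 1.0):
        super().__init__()
        # Learnable parameter, updated by backpropagation alongside
        # the rest of the network's weights.
        self.a = nn.Parameter(torch.tensor(init_a))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.tanh(self.a * x) * F.softplus(x)

# Used exactly like ReLU, e.g. in a small convolutional block:
block = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), TanhSoftLike())
out = block(torch.randn(1, 3, 32, 32))  # shape: (1, 16, 32, 32)
```

Because the parameter lives inside an `nn.Module`, any optimizer built over `block.parameters()` will tune the activation's shape jointly with the weights, which is what makes such functions "dynamic" rather than fixed like ReLU.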
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2021.3105355