Improving generalization in deep neural networks using knowledge transformation based on Fisher criterion
Published in: The Journal of Supercomputing, 2023-12, Vol. 79 (18), p. 20899-20922
Format: Article
Language: English
Summary: Most deep neural networks (DNNs) are trained in an over-parametrized regime, in which the number of parameters exceeds the amount of available training data; this reduces generalization capability and performance on new, unseen samples. Generalization of DNNs has been improved by applying various methods such as regularization techniques, data augmentation, network capacity restriction, randomness injection, etc. In this paper, we propose an effective generalization method, named multivariate statistical knowledge transformation, which learns the feature distribution to separate samples based on the variance of the deep hypothesis space in all dimensions. Moreover, the proposed method uses latent knowledge of the target to boost the confidence of its predictions. Compared to state-of-the-art methods, multivariate statistical knowledge transformation yields competitive results. Experimental results show that the proposed method achieves impressive generalization performance on CIFAR-10, CIFAR-100, and Tiny ImageNet, with accuracies of 91.96%, 97.52%, and 99.21%, respectively. Furthermore, the method converges faster during the initial epochs.
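The record does not include the paper's code, but the Fisher criterion the title refers to has a standard form: the ratio of between-class scatter to within-class scatter of feature vectors, which grows as classes become better separated. The sketch below is a minimal illustration of that ratio on a batch of deep features; the function name, the scalar (trace-based) scatter measure, and the toy data are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def fisher_criterion(features, labels):
    """Illustrative Fisher criterion on a batch of deep features:
    ratio of between-class scatter to within-class scatter.
    Higher values mean the classes are better separated."""
    overall_mean = features.mean(axis=0)
    s_between = 0.0
    s_within = 0.0
    for c in np.unique(labels):
        class_feats = features[labels == c]
        class_mean = class_feats.mean(axis=0)
        # Between-class scatter: spread of class means around the overall mean,
        # weighted by class size.
        s_between += len(class_feats) * np.sum((class_mean - overall_mean) ** 2)
        # Within-class scatter: spread of samples around their own class mean.
        s_within += np.sum((class_feats - class_mean) ** 2)
    return s_between / (s_within + 1e-8)  # epsilon guards against zero scatter

# Toy usage: two well-separated Gaussian clusters yield a large ratio.
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0.0, 1.0, (32, 8)),
                   rng.normal(5.0, 1.0, (32, 8))])
labs = np.array([0] * 32 + [1] * 32)
print(fisher_criterion(feats, labs))
```

Maximizing such a ratio (or minimizing its inverse as an auxiliary loss term) is one plausible way a training objective could encourage class separation in feature space, in the spirit of the method described in the summary.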
ISSN: 0920-8542, 1573-0484
DOI: 10.1007/s11227-023-05448-0