A gradient approximation algorithm based weight momentum for restricted Boltzmann machine

Bibliographic Details
Published in:Neurocomputing (Amsterdam) 2019-10, Vol.361, p.40-49
Main Authors: Shen, Huihui, Li, Hongwei
Format: Article
Language:English
Description
Summary:
Highlights:
•A gradient approximation algorithm is proposed for the restricted Boltzmann machine.
•A weight-decay momentum term is added to the RBM's pre-training and fine-tuning phases.
•The algorithm is evaluated on the MNIST, Extended Yale B, and CMU-PIE databases.
•The proposed average contrastive divergence method outperforms other algorithms.

The Restricted Boltzmann Machine (RBM) is a powerful generative model in deep learning that can learn data probability distributions without supervision. Deep architectures can effectively enhance the expressive power of image features in image recognition. However, such deep models either require long training times or perform poorly under the same running-time budget. To address these problems, we propose a new gradient approximation algorithm, called average contrastive divergence (ACD) with weight-decay momentum, for training the RBM. It is an improved contrastive divergence (CD) algorithm combined with weight-decay momentum: different combinations of the weight momentum term are added in the pre-training and fine-tuning phases of the RBM to accelerate network convergence and improve classification performance. Finally, the proposed algorithm is evaluated on the MNIST database and on the Extended Yale B and CMU-PIE face databases. The experimental results show that the proposed learning algorithm gives a better approximation of the log-likelihood gradient and outperforms the other algorithms.
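The core idea the abstract describes, augmenting the contrastive-divergence gradient estimate with a momentum term that also carries L2 weight decay, can be sketched as follows. This is a minimal, generic CD-1 update in NumPy, not the paper's exact ACD rule; the layer sizes and hyperparameters (`lr`, `momentum`, `weight_decay`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy RBM dimensions (illustrative; the paper's experiments use image-sized layers).
n_visible, n_hidden = 6, 4
W = rng.normal(0.0, 0.01, (n_visible, n_hidden))  # weights
b = np.zeros(n_visible)                            # visible biases
c = np.zeros(n_hidden)                             # hidden biases
vel_W = np.zeros_like(W)                           # momentum (velocity) buffer

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, lr=0.1, momentum=0.9, weight_decay=1e-4):
    """One CD-1 update with a weight-decay momentum term (generic sketch)."""
    global W, b, c, vel_W
    # Positive phase: hidden probabilities and a sample given the data vector.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step back to the visible layer and up again.
    pv1 = sigmoid(h0 @ W.T + b)
    ph1 = sigmoid(pv1 @ W + c)
    # CD approximation of the log-likelihood gradient.
    grad_W = np.outer(v0, ph0) - np.outer(pv1, ph1)
    # Velocity update: momentum combined with an L2 weight-decay term.
    vel_W = momentum * vel_W + lr * (grad_W - weight_decay * W)
    W += vel_W
    b += lr * (v0 - pv1)
    c += lr * (ph0 - ph1)
    return np.mean((v0 - pv1) ** 2)  # reconstruction error as a progress proxy

v = rng.integers(0, 2, n_visible).astype(float)
err = cd1_step(v)
```

Plain CD would apply `lr * grad_W` directly; folding the decayed weights into the velocity is what the abstract calls "weight-decay momentum", and it both smooths the noisy CD gradient and keeps the weights small.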
ISSN:0925-2312
1872-8286
DOI:10.1016/j.neucom.2019.07.074