Fast Deep Stacked Networks based on Extreme Learning Machine applied to regression problems

Bibliographic Details
Published in:Neural networks 2020-11, Vol.131, p.14-28
Main Authors: da Silva, Bruno LĂ©gora Souza, Inaba, Fernando Kentaro, Salles, Evandro Ottoni Teatini, Ciarelli, Patrick Marques
Format: Article
Language:English
Summary:Deep learning techniques are commonly used to process large amounts of data and obtain good results in many applications. These methods, however, can lead to long training times. An alternative to tuning all parameters of a large network simultaneously is to stack smaller modules, improving the model's efficiency. However, methods such as the Deep Stacked Network (DSN) have problems that increase training time and memory usage. To address these problems, the Fast DSN (FDSN) was proposed, in which the modules are trained using an Extreme Learning Machine (ELM) variant. Nonetheless, to speed up FDSN training, the ELM random feature mapping is shared among the modules, which can hurt performance if the weights are not chosen properly. In this paper, we focus on the weight initialization of FDSN in order to improve its performance. We also propose FKDSN, a kernel-based variant of FDSN, and discuss the theoretical complexity of the methods. We evaluate three different initialization approaches on ELM-trained neural networks over 50 public real-world regression datasets. Our experiments show that FDSN, when combined with a more complex initialization method, achieves results similar to those of ELM algorithms applied to large single-hidden-layer feedforward networks (SLFNs), while requiring shorter training time and less memory, which suggests it is suitable for systems with restricted resources, such as Internet of Things devices. FKDSN also obtained results and training times similar to the large SLFNs while requiring less memory.
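For context, the ELM training scheme mentioned in the abstract fixes a random hidden-layer feature mapping and solves only for the output weights in closed form. The sketch below is a minimal, generic ELM regressor in Python/NumPy, not the authors' FDSN/FKDSN code; the function names, the tanh activation, and the ridge regularization term are illustrative assumptions.

    import numpy as np

    def elm_train(X, y, n_hidden=100, reg=1e-3, seed=0):
        """Basic ELM regression: random feature mapping + least-squares output weights."""
        rng = np.random.default_rng(seed)
        n_features = X.shape[1]
        # Random input weights and biases form the "random feature mapping";
        # in FDSN this mapping is shared among the stacked modules.
        W = rng.standard_normal((n_features, n_hidden))
        b = rng.standard_normal(n_hidden)
        H = np.tanh(X @ W + b)  # hidden-layer activations (no backpropagation)
        # Regularized least squares for the output weights (closed form).
        beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ y)
        return W, b, beta

    def elm_predict(X, W, b, beta):
        return np.tanh(X @ W + b) @ beta

Because only the linear system for beta is solved, training cost is dominated by one matrix factorization rather than iterative gradient updates, which is what makes ELM-based modules attractive for fast stacking.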
ISSN:0893-6080
1879-2782
DOI:10.1016/j.neunet.2020.07.018