New RNN Activation Technique for Deeper Networks: LSTCM Cells
Published in: IEEE Access, 2020, Vol. 8, pp. 214625-214632
Format: Article
Language: English
Summary: Long short-term memory (LSTM) has shown good performance on sequential data, but gradient vanishing or exploding problems can arise, especially when deeper layers are used to solve complex problems. Thus, in this paper, we propose a new LSTM cell, termed long short-time complex memory (LSTCM), that applies an activation function to the cell state instead of the hidden state for better convergence in deep layers. Moreover, we propose a sinusoidal function as the activation function for LSTM and the proposed LSTCM, in place of the hyperbolic tangent. The performance capabilities of the proposed LSTCM cell and the sinusoidal activation function are demonstrated through experiments on various natural language benchmark datasets, in this case the Penn Treebank, IWSLT 2015 English-Vietnamese, and WMT 2014 English-German datasets.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.3040405
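The summary describes the LSTCM idea only at a high level: move the activation off the hidden-state output and onto the cell state itself, and use a sinusoidal function where a standard LSTM uses the hyperbolic tangent. The following is a minimal NumPy sketch of that description; the gating structure, the weight/bias layout (`W`, `b`), and the exact placement of `np.sin` are assumptions inferred from the abstract, not the paper's published equations.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstcm_step(x, h_prev, c_prev, W, b):
    """One step of a hypothetical LSTCM cell.

    Standard LSTM gating is assumed; following the abstract, sin
    replaces tanh and is applied to the cell state rather than to the
    hidden-state output. The paper's exact equations may differ.
    """
    z = np.concatenate([x, h_prev])              # stacked input and previous hidden state
    f, i, o, g = (W[k] @ z + b[k] for k in range(4))
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)  # forget, input, and output gates
    g = np.sin(g)                                 # candidate values: sin in place of tanh
    c = np.sin(f * c_prev + i * g)                # activation applied to the cell state
    h = o * c                                     # hidden output reuses c directly (no extra tanh)
    return h, c

# Tiny smoke test with random weights (dimensions are illustrative only).
rng = np.random.default_rng(0)
n_in, n_hid = 4, 8
W = [rng.standard_normal((n_hid, n_in + n_hid)) * 0.1 for _ in range(4)]
b = [np.zeros(n_hid) for _ in range(4)]
h, c = np.zeros(n_hid), np.zeros(n_hid)
for t in range(3):                                # unroll a few steps of a random sequence
    h, c = lstcm_step(rng.standard_normal(n_in), h, c, W, b)
print(h.shape, c.shape)                           # (8,) (8,)
```

Like tanh, sin is bounded in [-1, 1], so the substitution keeps the cell state in the same range while changing where the nonlinearity acts; per the abstract, it is this cell-state placement of the activation that the authors credit for better convergence in deep layers.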