
New RNN Activation Technique for Deeper Networks: LSTCM Cells

Bibliographic Details
Published in: IEEE Access, 2020, Vol. 8, pp. 214625-214632
Main Authors: Kang, Soo-Han; Han, Ji-Hyeong
Format: Article
Language: English
Description
Summary: Long short-term memory (LSTM) has shown good performance on sequential data, but the vanishing or exploding gradient problem can arise, especially when deeper layers are used to solve complex problems. In this paper, we therefore propose a new LSTM cell, termed long short-time complex memory (LSTCM), which applies an activation function to the cell state instead of the hidden state for better convergence in deep layers. Moreover, we propose a sinusoidal function as the activation function for LSTM and the proposed LSTCM, in place of the hyperbolic tangent. The performance of the proposed LSTCM cell and the sinusoidal activation function is demonstrated through experiments on natural language benchmark datasets, namely the Penn Treebank, IWSLT 2015 English-Vietnamese, and WMT 2014 English-German datasets.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.3040405
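
Illustrative sketch: the summary above describes the key change only at a high level (an activation applied to the cell state rather than the hidden state, and a sinusoidal activation in place of tanh). Below is a minimal NumPy sketch of one plausible reading of that change, assuming otherwise standard LSTM gating; the function name lstcm_step, the gate layout, and all shapes are our own illustrative assumptions, and the exact LSTCM equations are given in the paper itself.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstcm_step(x, h_prev, c_prev, W, U, b):
    """One recurrent step of an LSTCM-style cell (illustrative, not the paper's exact equations).

    W: (4*hidden, n_in) input weights; U: (4*hidden, hidden) recurrent weights;
    b: (4*hidden,) bias. Assumed gate layout: [forget, input, output, candidate].
    """
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    f = sigmoid(z[0*n:1*n])        # forget gate
    i = sigmoid(z[1*n:2*n])        # input gate
    o = sigmoid(z[2*n:3*n])        # output gate
    g = np.sin(z[3*n:4*n])         # candidate: sinusoidal in place of tanh
    # The abstract's key change, as we read it: activate the cell state itself...
    c = np.sin(f * c_prev + i * g)
    h = o * c                      # ...and emit the hidden state without a second squashing
    return h, c

# Toy usage with random weights (shapes are assumptions for illustration).
rng = np.random.default_rng(0)
n_in, n_hid = 16, 32
W = rng.standard_normal((4 * n_hid, n_in)) * 0.1
U = rng.standard_normal((4 * n_hid, n_hid)) * 0.1
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.standard_normal((10, n_in)):  # a length-10 input sequence
    h, c = lstcm_step(x, h, c, W, U, b)

A plausible intuition for the claimed benefit, in our reading: sin is bounded like tanh, but its derivative (cos) oscillates rather than decaying to zero for large inputs, so gradients through many stacked layers are less prone to saturate.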