
New RNN Activation Technique for Deeper Networks: LSTCM Cells

Long short-term memory (LSTM) has shown good performance when used with sequential data, but the vanishing or exploding gradient problem can arise, especially when using deeper layers to solve complex problems. Thus, in this paper, we propose a new LSTM cell termed long short-time complex memory (LSTCM) that applies an activation function to the cell state instead of the hidden state for better convergence in deep layers. Moreover, we propose a sinusoidal function as the activation function for LSTM and the proposed LSTCM instead of the hyperbolic tangent activation function. The performance capabilities of the proposed LSTCM cell and the sinusoidal activation function are demonstrated through experiments on several natural language benchmark datasets: the Penn Treebank, IWSLT 2015 English-Vietnamese, and WMT 2014 English-German datasets.
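
As a rough illustration of the idea described in the abstract, the following is a minimal NumPy sketch of one time step of an LSTCM-style cell. It rests on two assumptions drawn only from the abstract: the candidate and cell-state activations use sin instead of tanh, and the activation is applied to the cell state itself rather than to the hidden-state output. The function name lstcm_step, the stacked parameter layout (W, U, b), and the exact gate equations are illustrative, not the authors' published formulation.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstcm_step(x, h_prev, c_prev, W, U, b):
    """One time step of a sketch LSTCM cell (hypothetical formulation).

    x:       input vector, shape (D,)
    h_prev:  previous hidden state, shape (H,)
    c_prev:  previous cell state, shape (H,)
    W, U, b: stacked gate parameters of shape (4H, D), (4H, H), (4H,),
             ordered as [input, forget, candidate, output].
    """
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])          # input gate
    f = sigmoid(z[H:2*H])        # forget gate
    g = np.sin(z[2*H:3*H])       # candidate memory, sinusoidal instead of tanh
    o = sigmoid(z[3*H:4*H])      # output gate
    # Activation applied to the cell state itself (assumption based on the abstract) ...
    c = np.sin(f * c_prev + i * g)
    # ... so the hidden state passes the already-activated cell state through the output gate.
    h = o * c
    return h, c

# Tiny usage example with random parameters.
rng = np.random.default_rng(0)
D, H = 5, 4
h, c = np.zeros(H), np.zeros(H)
W, U, b = rng.normal(size=(4*H, D)), rng.normal(size=(4*H, H)), np.zeros(4*H)
for t in range(3):
    h, c = lstcm_step(rng.normal(size=D), h, c, W, U, b)

Compared with a standard LSTM, where h = o * tanh(c) and c itself is unbounded, this variant keeps the cell state squashed by the bounded sin function at every step, which is consistent with the abstract's claim of better convergence in deeper stacks.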

Bibliographic Details
Published in: IEEE Access, 2020, Vol. 8, pp. 214625-214632
Main Authors: Kang, Soo-Han; Han, Ji-Hyeong
Format: Article
Language: English
Subjects: Computer architecture; Datasets; Hyperbolic functions; language modeling; Logic gates; Long short-term memory; Mathematical model; Microprocessors; Natural languages; neural machine translation; Recurrent neural networks; Task analysis
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.3040405
Citations: Items that this one cites / Items that cite this one
Online Access: Get full text