New RNN Activation Technique for Deeper Networks: LSTCM Cells
Long short-term memory (LSTM) has shown good performance when used with sequential data, but the gradient vanishing or exploding problem can arise, especially when using deeper layers to solve complex problems. Thus, in this paper, we propose a new LSTM cell termed long short-time complex memory (LSTCM)...
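The abstract only sketches the mechanism, so the following is a minimal, illustrative NumPy sketch of an LSTM-style cell in the spirit of the description: the nonlinearity (here the proposed sinusoidal function) is applied to the cell state rather than to the hidden-state readout. The gate layout, weight shapes, candidate handling, and names (`lstcm_step`, `sigmoid`) are assumptions made for illustration, not the authors' reference implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstcm_step(x, h_prev, c_prev, W, U, b, act=np.sin):
    """One step of an LSTCM-style cell (illustrative sketch only)."""
    hid = h_prev.shape[0]
    z = W @ x + U @ h_prev + b          # all gate pre-activations at once
    i = sigmoid(z[0 * hid:1 * hid])     # input gate
    f = sigmoid(z[1 * hid:2 * hid])     # forget gate
    o = sigmoid(z[2 * hid:3 * hid])     # output gate
    g = z[3 * hid:4 * hid]              # candidate values

    # Assumed reading of the abstract: the activation is applied to the
    # updated cell state, and the hidden state is read out without a
    # second squashing nonlinearity (unlike h = o * tanh(c) in an LSTM).
    c = act(f * c_prev + i * g)
    h = o * c
    return h, c

# Toy usage with random weights.
rng = np.random.default_rng(0)
n_in, n_hid = 4, 8
W = rng.normal(scale=0.1, size=(4 * n_hid, n_in))
U = rng.normal(scale=0.1, size=(4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h = c = np.zeros(n_hid)
for t in range(5):
    h, c = lstcm_step(rng.normal(size=n_in), h, c, W, U, b)
print(h.round(3))
```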
Published in: | IEEE Access, 2020, Vol. 8, p. 214625-214632 |
---|---|
Main Authors: | Kang, Soo-Han; Han, Ji-Hyeong |
Format: | Article |
Language: | English |
Subjects: | Computer architecture; Datasets; Hyperbolic functions; Language modeling; Logic gates; Long short-term memory; Mathematical model; Microprocessors; Natural languages; Neural machine translation; Recurrent neural networks; Task analysis |
container_end_page | 214632 |
container_issue | |
container_start_page | 214625 |
container_title | IEEE access |
container_volume | 8 |
creator | Kang, Soo-Han; Han, Ji-Hyeong |
description | Long short-term memory (LSTM) has shown good performance when used with sequential data, but the gradient vanishing or exploding problem can arise, especially when using deeper layers to solve complex problems. Thus, in this paper, we propose a new LSTM cell termed long short-time complex memory (LSTCM) that applies an activation function to the cell state instead of the hidden state for better convergence in deep layers. Moreover, we propose a sinusoidal function as an activation function for LSTM and the proposed LSTCM instead of the hyperbolic tangent activation function. The performance capabilities of the proposed LSTCM cell and the sinusoidal activation function are demonstrated through experiments on various natural language benchmark datasets, in this case the Penn Treebank, IWSLT 2015 English-Vietnamese, and WMT 2014 English-German datasets. |
doi_str_mv | 10.1109/ACCESS.2020.3040405 |
format | article |
fulltext | fulltext |
identifier | ISSN: 2169-3536 |
ispartof | IEEE access, 2020, Vol.8, p.214625-214632 |
issn | 2169-3536 2169-3536 |
language | eng |
recordid | cdi_proquest_journals_2469478798 |
source | IEEE Open Access Journals |
subjects | Computer architecture; Datasets; Hyperbolic functions; language modeling; Logic gates; Long short-term memory; Mathematical model; Microprocessors; Natural languages; neural machine translation; Recurrent neural networks; Task analysis |
title | New RNN Activation Technique for Deeper Networks: LSTCM Cells |
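As a rough intuition for the convergence claim in the abstract (my own illustration, not an experiment from the paper), the short snippet below compares the derivatives of the two activations at growing pre-activations: the gradient of tanh saturates toward zero, while the gradient of sin (i.e., cos) oscillates but does not systematically shrink, which is the usual argument for why such an activation may pass gradients through deep layers more easily.

```python
import numpy as np

# Derivatives of the two candidate activations at growing pre-activations.
xs = np.array([0.5, 2.0, 4.0, 8.0])
d_tanh = 1.0 - np.tanh(xs) ** 2   # saturates: tends to 0 as |x| grows
d_sin = np.cos(xs)                # oscillates: stays of order 1
for x, dt, ds in zip(xs, d_tanh, d_sin):
    print(f"x = {x:4.1f}   d tanh/dx = {dt:9.6f}   d sin/dx = {ds:9.6f}")
```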