New RNN Activation Technique for Deeper Networks: LSTCM Cells
Long short-term memory (LSTM) has shown good performance when used with sequential data, but the gradient vanishing or exploding problem can arise, especially when using deeper layers to solve complex problems. Thus, in this paper, we propose a new LSTM cell termed long short-time complex memory (LSTCM)...
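The abstract only sketches the mechanism, so the following is a minimal, illustrative NumPy sketch of an LSTM-style cell in the spirit of the description: the nonlinearity (here the proposed sinusoidal function) is applied to the cell state rather than to the hidden-state readout. The gate layout, weight shapes, candidate handling, and names (`lstcm_step`, `sigmoid`) are assumptions made for illustration, not the authors' reference implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstcm_step(x, h_prev, c_prev, W, U, b, act=np.sin):
    """One step of an LSTCM-style cell (illustrative sketch only)."""
    hid = h_prev.shape[0]
    z = W @ x + U @ h_prev + b          # all gate pre-activations at once
    i = sigmoid(z[0 * hid:1 * hid])     # input gate
    f = sigmoid(z[1 * hid:2 * hid])     # forget gate
    o = sigmoid(z[2 * hid:3 * hid])     # output gate
    g = z[3 * hid:4 * hid]              # candidate values

    # Assumed reading of the abstract: the activation is applied to the
    # updated cell state, and the hidden state is read out without a
    # second squashing nonlinearity (unlike h = o * tanh(c) in an LSTM).
    c = act(f * c_prev + i * g)
    h = o * c
    return h, c

# Toy usage with random weights.
rng = np.random.default_rng(0)
n_in, n_hid = 4, 8
W = rng.normal(scale=0.1, size=(4 * n_hid, n_in))
U = rng.normal(scale=0.1, size=(4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h = c = np.zeros(n_hid)
for t in range(5):
    h, c = lstcm_step(rng.normal(size=n_in), h, c, W, U, b)
print(h.round(3))
```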
Published in: | IEEE Access, 2020, Vol. 8, p. 214625-214632 |
---|---|
Main Authors: | Kang, Soo-Han; Han, Ji-Hyeong |
Format: | Article |
Language: | English |
Subjects: | Computer architecture; Datasets; Hyperbolic functions; Language modeling; Logic gates; Long short-term memory; Mathematical model; Microprocessors; Natural languages; Neural machine translation; Recurrent neural networks; Task analysis |
container_end_page | 214632 |
container_issue | |
container_start_page | 214625 |
container_title | IEEE access |
container_volume | 8 |
creator | Kang, Soo-Han; Han, Ji-Hyeong |
description | Long short-term memory (LSTM) has shown good performance when used with sequential data, but the gradient vanishing or exploding problem can arise, especially when using deeper layers to solve complex problems. Thus, in this paper, we propose a new LSTM cell termed long short-time complex memory (LSTCM) that applies an activation function to the cell state instead of the hidden state for better convergence in deep layers. Moreover, we propose a sinusoidal function as an activation function for LSTM and the proposed LSTCM instead of the hyperbolic tangent activation function. The performance capabilities of the proposed LSTCM cell and the sinusoidal activation function are demonstrated through experiments on various natural language benchmark datasets, in this case the Penn Treebank, IWSLT 2015 English-Vietnamese, and WMT 2014 English-German datasets. |
doi_str_mv | 10.1109/ACCESS.2020.3040405 |
format | article |
fulltext | fulltext |
identifier | ISSN: 2169-3536 |
ispartof | IEEE access, 2020, Vol.8, p.214625-214632 |
issn | 2169-3536 2169-3536 |
language | eng |
recordid | cdi_proquest_journals_2469478798 |
source | IEEE Open Access Journals |
subjects | Computer architecture; Datasets; Hyperbolic functions; language modeling; Logic gates; Long short-term memory; Mathematical model; Microprocessors; Natural languages; neural machine translation; Recurrent neural networks; Task analysis |
title | New RNN Activation Technique for Deeper Networks: LSTCM Cells |
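As a rough intuition for the convergence claim in the abstract (my own illustration, not an experiment from the paper), the short snippet below compares the derivatives of the two activations at growing pre-activations: the gradient of tanh saturates toward zero, while the gradient of sin (i.e., cos) oscillates but does not systematically shrink, which is the usual argument for why such an activation may pass gradients through deep layers more easily.

```python
import numpy as np

# Derivatives of the two candidate activations at growing pre-activations.
xs = np.array([0.5, 2.0, 4.0, 8.0])
d_tanh = 1.0 - np.tanh(xs) ** 2   # saturates: tends to 0 as |x| grows
d_sin = np.cos(xs)                # oscillates: stays of order 1
for x, dt, ds in zip(xs, d_tanh, d_sin):
    print(f"x = {x:4.1f}   d tanh/dx = {dt:9.6f}   d sin/dx = {ds:9.6f}")
```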