
Sign language recognition and translation network based on multi-view data

Sign language recognition and translation can address the communication problem between the hearing-impaired and the general population, and can break the sign language boundaries between different countries and languages. Traditional sign language recognition and translation algorithms use Convo...


Bibliographic Details
Published in: Applied intelligence (Dordrecht, Netherlands), 2022-10, Vol.52 (13), p.14624-14638
Main Authors: Li, Ronghui, Meng, Lu
Format: Article
Language:English
Subjects:
Online Access:Get full text
container_end_page 14638
container_issue 13
container_start_page 14624
container_title Applied intelligence (Dordrecht, Netherlands)
container_volume 52
creator Li, Ronghui
Meng, Lu
description Sign language recognition and translation can address the communication problem between the hearing-impaired and the general population, and can break the sign language boundaries between different countries and languages. Traditional sign language recognition and translation algorithms use Convolutional Neural Networks (CNNs) to extract spatial features and Recurrent Neural Networks (RNNs) to extract temporal features. However, these methods cannot model the complex spatiotemporal features of sign language. Moreover, RNNs and their variants find it difficult to learn long-term dependencies. This paper proposes a novel and effective network based on the Transformer and Graph Convolutional Network (GCN), which can be divided into three parts: a multi-view spatiotemporal embedding network (MSTEN), a continuous sign language recognition network (CSLRN), and a sign language translation network (SLTN). MSTEN extracts the spatiotemporal features of RGB data and skeleton data. CSLRN recognizes sign language glosses and obtains intermediate features from multi-view input sign data. SLTN translates the intermediate features into spoken sentences. The entire network is end-to-end. Our method was tested on three public sign language datasets (SLR-100, RWTH, and CSL-daily), and the results demonstrate that it achieves excellent performance on these datasets.
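The abstract's three-stage design (MSTEN → CSLRN → SLTN) can be sketched, purely as an illustration of the dataflow, as a shape-level pipeline. All dimensions, vocabulary sizes, and function names below are assumptions for the sketch, not values from the paper:

```python
# Shape-level sketch of the three-stage pipeline from the abstract.
# Every dimension here (512, 1000, 8000, ...) is hypothetical.

def msten(rgb_shape, skel_shape, d_model=512):
    """Multi-view SpatioTemporal Embedding Network: fuses per-frame
    RGB features (CNN view) and skeleton features (GCN view) into a
    single embedding sequence."""
    t_rgb, _ = rgb_shape      # (frames, rgb_feature_dim)
    t_skel, _ = skel_shape    # (frames, skeleton_feature_dim)
    assert t_rgb == t_skel, "views must be temporally aligned"
    return (t_rgb, d_model)   # fused spatiotemporal embedding per frame

def cslrn(embed_shape, gloss_vocab=1000):
    """Continuous Sign Language Recognition Network: encodes the frame
    embeddings (e.g. with a Transformer encoder), emitting per-frame
    gloss logits plus intermediate features for the translation stage."""
    t, d = embed_shape
    gloss_logits = (t, gloss_vocab)
    intermediate = (t, d)
    return gloss_logits, intermediate

def sltn(intermediate_shape, max_len=40, word_vocab=8000):
    """Sign Language Translation Network: decodes the intermediate
    features into a spoken-language sentence."""
    return (max_len, word_vocab)  # per-step word logits

# End-to-end dataflow for a hypothetical 120-frame clip:
emb = msten((120, 2048), (120, 256))
glosses, inter = cslrn(emb)
sentence = sltn(inter)
```

The sketch only tracks tensor shapes; it shows why the design is end-to-end trainable — the gloss logits support a recognition loss while the same intermediate features feed the translation decoder, so gradients from both objectives reach the shared embedding network.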
doi_str_mv 10.1007/s10489-022-03407-5
format article
fulltext fulltext
identifier ISSN: 0924-669X
ispartof Applied intelligence (Dordrecht, Netherlands), 2022-10, Vol.52 (13), p.14624-14638
issn 0924-669X
1573-7497
language eng
recordid cdi_proquest_journals_2719936966
source ABI/INFORM Collection; Springer Nature; Linguistics and Language Behavior Abstracts (LLBA)
subjects Algorithms
Artificial Intelligence
Artificial neural networks
Computer Science
Datasets
Feature extraction
Hearing disorders
Language translation
Machines
Manufacturing
Mechanical Engineering
Neural networks
Processes
Recognition
Recurrent neural networks
Sentences
Sign language
Special Issue on Multi-view Learning
Translation methods and strategies
title Sign language recognition and translation network based on multi-view data