
Sign language recognition and translation network based on multi-view data

Sign language recognition and translation can address the communication problem between the hearing-impaired and the general population, and can break the sign language boundaries between different countries and languages. Traditional sign language recognition and translation algorithms use Convo...


Bibliographic Details
Published in: Applied intelligence (Dordrecht, Netherlands), 2022-10, Vol.52 (13), p.14624-14638
Main Authors: Li, Ronghui, Meng, Lu
Format: Article
Language:English
Subjects:
Online Access:Get full text
container_end_page 14638
container_issue 13
container_start_page 14624
container_title Applied intelligence (Dordrecht, Netherlands)
container_volume 52
creator Li, Ronghui
Meng, Lu
description Sign language recognition and translation can address the communication problem between the hearing-impaired and the general population, and can break the sign language boundaries between different countries and languages. Traditional sign language recognition and translation algorithms use Convolutional Neural Networks (CNNs) to extract spatial features and Recurrent Neural Networks (RNNs) to extract temporal features. However, these methods cannot model the complex spatiotemporal features of sign language. Moreover, RNNs and their variants find it difficult to learn long-term dependencies. This paper proposes a novel and effective network based on the Transformer and Graph Convolutional Network (GCN), which can be divided into three parts: a multi-view spatiotemporal embedding network (MSTEN), a continuous sign language recognition network (CSLRN), and a sign language translation network (SLTN). MSTEN extracts the spatiotemporal features of RGB data and skeleton data. CSLRN recognizes sign language glosses and obtains intermediate features from multi-view input sign data. SLTN translates the intermediate features into spoken sentences. The entire network is end-to-end. Our method was tested on three public sign language datasets (SLR-100, RWTH, and CSL-daily), and the results demonstrate that it achieves excellent performance on these datasets.
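The abstract's three-stage design (MSTEN → CSLRN → SLTN) can be sketched, purely as an illustration of the dataflow, as a shape-level pipeline. All dimensions, vocabulary sizes, and function names below are assumptions for the sketch, not values from the paper:

```python
# Shape-level sketch of the three-stage pipeline from the abstract.
# Every dimension here (512, 1000, 8000, ...) is hypothetical.

def msten(rgb_shape, skel_shape, d_model=512):
    """Multi-view SpatioTemporal Embedding Network: fuses per-frame
    RGB features (CNN view) and skeleton features (GCN view) into a
    single embedding sequence."""
    t_rgb, _ = rgb_shape      # (frames, rgb_feature_dim)
    t_skel, _ = skel_shape    # (frames, skeleton_feature_dim)
    assert t_rgb == t_skel, "views must be temporally aligned"
    return (t_rgb, d_model)   # fused spatiotemporal embedding per frame

def cslrn(embed_shape, gloss_vocab=1000):
    """Continuous Sign Language Recognition Network: encodes the frame
    embeddings (e.g. with a Transformer encoder), emitting per-frame
    gloss logits plus intermediate features for the translation stage."""
    t, d = embed_shape
    gloss_logits = (t, gloss_vocab)
    intermediate = (t, d)
    return gloss_logits, intermediate

def sltn(intermediate_shape, max_len=40, word_vocab=8000):
    """Sign Language Translation Network: decodes the intermediate
    features into a spoken-language sentence."""
    return (max_len, word_vocab)  # per-step word logits

# End-to-end dataflow for a hypothetical 120-frame clip:
emb = msten((120, 2048), (120, 256))
glosses, inter = cslrn(emb)
sentence = sltn(inter)
```

The sketch only tracks tensor shapes; it shows why the design is end-to-end trainable — the gloss logits support a recognition loss while the same intermediate features feed the translation decoder, so gradients from both objectives reach the shared embedding network.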
doi_str_mv 10.1007/s10489-022-03407-5
format article
fulltext fulltext
identifier ISSN: 0924-669X
ispartof Applied intelligence (Dordrecht, Netherlands), 2022-10, Vol.52 (13), p.14624-14638
issn 0924-669X
1573-7497
language eng
recordid cdi_proquest_journals_2719936966
source ABI/INFORM Collection; Springer Nature; Linguistics and Language Behavior Abstracts (LLBA)
subjects Algorithms
Artificial Intelligence
Artificial neural networks
Computer Science
Datasets
Feature extraction
Hearing disorders
Language translation
Machines
Manufacturing
Mechanical Engineering
Neural networks
Processes
Recognition
Recurrent neural networks
Sentences
Sign language
Special Issue on Multi-view Learning
Translation methods and strategies
title Sign language recognition and translation network based on multi-view data