Sign language recognition and translation network based on multi-view data
Sign language recognition and translation can address the communication problem between the hearing-impaired and the general population, and can break the sign language boundaries between different countries and different languages. Traditional sign language recognition and translation algorithms use Convo...
Published in: | Applied intelligence (Dordrecht, Netherlands), 2022-10, Vol.52 (13), p.14624-14638 |
---|---|
Main Authors: | Li, Ronghui ; Meng, Lu |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites ; Items that cite this one |
Online Access: | Get full text |
Tags: |
cited_by | cdi_FETCH-LOGICAL-c319t-e8d49e4187615e6e94933bef2b48f707cb62f35e33efb8a103bfba1a4b901f4e3 |
---|---|
cites | cdi_FETCH-LOGICAL-c319t-e8d49e4187615e6e94933bef2b48f707cb62f35e33efb8a103bfba1a4b901f4e3 |
container_end_page | 14638 |
container_issue | 13 |
container_start_page | 14624 |
container_title | Applied intelligence (Dordrecht, Netherlands) |
container_volume | 52 |
creator | Li, Ronghui Meng, Lu |
description | Sign language recognition and translation can address the communication problem between the hearing-impaired and the general population, and can break the sign language boundaries between different countries and different languages. Traditional sign language recognition and translation algorithms use Convolutional Neural Networks (CNNs) to extract spatial features and Recurrent Neural Networks (RNNs) to extract temporal features. However, these methods cannot model the complex spatiotemporal features of sign language. Moreover, RNN and its variant algorithms find it difficult to learn long-term dependencies. This paper proposes a novel and effective network based on Transformer and Graph Convolutional Network (GCN), which can be divided into three parts: a multi-view spatiotemporal embedding network (MSTEN), a continuous sign language recognition network (CSLRN), and a sign language translation network (SLTN). MSTEN can extract the spatiotemporal features of RGB data and skeleton data. CSLRN can recognize sign language glosses and obtain intermediate features from multi-view input sign data. SLTN can translate intermediate features into spoken sentences. The entire network was designed as end-to-end. Our method was tested on three public sign language datasets (SLR-100, RWTH, and CSL-daily) and the results demonstrated that our method achieved excellent performance on these datasets. |
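The abstract describes a three-stage pipeline: MSTEN fuses multi-view (RGB + skeleton) frames into spatiotemporal embeddings, CSLRN produces per-frame gloss predictions and intermediate features, and SLTN decodes those features into a spoken sentence. The paper's actual layer definitions are not given in this record, so the data flow can only be sketched with hypothetical NumPy stand-ins — every module body, dimension, and name below is an assumption for illustration, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def msten(rgb, skeleton, d_model=512):
    """Multi-view spatiotemporal embedding (stand-in): project
    per-frame RGB and skeleton features into one shared sequence."""
    w_rgb = rng.standard_normal((rgb.shape[1], d_model))
    w_skel = rng.standard_normal((skeleton.shape[1], d_model))
    return rgb @ w_rgb + skeleton @ w_skel            # (T, d_model)

def cslrn(features, n_gloss=100):
    """Continuous sign language recognition (stand-in): per-frame
    gloss logits plus the intermediate features passed to SLTN."""
    w = rng.standard_normal((features.shape[1], n_gloss))
    return features @ w, features                     # (T, n_gloss), (T, d_model)

def sltn(intermediate, n_vocab=3000, max_len=20):
    """Sign language translation (stand-in): map intermediate
    features to a spoken-sentence token-id sequence."""
    w = rng.standard_normal((intermediate.shape[1], n_vocab))
    logits = intermediate[:max_len] @ w
    return logits.argmax(axis=1)                      # (max_len,)

# End-to-end flow on dummy multi-view input: 30 frames,
# 1024-dim RGB features, 150-dim skeleton features (all assumed sizes).
rgb = rng.standard_normal((30, 1024))
skeleton = rng.standard_normal((30, 150))
emb = msten(rgb, skeleton)
gloss_logits, inter = cslrn(emb)
tokens = sltn(inter)
print(emb.shape, gloss_logits.shape, tokens.shape)
```

The sketch only shows how the three sub-networks compose end-to-end (the output of each stage is the input of the next); in the paper the stand-in projections are Transformer and GCN blocks.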
doi_str_mv | 10.1007/s10489-022-03407-5 |
format | article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2719936966</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2719936966</sourcerecordid><originalsourceid>FETCH-LOGICAL-c319t-e8d49e4187615e6e94933bef2b48f707cb62f35e33efb8a103bfba1a4b901f4e3</originalsourceid><addsrcrecordid>eNp9kE1LxDAQhoMouK7-AU8Fz9FJkybNURY_WfCggreQtJPStZuuSeviv7fuCt48DTM87zvwEHLO4JIBqKvEQJSaQp5T4AIULQ7IjBWKUyW0OiQz0LmgUuq3Y3KS0goAOAc2I4_PbROyzoZmtA1mEau-Ce3Q9iGzoc6GaEPq7G4POGz7-J45m7DOpsN67IaWfra4zWo72FNy5G2X8Ox3zsnr7c3L4p4un-4eFtdLWnGmB4plLTQKVirJCpSohebcoc-dKL0CVTmZe14g5-hdaRlw551lVjgNzAvkc3Kx793E_mPENJhVP8YwvTS5YlpzqaWcqHxPVbFPKaI3m9iubfwyDMyPM7N3ZiZnZufMFFOI70NpgkOD8a_6n9Q3uqZvlw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2719936966</pqid></control><display><type>article</type><title>Sign language recognition and translation network based on multi-view data</title><source>ABI/INFORM Collection</source><source>Springer Nature</source><source>Linguistics and Language Behavior Abstracts (LLBA)</source><creator>Li, Ronghui ; Meng, Lu</creator><creatorcontrib>Li, Ronghui ; Meng, Lu</creatorcontrib><description>Sign language recognition and translation can address the communication problem between hearing-impaired and general population, and can break the sign language boundariesy between different countries and different languages. Traditional sign language recognition and translation algorithms use Convolutional Neural Networks (CNNs) to extract spatial features and Recurrent Neural Networks (RNNs) to extract temporal features. However, these methods cannot model the complex spatiotemporal features of sign language. Moreover, RNN and its variant algorithms find it difficult to learn long-term dependencies. 
This paper proposes a novel and effective network based on Transformer and Graph Convolutional Network (GCN), which can be divided into three parts: a multi-view spatiotemporal embedding network (MSTEN), a continuous sign language recognition network (CSLRN), and a sign language translation network (SLTN). MSTEN can extract the spatiotemporal features of RGB data and skeleton data. CSLRN can recognize sign language glosses and obtain intermediate features from multi-view input sign data. SLTN can translate intermediate features into spoken sentences. The entire network was designed as end-to-end. Our method was tested on three public sign language datasets (SLR-100, RWTH, and CSL-daily) and the results demonstrated that our method achieved excellent performance on these datasets.</description><identifier>ISSN: 0924-669X</identifier><identifier>EISSN: 1573-7497</identifier><identifier>DOI: 10.1007/s10489-022-03407-5</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Algorithms ; Artificial Intelligence ; Artificial neural networks ; Computer Science ; Datasets ; Feature extraction ; Hearing disorders ; Language translation ; Machines ; Manufacturing ; Mechanical Engineering ; Neural networks ; Processes ; Recognition ; Recurrent neural networks ; Sentences ; Sign language ; Special Issue on Multi-view Learning ; Translation methods and strategies</subject><ispartof>Applied intelligence (Dordrecht, Netherlands), 2022-10, Vol.52 (13), p.14624-14638</ispartof><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022</rights><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 
2022.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c319t-e8d49e4187615e6e94933bef2b48f707cb62f35e33efb8a103bfba1a4b901f4e3</citedby><cites>FETCH-LOGICAL-c319t-e8d49e4187615e6e94933bef2b48f707cb62f35e33efb8a103bfba1a4b901f4e3</cites><orcidid>0000-0003-2442-8354</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.proquest.com/docview/2719936966/fulltextPDF?pq-origsite=primo$$EPDF$$P50$$Gproquest$$H</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/2719936966?pq-origsite=primo$$EHTML$$P50$$Gproquest$$H</linktohtml><link.rule.ids>314,780,784,11686,12849,27922,27923,31267,36058,44361,74665</link.rule.ids></links><search><creatorcontrib>Li, Ronghui</creatorcontrib><creatorcontrib>Meng, Lu</creatorcontrib><title>Sign language recognition and translation network based on multi-view data</title><title>Applied intelligence (Dordrecht, Netherlands)</title><addtitle>Appl Intell</addtitle><description>Sign language recognition and translation can address the communication problem between hearing-impaired and general population, and can break the sign language boundariesy between different countries and different languages. Traditional sign language recognition and translation algorithms use Convolutional Neural Networks (CNNs) to extract spatial features and Recurrent Neural Networks (RNNs) to extract temporal features. However, these methods cannot model the complex spatiotemporal features of sign language. Moreover, RNN and its variant algorithms find it difficult to learn long-term dependencies. 
This paper proposes a novel and effective network based on Transformer and Graph Convolutional Network (GCN), which can be divided into three parts: a multi-view spatiotemporal embedding network (MSTEN), a continuous sign language recognition network (CSLRN), and a sign language translation network (SLTN). MSTEN can extract the spatiotemporal features of RGB data and skeleton data. CSLRN can recognize sign language glosses and obtain intermediate features from multi-view input sign data. SLTN can translate intermediate features into spoken sentences. The entire network was designed as end-to-end. Our method was tested on three public sign language datasets (SLR-100, RWTH, and CSL-daily) and the results demonstrated that our method achieved excellent performance on these datasets.</description><subject>Algorithms</subject><subject>Artificial Intelligence</subject><subject>Artificial neural networks</subject><subject>Computer Science</subject><subject>Datasets</subject><subject>Feature extraction</subject><subject>Hearing disorders</subject><subject>Language translation</subject><subject>Machines</subject><subject>Manufacturing</subject><subject>Mechanical Engineering</subject><subject>Neural networks</subject><subject>Processes</subject><subject>Recognition</subject><subject>Recurrent neural networks</subject><subject>Sentences</subject><subject>Sign language</subject><subject>Special Issue on Multi-view Learning</subject><subject>Translation methods and 
strategies</subject><issn>0924-669X</issn><issn>1573-7497</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>7T9</sourceid><sourceid>M0C</sourceid><recordid>eNp9kE1LxDAQhoMouK7-AU8Fz9FJkybNURY_WfCggreQtJPStZuuSeviv7fuCt48DTM87zvwEHLO4JIBqKvEQJSaQp5T4AIULQ7IjBWKUyW0OiQz0LmgUuq3Y3KS0goAOAc2I4_PbROyzoZmtA1mEau-Ce3Q9iGzoc6GaEPq7G4POGz7-J45m7DOpsN67IaWfra4zWo72FNy5G2X8Ox3zsnr7c3L4p4un-4eFtdLWnGmB4plLTQKVirJCpSohebcoc-dKL0CVTmZe14g5-hdaRlw551lVjgNzAvkc3Kx793E_mPENJhVP8YwvTS5YlpzqaWcqHxPVbFPKaI3m9iubfwyDMyPM7N3ZiZnZufMFFOI70NpgkOD8a_6n9Q3uqZvlw</recordid><startdate>20221001</startdate><enddate>20221001</enddate><creator>Li, Ronghui</creator><creator>Meng, Lu</creator><general>Springer US</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7SC</scope><scope>7T9</scope><scope>7WY</scope><scope>7WZ</scope><scope>7XB</scope><scope>87Z</scope><scope>8AL</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FK</scope><scope>8FL</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BEZIV</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FRNLG</scope><scope>F~G</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K60</scope><scope>K6~</scope><scope>K7-</scope><scope>L.-</scope><scope>L6V</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>M0C</scope><scope>M0N</scope><scope>M7S</scope><scope>P5Z</scope><scope>P62</scope><scope>PQBIZ</scope><scope>PQBZA</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PSYQQ</scope><scope>PTHSS</scope><scope>Q9U</scope><orcidid>https://orcid.org/0000-0003-2442-8354</orcidid></search><sort><creationdate>20221001</creationdate><title>Sign language recognition and translation network based on multi-view 
data</title><author>Li, Ronghui ; Meng, Lu</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c319t-e8d49e4187615e6e94933bef2b48f707cb62f35e33efb8a103bfba1a4b901f4e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Algorithms</topic><topic>Artificial Intelligence</topic><topic>Artificial neural networks</topic><topic>Computer Science</topic><topic>Datasets</topic><topic>Feature extraction</topic><topic>Hearing disorders</topic><topic>Language translation</topic><topic>Machines</topic><topic>Manufacturing</topic><topic>Mechanical Engineering</topic><topic>Neural networks</topic><topic>Processes</topic><topic>Recognition</topic><topic>Recurrent neural networks</topic><topic>Sentences</topic><topic>Sign language</topic><topic>Special Issue on Multi-view Learning</topic><topic>Translation methods and strategies</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Li, Ronghui</creatorcontrib><creatorcontrib>Meng, Lu</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Computer and Information Systems Abstracts</collection><collection>Linguistics and Language Behavior Abstracts (LLBA)</collection><collection>ABI/INFORM Collection</collection><collection>ABI/INFORM Global (PDF only)</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>ABI/INFORM Collection</collection><collection>Computing Database (Alumni Edition)</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ABI/INFORM Collection (Alumni Edition)</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central 
(Alumni)</collection><collection>ProQuest Central</collection><collection>Advanced Technologies & Aerospace Database (1962 - current)</collection><collection>ProQuest Central Essentials</collection><collection>AUTh Library subscriptions: ProQuest Central</collection><collection>ProQuest Business Premium Collection</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>Business Premium Collection (Alumni)</collection><collection>ABI/INFORM Global (Corporate)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection (Proquest) (PQ_SDU_P3)</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Business Collection (Alumni Edition)</collection><collection>ProQuest Business Collection</collection><collection>Computer Science Database</collection><collection>ABI/INFORM Professional Advanced</collection><collection>ProQuest Engineering Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ABI/INFORM Collection</collection><collection>Computing Database</collection><collection>Engineering Database</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>ProQuest One Business</collection><collection>ProQuest One Business (Alumni)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest One Psychology</collection><collection>Engineering 
Collection</collection><collection>ProQuest Central Basic</collection><jtitle>Applied intelligence (Dordrecht, Netherlands)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Li, Ronghui</au><au>Meng, Lu</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Sign language recognition and translation network based on multi-view data</atitle><jtitle>Applied intelligence (Dordrecht, Netherlands)</jtitle><stitle>Appl Intell</stitle><date>2022-10-01</date><risdate>2022</risdate><volume>52</volume><issue>13</issue><spage>14624</spage><epage>14638</epage><pages>14624-14638</pages><issn>0924-669X</issn><eissn>1573-7497</eissn><abstract>Sign language recognition and translation can address the communication problem between hearing-impaired and general population, and can break the sign language boundariesy between different countries and different languages. Traditional sign language recognition and translation algorithms use Convolutional Neural Networks (CNNs) to extract spatial features and Recurrent Neural Networks (RNNs) to extract temporal features. However, these methods cannot model the complex spatiotemporal features of sign language. Moreover, RNN and its variant algorithms find it difficult to learn long-term dependencies. This paper proposes a novel and effective network based on Transformer and Graph Convolutional Network (GCN), which can be divided into three parts: a multi-view spatiotemporal embedding network (MSTEN), a continuous sign language recognition network (CSLRN), and a sign language translation network (SLTN). MSTEN can extract the spatiotemporal features of RGB data and skeleton data. CSLRN can recognize sign language glosses and obtain intermediate features from multi-view input sign data. SLTN can translate intermediate features into spoken sentences. The entire network was designed as end-to-end. 
Our method was tested on three public sign language datasets (SLR-100, RWTH, and CSL-daily) and the results demonstrated that our method achieved excellent performance on these datasets.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s10489-022-03407-5</doi><tpages>15</tpages><orcidid>https://orcid.org/0000-0003-2442-8354</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0924-669X |
ispartof | Applied intelligence (Dordrecht, Netherlands), 2022-10, Vol.52 (13), p.14624-14638 |
issn | 0924-669X 1573-7497 |
language | eng |
recordid | cdi_proquest_journals_2719936966 |
source | ABI/INFORM Collection; Springer Nature; Linguistics and Language Behavior Abstracts (LLBA) |
subjects | Algorithms; Artificial Intelligence; Artificial neural networks; Computer Science; Datasets; Feature extraction; Hearing disorders; Language translation; Machines; Manufacturing; Mechanical Engineering; Neural networks; Processes; Recognition; Recurrent neural networks; Sentences; Sign language; Special Issue on Multi-view Learning; Translation methods and strategies |
title | Sign language recognition and translation network based on multi-view data |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-13T12%3A23%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Sign%20language%20recognition%20and%20translation%20network%20based%20on%20multi-view%20data&rft.jtitle=Applied%20intelligence%20(Dordrecht,%20Netherlands)&rft.au=Li,%20Ronghui&rft.date=2022-10-01&rft.volume=52&rft.issue=13&rft.spage=14624&rft.epage=14638&rft.pages=14624-14638&rft.issn=0924-669X&rft.eissn=1573-7497&rft_id=info:doi/10.1007/s10489-022-03407-5&rft_dat=%3Cproquest_cross%3E2719936966%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c319t-e8d49e4187615e6e94933bef2b48f707cb62f35e33efb8a103bfba1a4b901f4e3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2719936966&rft_id=info:pmid/&rfr_iscdi=true |