Deep-learning-based real-time silent speech recognition using facial electromyogram recorded around eyes for hands-free interfacing in a virtual reality environment
Published in: Virtual reality : the journal of the Virtual Reality Society, 2022-09, Vol. 26 (3), p. 1047-1057
Main Authors: Cha, Ho-Seung; Chang, Won-Du; Im, Chang-Hwan
Format: Article
Language: English
container_end_page | 1057 |
container_issue | 3 |
container_start_page | 1047 |
container_title | Virtual reality : the journal of the Virtual Reality Society |
container_volume | 26 |
creator | Cha, Ho-Seung ; Chang, Won-Du ; Im, Chang-Hwan |
description | Speech recognition technology is a promising hands-free interfacing modality for virtual reality (VR) applications. However, it has several drawbacks, such as limited usability in a noisy environment or a public place and limited accessibility to those who cannot generate loud and clear voices. These limitations may be overcome by employing a silent speech recognition (SSR) technology utilizing facial electromyograms (fEMGs) in a VR environment. In the conventional SSR systems, however, fEMG electrodes were attached around the user’s lips and neck, thereby creating new practical issues, such as the requirement of an additional wearable system besides the VR headset, necessity of a complex and time-consuming procedure for attaching the fEMG electrodes, and discomfort and limited facial muscle movements of the user. To solve these problems, we propose an SSR system using fEMGs measured by a few electrodes attached around the eyes of a user, which can also be easily incorporated into available VR headsets. To enhance the accuracy of classifying the fEMG signals recorded from limited recording locations relatively far from the phonatory organs, a deep neural network-based classification method was developed using similar fEMG data previously collected from other individuals and then transformed by dynamic positional warping. In the experiments, the proposed SSR system could classify six different fEMG patterns generated by six silently spoken words with an accuracy of 92.53%. To further demonstrate that our SSR system can be used as a hands-free control interface in practical VR applications, an online SSR system was implemented. |
doi_str_mv | 10.1007/s10055-021-00616-0 |
format | article |
fulltext | fulltext |
identifier | ISSN: 1359-4338 |
ispartof | Virtual reality : the journal of the Virtual Reality Society, 2022-09, Vol.26 (3), p.1047-1057 |
issn | 1359-4338 ; 1434-9957 |
language | eng |
recordid | cdi_proquest_journals_2708604330 |
source | Springer Link |
subjects | Artificial Intelligence ; Artificial neural networks ; Computer Graphics ; Computer Science ; Electrodes ; Electromyography ; Headsets ; Image Processing and Computer Vision ; Machine learning ; Muscles ; Original Article ; Signal classification ; Speech recognition ; User Interfaces and Human Computer Interaction ; Virtual reality ; Voice recognition |
title | Deep-learning-based real-time silent speech recognition using facial electromyogram recorded around eyes for hands-free interfacing in a virtual reality environment |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T19%3A23%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Deep-learning-based%20real-time%20silent%20speech%20recognition%20using%20facial%20electromyogram%20recorded%20around%20eyes%20for%20hands-free%20interfacing%20in%20a%C2%A0virtual%20reality%20environment&rft.jtitle=Virtual%20reality%20:%20the%20journal%20of%20the%20Virtual%20Reality%20Society&rft.au=Cha,%20Ho-Seung&rft.date=2022-09-01&rft.volume=26&rft.issue=3&rft.spage=1047&rft.epage=1057&rft.pages=1047-1057&rft.issn=1359-4338&rft.eissn=1434-9957&rft_id=info:doi/10.1007/s10055-021-00616-0&rft_dat=%3Cproquest_cross%3E2708604330%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c319t-efbc47f2ce8c178b14345a917552e789bc401e3d4b943c121e377ddc06531e2a3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2708604330&rft_id=info:pmid/&rfr_iscdi=true |
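The abstract describes aligning fEMG templates from other individuals before classification ("dynamic positional warping") to compensate for limited electrode locations. As a rough illustration of that align-then-compare idea only, the sketch below uses classic dynamic time warping with a nearest-template classifier; the function names (`dtw_distance`, `classify`), the toy templates, and the use of DTW in place of the paper's actual dynamic positional warping and deep neural network are all assumptions for illustration, not the authors' method.

```python
import numpy as np

def dtw_distance(a, b):
    """Cumulative dynamic-time-warping cost between two 1-D signals.

    Illustrative stand-in: the paper transforms other users' fEMG data
    with dynamic positional warping; this classic DTW recurrence merely
    demonstrates the same align-before-comparing principle.
    """
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])              # local distance
            cost[i, j] = d + min(cost[i - 1, j],      # stretch signal a
                                 cost[i, j - 1],      # stretch signal b
                                 cost[i - 1, j - 1])  # advance both
    return cost[n, m]

def classify(signal, templates):
    """Return the label whose template has the smallest warped distance.

    `templates` maps label -> 1-D reference array (hypothetical data).
    """
    return min(templates, key=lambda lab: dtw_distance(signal, templates[lab]))

# A time-stretched version of the "select" template still matches it.
templates = {"select": np.array([0.0, 1.0, 0.0]),
             "back":   np.array([1.0, 0.0, 1.0])}
print(classify(np.array([0.0, 0.0, 1.0, 1.0, 0.0]), templates))  # "select"
```

The paper replaces both pieces with stronger components (positional rather than temporal warping, and a deep neural network rather than nearest-template matching), but the control flow — warp reference data into the user's frame, then classify — is the same.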