
Deep-learning-based real-time silent speech recognition using facial electromyogram recorded around eyes for hands-free interfacing in a virtual reality environment


Bibliographic Details
Published in: Virtual reality: the journal of the Virtual Reality Society, 2022-09, Vol. 26 (3), pp. 1047–1057
Main Authors: Cha, Ho-Seung; Chang, Won-Du; Im, Chang-Hwan
Format: Article
Language: English
Online Access: Get full text
Description: Speech recognition technology is a promising hands-free interfacing modality for virtual reality (VR) applications. However, it has several drawbacks, such as limited usability in a noisy environment or a public place and limited accessibility to those who cannot generate loud and clear voices. These limitations may be overcome by employing a silent speech recognition (SSR) technology utilizing facial electromyograms (fEMGs) in a VR environment. In the conventional SSR systems, however, fEMG electrodes were attached around the user's lips and neck, thereby creating new practical issues, such as the requirement of an additional wearable system besides the VR headset, necessity of a complex and time-consuming procedure for attaching the fEMG electrodes, and discomfort and limited facial muscle movements of the user. To solve these problems, we propose an SSR system using fEMGs measured by a few electrodes attached around the eyes of a user, which can also be easily incorporated into available VR headsets. To enhance the accuracy of classifying the fEMG signals recorded from limited recording locations relatively far from the phonatory organs, a deep neural network-based classification method was developed using similar fEMG data previously collected from other individuals and then transformed by dynamic positional warping. In the experiments, the proposed SSR system could classify six different fEMG patterns generated by six silently spoken words with an accuracy of 92.53%. To further demonstrate that our SSR system can be used as a hands-free control interface in practical VR applications, an online SSR system was implemented.
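The paper's cross-subject transformation, dynamic positional warping, is the authors' own algorithm and is not reproduced here. As a rough sketch of the general idea it builds on — aligning one signal envelope to another's time base before pooling data across individuals for a classifier — the following uses plain dynamic time warping on toy data. All function names and the sinusoid stand-in for an fEMG envelope are illustrative assumptions, not from the paper.

```python
import numpy as np

def dtw_path(a, b):
    """Dynamic-programming alignment path between two 1-D sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    # Backtrack from (n, m) to (0, 0) along the cheapest predecessors.
    i, j, path = n, m, []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        moves = {(i - 1, j - 1): D[i - 1, j - 1],
                 (i - 1, j): D[i - 1, j],
                 (i, j - 1): D[i, j - 1]}
        i, j = min(moves, key=moves.get)
    return path[::-1]

def positional_warp(src, ref):
    """Warp `src` onto the time base of `ref` by averaging aligned samples."""
    warped = np.zeros(len(ref))
    counts = np.zeros(len(ref))
    for i, j in dtw_path(src, ref):
        warped[j] += src[i]
        counts[j] += 1
    # Every index of `ref` appears in a DTW path, so counts is never zero.
    return warped / counts

# Toy demo: a nonlinearly time-warped sinusoid stands in for another
# subject's fEMG envelope of the same silently spoken word.
t = np.linspace(0.0, 1.0, 60)
ref = np.sin(2 * np.pi * t)
src = np.sin(2 * np.pi * t ** 1.4)
warped = positional_warp(src, ref)
```

After warping, the borrowed signal shares the reference's time base, so samples from different speakers can be stacked into one training set; the deep network then only has to model amplitude patterns, not per-subject timing differences.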
DOI: 10.1007/s10055-021-00616-0
ISSN: 1359-4338
EISSN: 1434-9957
Source: Springer Link
Subjects:
Artificial Intelligence
Artificial neural networks
Computer Graphics
Computer Science
Electrodes
Electromyography
Headsets
Image Processing and Computer Vision
Machine learning
Muscles
Original Article
Signal classification
Speech recognition
User Interfaces and Human Computer Interaction
Virtual reality
Voice recognition