Loading…
Transformer-based hand gesture recognition from instantaneous to fused neural decomposition of high-density EMG signals
Designing efficient and labor-saving prosthetic hands requires powerful hand gesture recognition algorithms that can achieve high accuracy with limited complexity and latency. In this context, the paper proposes a Compact Transformer-based Hand Gesture Recognition framework referred to as CT-HGR , w...
Saved in:
Published in: | Scientific reports 2023-07, Vol.13 (1), p.11000-11000, Article 11000 |
---|---|
Main Authors: | , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Designing efficient and labor-saving prosthetic hands requires powerful hand gesture recognition algorithms that can achieve high accuracy with limited complexity and latency. In this context, the paper proposes a Compact Transformer-based Hand Gesture Recognition framework referred to as
CT-HGR
, which employs a vision transformer network to conduct hand gesture recognition using high-density surface EMG (HD-sEMG) signals. Taking advantage of the attention mechanism, which is incorporated into the transformer architectures, our proposed
CT-HGR
framework overcomes major constraints associated with most of the existing deep learning models such as model complexity; requiring feature engineering; inability to consider both temporal and spatial information of HD-sEMG signals, and requiring a large number of training samples. The attention mechanism in the proposed model identifies similarities among different data segments with a greater capacity for parallel computations and addresses the memory limitation problems while dealing with inputs of large sequence lengths.
CT-HGR
can be trained from scratch without any need for transfer learning and can simultaneously extract both temporal and spatial features of HD-sEMG data. Additionally, the
CT-HGR
framework can perform instantaneous recognition using sEMG image spatially composed from HD-sEMG signals. A variant of the
CT-HGR
is also designed to incorporate microscopic neural drive information in the form of Motor Unit Spike Trains (MUSTs) extracted from HD-sEMG signals using Blind Source Separation (BSS). This variant is combined with its baseline version via a hybrid architecture to evaluate potentials of fusing macroscopic and microscopic neural drive information. The utilized HD-sEMG dataset involves 128 electrodes that collect the signals related to 65 isometric hand gestures of 20 subjects. The proposed
CT-HGR
framework is applied to 31.25, 62.5, 125, 250 ms window sizes of the above-mentioned dataset utilizing 32, 64, 128 electrode channels. Our results are obtained via 5-fold cross-validation by first applying the proposed framework on the dataset of each subject separately and then, averaging the accuracies among all the subjects. The average accuracy over all the participants using 32 electrodes and a window size of 31.25 ms is 86.23%, which gradually increases till reaching 91.98% for 128 electrodes and a window size of 250 ms. The
CT-HGR
achieves accuracy of 89.13% for instantaneous recognition base |
---|---|
ISSN: | 2045-2322 2045-2322 |
DOI: | 10.1038/s41598-023-36490-w |