Transformer-based network with temporal depthwise convolutions for sEMG recognition

Bibliographic Details
Published in: Pattern Recognition, 2024-01, Vol. 145, p. 109967, Article 109967
Main Authors: Wang, Zefeng, Yao, Junfeng, Xu, Meiyan, Jiang, Min, Su, Jinsong
Format: Article
Language:English
Description
Summary: Considerable progress has been made in pattern recognition of surface electromyography (sEMG) with deep learning, bringing improvements to sEMG-based gesture classification. Current deep learning techniques are mainly based on convolutional neural networks (CNNs), recurrent neural networks (RNNs), and their hybrids. However, CNNs focus on spatial and local information, while RNNs cannot be parallelized and suffer from vanishing/exploding gradients; their hybrids often incur high model complexity and computational cost. Because sEMG signals are sequential in nature, and motivated by the Transformer sequence model and its self-attention mechanism, we propose a Transformer-based network, the temporal depthwise convolutional Transformer (TDCT), for sparse sEMG recognition. This network achieves higher recognition accuracy with fewer convolution parameters and lower computational cost. Specifically, it can be parallelized and captures long-range features within sEMG signals. We improve the locality and channel-correlation capture of multi-head self-attention (MSA) for sEMG modeling by replacing its linear transformation with the proposed temporal depthwise convolution (TDC), which reduces the convolution parameters and computation required for feature learning. Four sEMG datasets, Ninapro DB1, DB2, DB5, and OYDB, are used for evaluation and comparison. In the results, our model outperforms other methods, including Transformer-based networks, in most windows when recognizing raw sparse sEMG signals, achieving state-of-the-art classification accuracy.
•Global features of sEMG signals are captured better by our proposed network.
•We improve local feature extraction with temporal convolutions in each sEMG channel.
•We adopt depthwise separable convolution to improve channel-correlation capture.
•We reduce parameters and computation through our proposed attention mechanism.
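The parameter saving claimed for the TDC replacement can be illustrated with a back-of-the-envelope count. The sketch below compares a dense C x C projection (as used for the linear maps inside standard MSA) with a depthwise temporal convolution that learns one length-k filter per channel; the channel count and kernel size are illustrative assumptions, not values from the paper.

```python
def linear_projection_params(channels: int) -> int:
    # A dense C x C linear map applied at every time step,
    # as in the standard Q/K/V projections of self-attention.
    return channels * channels

def depthwise_temporal_conv_params(channels: int, kernel: int) -> int:
    # One length-`kernel` filter per channel; no cross-channel mixing,
    # so parameters grow linearly in the channel count.
    return channels * kernel

C, K = 64, 7  # assumed: 64 feature channels, temporal kernel of size 7
print(linear_projection_params(C))         # 4096
print(depthwise_temporal_conv_params(C, K))  # 448
```

Under these assumptions the depthwise variant needs roughly C/k times fewer weights than the dense projection, which is the general mechanism behind the paper's reported reduction in convolution parameters.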
ISSN:0031-3203
1873-5142
DOI:10.1016/j.patcog.2023.109967