Hierarchical convolutional neural networks with post-attention for speech emotion recognition
Published in: Neurocomputing (Amsterdam), 2025-01, Vol. 615, Article 128879, p. 128879
Main Authors: , ,
Format: Article
Language: English
Summary: Speech emotion recognition (SER) is a key prerequisite for natural human–computer interaction. However, existing SER systems still face great challenges, particularly in extracting discriminative, high-quality emotional features. To address this challenge, this study proposes hc-former, a hierarchical convolutional neural network (CNN) with post-attention. Unlike traditional CNNs and recurrent neural networks (RNNs), the model extracts strongly class-discriminative features that integrate spatiotemporal information and long-term dependencies. The features extracted by hc-former, which emphasize both interclass separation and intraclass compactness, more effectively represent emotion classes that are often confused with one another, leading to superior classification results. Experimental results further show that hc-former achieves excellent SER performance on benchmark datasets, surpassing peer models while using fewer parameters.
Highlights:
• hc-former consists of hCNN and PA.
• hCNN efficiently extracts spatiotemporal information.
• PA captures vital long-term dependencies.
• hc-former outperforms peer models with fewer parameters on the SER task.
• The ER-F loss helps mitigate sample imbalance and improve SER accuracy.
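The record itself gives no architectural details, but as a rough illustration of the kind of pipeline the highlights describe (a hierarchical CNN front end followed by an attention stage over the resulting feature sequence), a minimal PyTorch sketch might look like the following. The class name, layer sizes, spectrogram input shape, and the use of nn.MultiheadAttention are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch only: a hierarchical CNN front end followed by a
# "post-attention" stage over the CNN feature sequence. All sizes and the
# attention module choice are assumptions, not the paper's architecture.
import torch
import torch.nn as nn


class HierarchicalCNNWithPostAttention(nn.Module):
    def __init__(self, n_classes: int = 4, d_model: int = 128):
        super().__init__()
        # Hierarchical CNN: stacked conv blocks that progressively widen the
        # receptive field over the time-frequency input (spatiotemporal cues).
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(64, d_model, kernel_size=3, padding=1), nn.BatchNorm2d(d_model), nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, None)),  # collapse the frequency axis, keep time
        )
        # Post-attention: self-attention over the feature sequence to capture
        # long-term dependencies across time frames.
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(d_model, n_classes)

    def forward(self, spec: torch.Tensor) -> torch.Tensor:
        # spec: (batch, 1, n_mels, time) log-mel spectrogram
        feats = self.cnn(spec)                  # (batch, d_model, 1, time')
        seq = feats.squeeze(2).transpose(1, 2)  # (batch, time', d_model)
        attended, _ = self.attn(seq, seq, seq)  # (batch, time', d_model)
        utterance = attended.mean(dim=1)        # temporal average pooling
        return self.classifier(utterance)       # emotion logits


# Example: a batch of 8 utterances, 64 mel bands, 300 frames.
logits = HierarchicalCNNWithPostAttention()(torch.randn(8, 1, 64, 300))
print(logits.shape)  # torch.Size([8, 4])
```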
ISSN: 0925-2312
DOI: 10.1016/j.neucom.2024.128879