Sentiment analysis in non-fixed length audios using a Fully Convolutional Neural Network
Published in: | Biomedical signal processing and control 2021-08, Vol.69, p.102946, Article 102946 |
---|---|
Main Authors: | , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | • A new method based on a fully convolutional neural network is proposed for speech emotion recognition. • Our proposal can process inputs of variable length, enabling near-real-time sentiment analysis. • Mel-frequency cepstral coefficients make it easier to identify emotions in audio signals. • Fully convolutional neural networks performed better than other machine learning models on the EmoDB, RAVDESS and TESS data sets.
This work proposes a sentiment analysis method that accepts audio of any length, with no length fixed a priori. Mel spectrograms and Mel-frequency cepstral coefficients are used as audio descriptors, and a fully convolutional neural network architecture is proposed as the classifier. The results were validated on three well-known data sets: EmoDB, RAVDESS and TESS. They were promising, outperforming state-of-the-art methods. Because the proposed method accepts audio of any length, it also allows sentiment analysis in near real time, which is of interest in a wide range of fields such as call centers, medical consultations or financial brokers. |
---|---|
ISSN: | 1746-8094, 1746-8108 |
DOI: | 10.1016/j.bspc.2021.102946 |
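
The summary's central claim, that a fully convolutional network can classify audio of any length, can be illustrated with a minimal sketch. This is not the authors' architecture: the layer sizes, the 13 features per frame (standing in for MFCC frames), the 7 output classes, and the helper names `conv1d`, `global_average_pool`, `init_model` and `predict` are all assumptions chosen for illustration, and the weights are random rather than trained. The point is only that convolution over time followed by global pooling yields a fixed-size output for any number of input frames (at least `kernel_size`).

```python
import math
import random


def conv1d(frames, kernels, bias, relu=True):
    """Valid 1-D convolution over the time axis.

    frames:  T frames, each a list of C_in features.
    kernels: C_out kernels, each of shape K x C_in.
    Returns T - K + 1 frames with C_out features each.
    """
    k = len(kernels[0])
    out = []
    for t in range(len(frames) - k + 1):
        window = frames[t:t + k]
        row = []
        for kern, b in zip(kernels, bias):
            s = b + sum(
                w * x
                for w_row, x_row in zip(kern, window)
                for w, x in zip(w_row, x_row)
            )
            row.append(max(0.0, s) if relu else s)
        out.append(row)
    return out


def global_average_pool(frames):
    """Average each channel over time: any T collapses to one vector."""
    t = len(frames)
    return [sum(channel) / t for channel in zip(*frames)]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_model(n_features, n_hidden, n_classes, kernel_size, seed=0):
    """Random, untrained weights (illustrative sizes only)."""
    rng = random.Random(seed)

    def w():
        return rng.uniform(-0.1, 0.1)

    return {
        "conv": [[[w() for _ in range(n_features)] for _ in range(kernel_size)]
                 for _ in range(n_hidden)],
        "conv_bias": [0.0] * n_hidden,
        # The classification head is a kernel-size-1 convolution,
        # so the model contains no fixed-size dense layer.
        "head": [[[w() for _ in range(n_hidden)]] for _ in range(n_classes)],
        "head_bias": [0.0] * n_classes,
    }


def predict(model, frames):
    """Class probabilities for a clip with any number of frames >= kernel size."""
    hidden = conv1d(frames, model["conv"], model["conv_bias"])
    per_frame_logits = conv1d(hidden, model["head"], model["head_bias"], relu=False)
    return softmax(global_average_pool(per_frame_logits))


model = init_model(n_features=13, n_hidden=8, n_classes=7, kernel_size=5)
rng = random.Random(1)
for n_frames in (40, 125, 300):
    # Random numbers stand in for per-frame audio features.
    clip = [[rng.gauss(0, 1) for _ in range(13)] for _ in range(n_frames)]
    probs = predict(model, clip)
    print(n_frames, "frames ->", len(probs), "class probabilities")
```

Because no layer depends on the clip length, the same weights apply to a short utterance or a long recording, which is what makes windowed, near-real-time use possible in principle.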