Speech emotion classification using combined neurogram and INTERSPEECH 2010 paralinguistic challenge features

Bibliographic Details
Published in: IET Signal Processing, 2017-07, Vol. 11 (5), pp. 587-595
Main Authors: Jassim, Wissam A, Paramesran, Raveendran, Harte, Naomi
Format: Article
Language:English
Description
Summary: Recently, increasing attention has been directed to studying and identifying the emotional content of a spoken utterance. This study introduces a method to improve emotion classification performance in clean and noisy environments by combining two types of features: the proposed neural-response-based features and the traditional INTERSPEECH 2010 paralinguistic emotion challenge features. The neural-response-based features are derived from the responses of a computational model of the auditory system for listeners with normal hearing. The model simulates the response of an auditory-nerve fibre with a given characteristic frequency to a speech signal, and the simulated responses are represented as a 2D neurogram (a time-frequency representation). The neurogram image is subdivided into non-overlapping blocks, and the average value of each block is computed. The neurogram features and the traditional emotion features are then combined to form the feature vector for each speech signal, and support vector machines are trained on these features to predict the emotion of the speech. The performance of the proposed method is evaluated on two well-known databases: the eNTERFACE and Berlin emotional speech datasets. The results show that the proposed method performs better than classification using the neurogram or INTERSPEECH features separately.
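The abstract outlines a feature-level combination: block averages of a 2D neurogram concatenated with the INTERSPEECH 2010 feature set, then fed to an SVM. A minimal sketch of that step is given below; the block size, function names, and trimming convention are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of the feature-combination step described in the abstract:
# a 2-D neurogram (frequency x time matrix) is divided into non-overlapping blocks,
# each block is reduced to its mean, and the resulting vector is concatenated with
# the INTERSPEECH 2010 paralinguistic features before SVM training.
import numpy as np
from sklearn.svm import SVC


def neurogram_block_features(neurogram, block_shape=(8, 8)):
    """Average a 2-D neurogram over non-overlapping blocks of size block_shape."""
    rows, cols = block_shape
    n_freq, n_time = neurogram.shape
    # Trim so the neurogram divides evenly into blocks (one possible convention).
    neurogram = neurogram[:n_freq - n_freq % rows, :n_time - n_time % cols]
    blocks = neurogram.reshape(neurogram.shape[0] // rows, rows,
                               neurogram.shape[1] // cols, cols)
    return blocks.mean(axis=(1, 3)).ravel()


def combined_feature_vector(neurogram, is2010_features):
    """Concatenate neurogram block averages with the INTERSPEECH 2010 feature vector."""
    return np.concatenate([neurogram_block_features(neurogram), is2010_features])


# Usage sketch (neurograms, is_feats, and labels are assumed to be available):
# X = np.vstack([combined_feature_vector(ng, f) for ng, f in zip(neurograms, is_feats)])
# clf = SVC(kernel="linear").fit(X, labels)
```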
ISSN: 1751-9675
1751-9683
DOI: 10.1049/iet-spr.2016.0336