Hilbert–Huang–Hurst-based non-linear acoustic feature vector for emotion classification with stochastic models and learning systems

Bibliographic Details
Published in:	IET signal processing 2020-10, Vol.14 (8), p.522-532
Main Authors:	Vieira, Vinícius; Coelho, Rosângela; de Assis, Francisco Marcos
Format:	Article
Language:	English
Subjects:	acoustic feature vectors; acoustic signal processing; affective computing; affective vocal expression classification systems; α-GMM; α-integrated Gaussian mixture model; eGeMAPS feature set; emotion recognition; emotion representation; empirical mode decomposition; English language; Gaussian mixture model; Gaussian processes; GeMAPS feature set; German language; HHHC; Hilbert transforms; Hilbert–Huang–Hurst coefficient vector; Hilbert–Huang–Hurst-based nonlinear acoustic feature vector; index of nonstationarity; learning (artificial intelligence); learning systems; machine learning classifiers; mixture models; nonlinear vocal source feature; Research Article; signal classification; signal representation; speech emotion classification experiments; speech enhancement; speech production mechanism; speech recognition; stochastic classifiers; stochastic models; stochastic processes

Description
Summary:	This study presents a broad analysis of affective vocal expression classification systems. The Hilbert–Huang–Hurst coefficient (HHHC) vector is proposed as a non-linear vocal source feature to represent emotional states according to their effects on the speech production mechanism. Affective states are highlighted by the empirical mode decomposition-based method, which exploits the non-stationarity of the acoustic variations. Hurst coefficients are then estimated from the decomposition modes to form the feature vector. Additionally, a vector of the index of non-stationarity (INS) is introduced as dynamic information to the HHHC. The proposed feature vector is evaluated in speech emotion classification experiments with three databases in German and English. Three state-of-the-art acoustic feature vectors are adopted as a baseline. The α-integrated Gaussian mixture model (α-GMM) is also introduced for emotion representation and classification. Its performance is compared to competing stochastic and machine learning classifiers. Results demonstrate that the HHHC leads to significant classification improvement when compared to the baseline acoustic feature vectors. Moreover, results also show that the α-GMM outperforms the competing classification methods. Finally, the complementarity aspects of HHHC and INS are also evaluated for the GeMAPS and eGeMAPS feature sets.
ISSN:	1751-9675, 1751-9683
DOI:	10.1049/iet-spr.2019.0383
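
Illustration:	The Summary describes estimating one Hurst coefficient per decomposition mode and stacking the results into a feature vector. The sketch below illustrates only that stacking step, under stated assumptions: the record does not say which Hurst estimator the authors use, so classical rescaled-range (R/S) analysis is swapped in here, and the empirical mode decomposition is assumed to have been done elsewhere. The names `hurst_rs` and `hurst_vector` are hypothetical, and the two synthetic series merely stand in for decomposition modes.

```python
import numpy as np


def hurst_rs(x, min_window=8):
    """Estimate a Hurst exponent with rescaled-range (R/S) analysis.

    Assumption: R/S is used for illustration only; the catalog record
    does not state which estimator the paper relies on.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    log_w, log_rs = [], []
    w = min_window
    while w <= n // 2:  # window sizes: min_window, 2*min_window, ...
        k = n // w
        blocks = x[: k * w].reshape(k, w)
        dev = blocks - blocks.mean(axis=1, keepdims=True)
        cum = np.cumsum(dev, axis=1)              # cumulative deviations
        r = cum.max(axis=1) - cum.min(axis=1)     # range per block
        s = blocks.std(axis=1)                    # scale per block
        valid = s > 0
        if np.any(valid):
            log_w.append(np.log(w))
            log_rs.append(np.log(np.mean(r[valid] / s[valid])))
        w *= 2
    if len(log_w) < 2:
        raise ValueError("series too short for R/S analysis")
    # The slope of log(R/S) against log(window size) is the estimate.
    slope, _ = np.polyfit(log_w, log_rs, 1)
    return float(slope)


def hurst_vector(modes):
    """Stack one Hurst estimate per mode into a feature vector."""
    return np.array([hurst_rs(m) for m in modes])


# Synthetic stand-ins for decomposition modes (not real speech data).
rng = np.random.default_rng(0)
white = rng.standard_normal(4096)   # uncorrelated noise
walk = np.cumsum(white)             # strongly persistent series
features = hurst_vector([white, walk])
print(features)
```

In this toy setup the persistent series should receive the larger estimate of the two. With real speech, each frame's modes would come from an empirical mode decomposition, which is omitted here.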