Sparse Kernel Reduced-Rank Regression for Bimodal Emotion Recognition From Facial Expression and Speech

Bibliographic Details
Published in:	IEEE Transactions on Multimedia, 2016-07, Vol. 18 (7), p. 1319-1329
Main Authors:	Yan, Jingjie; Zheng, Wenming; Xu, Qinyu; Lu, Guanming; Li, Haibo; Wang, Bei
Format:	Article
Language:	English
Subjects:	Bimodal emotion recognition; Emotion recognition; Emotions; Face recognition; Facial expression; Feature extraction; Feature fusion; Kernel; Nonlinearity; Optimization; Recognition; Regression; Sparse kernel reduced-rank regression (SKRRR); Sparse matrices; Speech; Speech recognition

Description
Summary:	A novel bimodal emotion recognition approach from facial expression and speech, based on the sparse kernel reduced-rank regression (SKRRR) fusion method, is proposed in this paper. In this method, we use the openSMILE feature extractor and the scale-invariant feature transform (SIFT) descriptor to extract effective features from the speech modality and the facial expression modality, respectively, and then propose the SKRRR fusion approach to fuse the emotion features of the two modalities. The proposed SKRRR method is a nonlinear extension of traditional reduced-rank regression (RRR), in which both the predictor and response feature vectors are kernelized by mapping them into two high-dimensional feature spaces via two nonlinear mappings, respectively. To solve the SKRRR problem, we propose a sparse representation (SR)-based approach to find the optimal coefficient matrices of SKRRR; the SR technique is introduced to account for the different contributions of the training samples to the derivation of the optimal solution. Finally, we use the eNTERFACE '05 and AFEW 4.0 bimodal emotion databases to conduct monomodal and bimodal emotion recognition experiments, and the results indicate that the proposed approach achieves the highest or a comparable bimodal emotion recognition rate compared with some state-of-the-art approaches.
ISSN:	1520-9210, 1941-0077
DOI:	10.1109/TMM.2016.2557721
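
The summary describes SKRRR as a nonlinear extension of traditional reduced-rank regression (RRR). As a rough illustration of that linear starting point only, the sketch below fits classical RRR with NumPy. It is not the paper's kernelized, sparse method: the function name `reduced_rank_regression`, the synthetic data, and the chosen rank are all assumptions made for this example, and no openSMILE or SIFT features are involved.

```python
import numpy as np


def reduced_rank_regression(X, Y, rank):
    """Minimise ||Y - X @ B||_F^2 subject to rank(B) <= rank.

    Classical linear RRR (identity weighting); hypothetical helper,
    not the SKRRR method described in the summary.
    """
    # Unconstrained least-squares coefficients.
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # Right singular vectors of the fitted responses give the
    # dominant directions in response space.
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    V_r = Vt[:rank].T
    # Project the OLS coefficients onto the top-`rank` directions.
    return B_ols @ V_r @ V_r.T


# Synthetic stand-in data (assumption): 200 samples, 10 predictors,
# 6 responses, with a true coefficient matrix of rank 2.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
B_true = rng.standard_normal((10, 2)) @ rng.standard_normal((2, 6))
Y = X @ B_true + 0.01 * rng.standard_normal((200, 6))

B_hat = reduced_rank_regression(X, Y, rank=2)
```

In a fusion setting, `X` and `Y` would hold the feature vectors of the two modalities; per the summary, SKRRR additionally maps both into kernel feature spaces and solves for the coefficient matrices with a sparse representation-based approach, neither of which this sketch attempts.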