Robust classification of face and head gestures in video

Bibliographic Details
Published in: Image and Vision Computing, 2011-06, Vol. 29 (7), p. 470-483
Main Authors: Akakın, Hatice Çınar; Sankur, Bülent
Format: Article
Language: English
Description
Summary: Automatic analysis of head gestures and facial expressions is a challenging research area with significant applications in human-computer interfaces. We develop a face and head gesture detector for video streams. The detector follows the face landmark paradigm in that it uses both the appearance and the configuration information of landmarks. First, we accurately detect and track facial landmarks using adaptive templates, a Kalman predictor and subspace regularization. Then the trajectories (time series) of facial landmark positions over the course of the head gesture or facial expression are converted into various discriminative features. Features can be landmark coordinate time series, facial geometric features or patches on expressive regions of the face. We compare two feature sequence classifiers, Hidden Markov Models (HMM) and Hidden Conditional Random Fields (HCRF), with feature subspace classifiers, namely Independent Component Analysis (ICA) and Non-negative Matrix Factorization (NMF), on the spatiotemporal data. We achieve 87.3% correct gesture classification on a seven-gesture test database, and the performance reaches 98.2% correct detection under a fusion scheme. Promising and competitive results are also achieved on the classification of naturally occurring gesture clips from the LIlir TwoTalk Corpus.
Highlights:
► Landmark tracking with Kalman filter, PCA regularizer and dynamic template library.
► The exploration of sparse transforms on the time-space landscape of face landmarks.
► Subspace-based and sequence-based comparative analysis of time series.
► Eventual fusion of subspace-based and sequence-based classifiers.
ISSN: 0262-8856
EISSN: 1872-8138
DOI: 10.1016/j.imavis.2011.03.001
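
The summary mentions a Kalman predictor for landmark tracking but does not give the motion model. As a rough illustration only, the sketch below assumes a constant-velocity model applied independently to each coordinate of a single landmark. The class `AxisKalman`, the function `predict_track`, and the noise values `q` and `r` are hypothetical choices made for this example, not the authors' implementation; the adaptive templates and subspace regularization named in the summary are not modeled.

```python
class AxisKalman:
    """Constant-velocity Kalman filter for one coordinate (assumed model)."""

    def __init__(self, pos, q=1e-2, r=1.0, dt=1.0):
        self.p, self.v = float(pos), 0.0       # position and velocity estimates
        self.P = [[1.0, 0.0], [0.0, 1.0]]      # state covariance (assumed prior)
        self.q, self.r, self.dt = q, r, dt     # process noise, measurement noise, frame step

    def predict(self):
        """Advance the state one frame and return the predicted position."""
        dt = self.dt
        (a, b), (c, d) = self.P
        self.p += dt * self.v
        # P <- F P F^T + q*I with F = [[1, dt], [0, 1]]
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.p

    def update(self, z):
        """Correct the state with a measured position z."""
        (a, b), (c, d) = self.P
        s = a + self.r                         # innovation variance
        k0, k1 = a / s, c / s                  # Kalman gain for position and velocity
        y = z - self.p                         # innovation
        self.p += k0 * y
        self.v += k1 * y
        # P <- (I - K H) P with H = [1, 0]
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]


def predict_track(points, q=1e-2, r=1.0):
    """Return the one-step-ahead predicted (x, y) for every frame after the first."""
    x0, y0 = points[0]
    fx, fy = AxisKalman(x0, q, r), AxisKalman(y0, q, r)
    predictions = []
    for x, y in points[1:]:
        predictions.append((fx.predict(), fy.predict()))
        fx.update(x)
        fy.update(y)
    return predictions
```

In a tracker of this kind, the predicted position would typically be used to center the search window for the landmark in the next frame, and the detected position would then be fed back through `update`.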