Loading…

Multi-stream CNN for facial expression recognition in limited training data

Limited data is a challenging problem to train Convolutional Neural Networks. On the other hand, acquiring a database in a demanded scale is not a straightforward task. In this paper, handcrafted features along with a multi-stream structure are proposed as a solution to improve performance of limite...

Full description

Saved in:

Bibliographic Details
Published in:	Multimedia tools and applications 2019-08, Vol.78 (16), p.22861-22882
Main Authors:	Aghamaleki, Javad Abbasi, Ashkani Chenarlogh, Vahid
Format:	Article
Language:	English
Subjects:	Artificial neural networks Binary codes Computer Communication Networks Computer Science Data Structures and Information Theory Edge detection Face recognition Feature extraction Image detection Multimedia Information Systems Performance enhancement Special Purpose and Application-Based Systems Training
Citations:	Items that this one cites Items that cite this one
Online Access:	Get full text
Tags:	Add Tag No Tags, Be the first to tag this record!

Description
Summary:	Limited data is a challenging problem to train Convolutional Neural Networks. On the other hand, acquiring a database in a demanded scale is not a straightforward task. In this paper, handcrafted features along with a multi-stream structure are proposed as a solution to improve performance of limited data via CNN. Three handcrafted features using local binary pattern code extractor and Sobel edge detection operator in horizontal and vertical directions of images have been extracted to apply to the multi-stream CNN model. Our model is based on two distinct structures including three-stream and single-stream structures. The three-stream structure can be employed to improve the recognition rate in facial expression classifiers when the training data is limited. In three-stream structure, each of information channels will be added to distinct streams separately. Furthermore, the transfer learning technique employed and behaviour of VGG16 architecture trained with limited data have been studied to be compared with the proposed method. In addition, input data is expanded by means of rotation, cropping, and flipping. Next, three-stream and single-stream structures are examined while using limited and also expanded training data. We have evaluated the mentioned system in order to compare it with state of the arts for CK+ and MUG databases in both limited-data and expanded-data. The results indicate that by using limited-data, recognition accuracy will be improved through the mentioned strategy. (92.19 to 88.95 in CK+ database and 85.4 to 82.5 in MUG database). Additionally, the performance was improved in comparison with benchmark methods.
ISSN:	1380-7501 1573-7721
DOI:	10.1007/s11042-019-7530-7