Triple loss for hard face detection

Bibliographic Details
Published in:	Neurocomputing (Amsterdam) 2020-07, Vol.398, p.20-30
Main Authors:	Fang, Zhenyu; Ren, Jinchang; Marshall, Stephen; Zhao, Huimin; Wang, Zheng; Huang, Kaizhu; Xiao, Bing
Format:	Article
Language:	English
Subjects:	Efficiency-accuracy balance; Face detection; Face feature fusion; Single shot detection; Small face

Description
Summary:	• Based on FPN, a training strategy is introduced for face detection that increases accuracy without adding computational cost.
• A feature fusion module is designed to enhance the capability of feature extraction from the fused features.
• Superior performance is achieved over a number of state-of-the-art methods on hard face detection, while balancing accuracy and speed.
Although face detection has been well addressed over the last decades, effective detection of small, blurred and partially occluded faces in the wild remains a challenging task. Meanwhile, the trade-off between computational cost and accuracy is also an open research problem in this context. To tackle these challenges, this paper proposes a novel context-enhanced approach that combines structural optimization with loss function optimization. For loss function optimization, we introduce a hierarchical loss, referred to as "triple loss" in this paper, to optimize a face detector based on the feature pyramid network (FPN) (Lin et al., 2017). The additional layers are applied only during training, so the computational cost at inference is the same as that of FPN. For structural optimization, we propose a context-sensitive structure that increases the capacity of the prediction network and thereby improves the accuracy of the output. In detail, a feature fusion module based on a three-branch inception subnet (Szegedy et al., 2015) is employed to refine the original FPN without significantly increasing the computational cost. This further improves the low-level semantic information, which is originally extracted from a single convolutional layer in the backward pathway of FPN. The proposed approach is evaluated on two publicly available face detection benchmarks, FDDB and WIDER FACE. Using a VGG-16 based detector, experimental results indicate that the proposed method achieves a good balance between accuracy and computational cost in face detection.
ISSN:	0925-2312; 1872-8286
DOI:	10.1016/j.neucom.2020.02.060