Triple loss for hard face detection
Published in: | Neurocomputing (Amsterdam) 2020-07, Vol.398, p.20-30 |
---|---|
Main Authors: | , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | • Based on FPN, a training strategy is introduced for face detection that increases accuracy without additional computational cost. • A feature fusion module is designed to enhance the capability of feature extraction from the fused features. • Superior performance over a number of state-of-the-art methods is achieved on hard face detection while reaching a balance between accuracy and speed.
Although face detection has been studied extensively over the last decades, effective detection of small, blurred and partially occluded faces in the wild remains a challenging task. The trade-off between computational cost and accuracy is also an open research problem in this context. To tackle these challenges, this paper proposes a novel context-enhanced approach that combines structural optimization with loss function optimization. For loss function optimization, we introduce a hierarchical loss, referred to as "triple loss" in this paper, to optimize a face detector based on the feature pyramid network (FPN) (Lin et al., 2017). The additional layers are applied only during training, so the computational cost at inference is the same as that of FPN. For structural optimization, we propose a context-sensitive structure that increases the capacity of the prediction network and improves the accuracy of the output. In detail, a feature fusion module based on a three-branch inception subnet (Szegedy et al., 2015) is employed to refine the original FPN without significantly increasing the computational cost, further improving the low-level semantic information, which is originally extracted from a single convolutional layer in the backward pathway of FPN. The proposed approach is evaluated on two publicly available face detection benchmarks, FDDB and WIDER FACE. With a VGG-16 based detector, experimental results indicate that the proposed method achieves a good balance between accuracy and computational cost. |
---|---|
ISSN: | 0925-2312 1872-8286 |
DOI: | 10.1016/j.neucom.2020.02.060 |
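
The summary describes a hierarchical loss whose extra layers exist only during training. The sketch below illustrates that general idea in plain Python and is not the authors' implementation. The summary does not say where the three loss terms attach or how they are weighted, so the three prediction stages, the equal weights, the binary cross-entropy term, and all function names here are assumptions made for illustration.

```python
import math


def binary_cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross-entropy between face probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def hierarchical_loss(stage_probs, labels, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of one loss per prediction stage (assumed: three stages).

    stage_probs holds one list of probabilities per stage; the last entry
    is the stage that is kept at inference time.
    """
    if len(stage_probs) != len(weights):
        raise ValueError("need exactly one weight per stage")
    return sum(
        w * binary_cross_entropy(probs, labels)
        for w, probs in zip(weights, stage_probs)
    )


def predict(stage_probs, threshold=0.5):
    """Inference reads only the final stage; auxiliary stages are ignored."""
    return [int(p >= threshold) for p in stage_probs[-1]]


# Hypothetical scores for two candidate boxes (first is a face, second is not).
labels = [1, 0]
stage_probs = [[0.6, 0.4], [0.7, 0.3], [0.9, 0.1]]
loss = hierarchical_loss(stage_probs, labels)
decisions = predict(stage_probs)
```

Because `predict` never touches the auxiliary stages, dropping them after training leaves the inference cost unchanged, which is the property the summary attributes to the method.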