
A robust and fast multispectral pedestrian detection deep network


Bibliographic Details
Published in:Knowledge-based systems 2021-09, Vol.227, p.106990, Article 106990
Main Authors: Ding, Lu; Wang, Yong; Laganière, Robert; Huang, Dan; Luo, Xinbin; Zhang, Huanlong
Format: Article
Language:English
Description
Summary:Multispectral pedestrian detection is a difficult task, especially when pedestrians appear at different sizes in the images. In most convolutional neural network (CNN) models, the receptive fields within each layer share the same size, which constrains the detection of pedestrians at multiple scales. In this paper, we propose a dynamic selection scheme that adaptively adjusts the receptive field size in multispectral pedestrian detection. Specifically, a network in network (NIN) is employed to combine visible and thermal information. Selective kernel networks (SKNets), which use selective kernel units with different kernel sizes, are employed. To effectively fuse the feature representations in each layer, a building block is designed in which different features are fused. A feature pyramid is employed to integrate feature information across layers. We empirically show that our method outperforms 8 existing state-of-the-art methods on one multispectral dataset and 4 state-of-the-art methods on another multispectral dataset. Detailed analyses show that our proposed method can detect pedestrians at different scales in multispectral images, which confirms the effectiveness of SKNets for adaptively adjusting the receptive field size. In addition, our method can operate at 14 frames per second (fps) on a GPU.
ISSN:0950-7051
1872-7409
DOI:10.1016/j.knosys.2021.106990
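
The summary above mentions selective kernel units that choose among different kernel sizes. The following is a minimal one-dimensional sketch of that general idea, not the authors' network: each branch filters the input with a different kernel size, the branch outputs are summed and pooled into a single descriptor, and a softmax over the branches decides how much each kernel size contributes. The function names and the `branch_weights` parameter are hypothetical; `branch_weights` stands in for the learned layers that a real selective kernel unit would use to score each branch.

```python
import math


def conv1d_same(signal, kernel):
    """Zero-padded 'same' 1-D correlation for an odd-length kernel."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def selective_kernel_1d(signal, kernels, branch_weights):
    """Toy selective kernel unit (illustrative assumption, 1-D, one channel).

    kernels        -- one odd-length kernel per branch (different sizes)
    branch_weights -- hypothetical scalar per branch, replacing the learned
                      scoring layers of a real unit
    """
    # Split: filter the input once per kernel size.
    branches = [conv1d_same(signal, k) for k in kernels]
    # Fuse: element-wise sum, then global average pooling to one descriptor.
    fused = [sum(values) for values in zip(*branches)]
    descriptor = sum(fused) / len(fused)
    # Select: softmax across branches gives the per-branch attention.
    attention = softmax([w * descriptor for w in branch_weights])
    output = [
        sum(a * branch[i] for a, branch in zip(attention, branches))
        for i in range(len(signal))
    ]
    return output, attention


# Example call: a 3-tap and a 5-tap averaging kernel on a unit impulse.
example_output, example_attention = selective_kernel_1d(
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [[1.0 / 3.0] * 3, [0.2] * 5],
    [1.0, -1.0],
)
```

Because the attention values always sum to one, the output is a convex combination of the branch outputs, so the effective receptive field shifts toward whichever kernel size receives the larger weight.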