Feature Fusion and Center Aggregation for Visible-Infrared Person Re-Identification

Bibliographic Details
Published in:	IEEE Access, 2022, Vol. 10, p. 30949-30958
Main Authors:	Wang, Xianju; Chen, Cuiqun; Zhu, Yong; Chen, Shuguang
Format:	Article
Language:	English
Subjects:	Agglomeration; Computer networks; cross-modality; Datasets; Feature extraction; Feature maps; Generative adversarial networks; Infrared imaging; Learning; Measurement; modality discrepancy; Modules; Physics; Re-ID; Surveillance; Visible-infrared

Description
Summary:	The visible-infrared pedestrian re-identification (VI Re-ID) task aims to match cross-modality pedestrian images with the same identity labels. Most current methods focus on mitigating the modality discrepancy by adopting a two-stream network and identity supervision. Building on these methods, we propose a novel feature fusion and center aggregation learning network (F²CALNet) for cross-modality pedestrian re-identification. F²CALNet focuses on learning modality-irrelevant features by simultaneously reducing inter-modality discrepancies and increasing inter-identity variations in a single framework. Specifically, we first adopt a two-stream backbone network to extract modality-independent and modality-shared information. Then, we embed modality mitigation modules in the two-stream network to learn feature maps that are stripped of modality information. Finally, we devise a feature fusion and center aggregation learning module. This module first merges features of two different granularities to learn discriminative features. It then applies two kinds of center-based loss functions that reduce intra-identity differences, both across and within modalities, and increase inter-identity variations by simultaneously pulling the features of the same identity close to their centers and pushing the centers of different identities far apart. Extensive experiments on two public cross-modality datasets (SYSU-MM01 and RegDB) show that F²CALNet is superior to state-of-the-art approaches. Furthermore, on the SYSU-MM01 dataset, our model outperforms the baseline by 5.52% and 4.25% in rank-1 accuracy and mAP, respectively.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2022.3159805
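
Illustration:	The summary describes center-based losses that pull features of the same identity toward their center and push the centers of different identities apart. The record does not give the formulas, so the following is only a minimal sketch of that general idea, not the loss used in the article. The function names, the Euclidean distance, the hinge form of the push term, and the `margin` parameter are all assumptions made for illustration. Features from both modalities are pooled when computing each identity center.

```python
import math
from collections import defaultdict


def identity_centers(features, labels):
    """Mean feature vector per identity, pooled over all samples (both modalities)."""
    groups = defaultdict(list)
    for feat, label in zip(features, labels):
        groups[label].append(feat)
    return {
        label: [sum(col) / len(feats) for col in zip(*feats)]
        for label, feats in groups.items()
    }


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def center_aggregation_loss(features, labels, margin=1.0):
    """Hypothetical pull/push loss; returns (total, pull, push).

    Assumed form, not taken from the article:
      pull = mean distance from each feature to its own identity center
      push = mean hinge penalty over pairs of centers closer than `margin`
    """
    centers = identity_centers(features, labels)

    # Pull term: draw every feature toward the center of its identity.
    pull = sum(
        euclidean(feat, centers[label]) for feat, label in zip(features, labels)
    ) / len(features)

    # Push term: penalize pairs of identity centers that are closer than the margin.
    ids = sorted(centers)
    pairs = [(a, b) for i, a in enumerate(ids) for b in ids[i + 1:]]
    push = sum(
        max(0.0, margin - euclidean(centers[a], centers[b])) for a, b in pairs
    )
    push = push / len(pairs) if pairs else 0.0

    return pull + push, pull, push


# Toy batch: two identities, each with one "visible" and one "infrared" feature.
toy_features = [[0.0, 0.0], [2.0, 0.0], [10.0, 0.0], [12.0, 0.0]]
toy_labels = [0, 0, 1, 1]
total, pull, push = center_aggregation_loss(toy_features, toy_labels, margin=1.0)
```

In this toy batch the two identity centers are 10 units apart, so with `margin=1.0` the push term is zero and only the pull term contributes. A larger margin would activate the push term.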