
Feature Fusion and Center Aggregation for Visible-Infrared Person Re-Identification

Bibliographic Details
Published in: IEEE Access, 2022, Vol. 10, pp. 30949-30958
Main Authors: Wang, Xianju; Chen, Cuiqun; Zhu, Yong; Chen, Shuguang
Format: Article
Language: English
Description
Summary: The visible-infrared pedestrian re-identification (VI Re-ID) task aims to match cross-modality pedestrian images with the same labels. Most current methods focus on mitigating the modality discrepancy by adopting a two-stream network and identity supervision. Building on these methods, we propose a novel feature fusion and center aggregation learning network (F²CALNet) for cross-modality pedestrian re-identification. F²CALNet focuses on learning modality-irrelevant features by simultaneously reducing inter-modality discrepancies and increasing inter-identity variations in a single framework. Specifically, we first adopt a two-stream backbone network to extract modality-independent and modality-shared information. Then, we embed modality mitigation modules in the two-stream network to learn feature maps that are stripped of modality information. Finally, we devise a feature fusion and center aggregation learning module. This module first merges features of two different granularities to learn distinguishing features. It then applies two kinds of center-based loss functions to reduce the inter- and intra-modality differences within each identity and to increase inter-identity variations, by simultaneously pulling the features of the same identity close to their centers and pushing the centers of different identities far apart. Extensive experiments on two public cross-modality datasets (SYSU-MM01 and RegDB) show that F²CALNet is superior to state-of-the-art approaches. Furthermore, on the SYSU-MM01 dataset, our model outperforms the baseline by 5.52% and 4.25% in rank-1 accuracy and mAP, respectively.
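
Illustrative note: the center-based idea in the summary (pull features of one identity toward their center, push the centers of different identities apart) can be sketched roughly as below. This is a minimal toy example and not the loss defined in the paper; the function name `center_losses`, the squared-distance pull term, the hinge-style push term, and the `margin` parameter are all assumptions made for illustration.

```python
import numpy as np

def center_losses(features, labels, margin=0.5):
    """Toy center-based objective (hypothetical, not the paper's formulation).

    Returns (pull, push):
      pull: mean squared distance from each feature to its identity center
      push: mean hinge penalty on pairs of centers closer than `margin`
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    ids = np.unique(labels)  # sorted unique identity labels

    # One center per identity, averaged over all of its samples
    # (visible and infrared features would be pooled together here).
    centers = np.stack([features[labels == i].mean(axis=0) for i in ids])

    # Pull term: each feature is compared with the center of its own identity.
    idx = np.searchsorted(ids, labels)
    pull = np.mean(np.sum((features - centers[idx]) ** 2, axis=1))

    # Push term: penalize pairs of distinct centers closer than the margin.
    if len(ids) > 1:
        diff = centers[:, None, :] - centers[None, :, :]
        dist = np.sqrt(np.sum(diff ** 2, axis=2))
        upper = np.triu_indices(len(ids), k=1)  # each pair counted once
        push = np.mean(np.maximum(0.0, margin - dist[upper]))
    else:
        push = 0.0
    return pull, push
```

In a training setup the two terms would typically be weighted and added to an identity classification loss; the weights and the exact distance measure are choices this sketch does not take from the source.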
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2022.3159805