
MI3C: Mining intra- and inter-image context for person search

Bibliographic Details
Published in: Pattern Recognition 2024-04, Vol. 148, Article 110169
Main Authors: Tang, Zongheng; Gao, Yulu; Hui, Tianrui; Peng, Fengguang; Liu, Si
Format: Article
Language: English
Description
Summary: Person search aims to localize the queried person within a gallery of uncropped, realistic images. Unlike re-identification (Re-ID), person search deals with the entire scene image, which contains rich and diverse visual context. However, existing works mainly focus on the person's appearance while ignoring other essential intra- and inter-image context information. To comprehensively leverage the intra- and inter-image context, we propose a unified framework termed MI3C, comprising the Intra-image Multi-View Context network (IMVC) and the Inter-image Group Context Ranking algorithm (IGCR). Concretely, IMVC collaboratively integrates features from the scene, surrounding, instance, and part views to generate the final ID feature for person search. Furthermore, the IGCR algorithm employs group matching results between query and gallery image pairs to measure the holistic image matching similarity, which is adopted as part of the sorting metric to yield a more robust ranking over the whole gallery. Extensive experiments on two popular person search benchmarks demonstrate that, by mining intra- and inter-image context, our method outperforms previous state-of-the-art methods by conspicuous margins. Specifically, we achieve 96.7% mAP and 97.1% top-1 accuracy on the CUHK-SYSU dataset, and 55.6% mAP and 90.8% top-1 accuracy on the PRW dataset.

•We propose a unified framework termed MI3C, which tackles person search from a more comprehensive perspective by mining both intra- and inter-image context.
•We propose an Intra-image Multi-View Context (IMVC) network in MI3C, which contains scene, surrounding, instance, and part branches to sufficiently extract intra-image context from multiple views and collaboratively integrate them for finer query-gallery matching.
•We also propose an Inter-image Group Context Ranking (IGCR) algorithm in MI3C, which exploits group matching similarities as inter-image context to measure the holistic image matching similarity, yielding a more robust ranking over the whole gallery.
•Extensive experiments on two popular person search benchmarks show that our method outperforms previous state-of-the-art methods by conspicuous margins. Specifically, for the mAP and top-1 accuracy metrics, we achieve 96.7%/97.1% on the CUHK-SYSU dataset and 55.6%/90.8% on the PRW dataset.
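As a rough illustration of the two ideas in the summary (multi-view feature integration and group-context re-ranking), the following is a minimal Python sketch. All function names, feature shapes, fusion weights, and the blending formula are hypothetical assumptions made for illustration only; the paper's IMVC network learns the integration end-to-end, and IGCR defines its own group matching procedure.

import numpy as np

def fuse_multi_view(scene_feat, surround_feat, instance_feat, part_feat,
                    weights=(0.1, 0.2, 0.5, 0.2)):
    # Hypothetical stand-in for IMVC: combine the four intra-image views
    # (scene, surrounding, instance, part) into one ID embedding.
    # A fixed weighted sum is used here only as a placeholder for the
    # learned, collaborative integration described in the paper.
    views = [scene_feat, surround_feat, instance_feat, part_feat]
    fused = sum(w * v for w, v in zip(weights, views))
    return fused / np.linalg.norm(fused)  # L2-normalize the ID feature

def rank_gallery(query_feat, gallery_feats, group_sims, alpha=0.7):
    # IGCR-style sketch: blend per-person appearance similarity with a
    # holistic image (group) matching similarity before sorting.
    # group_sims[i] is assumed to be a precomputed query-to-gallery-image
    # matching score (e.g. from matching groups of co-occurring persons);
    # alpha is a hypothetical blending weight.
    appearance_sims = gallery_feats @ query_feat        # cosine similarity for L2-normalized features
    final_scores = alpha * appearance_sims + (1 - alpha) * group_sims
    return np.argsort(-final_scores)                    # gallery indices, best match first

In this sketch, a higher-ranked gallery image is one whose detected person looks like the query and whose overall scene context (co-occurring people) also matches the query image, which is the intuition behind using group matching as part of the sorting metric.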
ISSN: 0031-3203, 1873-5142
DOI: 10.1016/j.patcog.2023.110169