Toward Dual-View X-Ray Baggage Inspection: A Large-Scale Benchmark and Adaptive Hierarchical Cross Refinement for Prohibited Item Discovery

Bibliographic Details
Published in: IEEE transactions on information forensics and security, 2024, Vol. 19, p. 3866-3878
Main Authors: Ma, Bowen, Jia, Tong, Li, Mingyuan, Wu, Songsheng, Wang, Hao, Chen, Dongyue
Format: Article
Language: English
Description
Summary: Dual-view baggage inspection has been widely applied in real-world scenarios, where orthogonal viewpoints are deployed to capture diverse and complementary information. Compared with single-view, it can effectively improve the identification performance when rotation and overlay hinder the viewability of the objects. However, this topic has not been rigorously explored due to the scarcity of datasets. To overcome this limitation, we contribute the first fully public large-scale Dual-view X-ray dataset. Our dataset, named DvXray, contains 16,000 pairs, 32,000 X-ray images, in which 15 common classes of 5,496 prohibited items are manually labeled. Besides, we propose an approach named Adaptive Hierarchical Cross Refinement (AHCR) to establish a strong baseline for prohibited item discovery in dual-view X-ray images. AHCR hypothesizes that each input pair is sampled from one mixture distribution, hence gathering the non-overlapping and position-aware cues along the shared axis and complementarily delivering to the other in a hierarchical structure to enrich the feature discriminability of the objects of interest from background overlaps. Upon this structure, we propose an adaptive control strategy and a confidence-weighted view fusion term to make it robust to difficult samples. Extensive experiments on DvXray show that AHCR not only brings significant classification gains over various backbones, such as recent Swin Transformer and ConvNeXt, but also exhibits an impressively better ability to localize objects. In addition, AHCR performs favorably against the counterparts and some recent multi-view learning approaches, moving a step closer towards potential application in practice. Dataset and code are available at https://github.com/Mbwslib/DvXray.
ISSN: 1556-6013, 1556-6021
DOI: 10.1109/TIFS.2024.3372797