Multimodal Fusion Induced Attention Network for Industrial VOCs Detection
Published in: IEEE Transactions on Artificial Intelligence, 2024-12, Vol. 5 (12), pp. 6385–6398
Main Authors:
Format: Article
Language: English
Summary: Industrial volatile organic compound (VOCs) emissions and leakage have caused serious environmental and public-safety problems. Traditional VOCs monitoring systems require professionals to carry gas sensors into the emission area to collect VOCs, which may cause secondary hazards. VOCs infrared (IR) imaging visual inspection is a convenient, low-cost alternative. However, current visual detection methods based on VOCs IR imaging are limited by blurred imaging and indeterminate gas shapes. Moreover, most existing works rely on the IR modality alone for VOCs emission detection, neglecting the semantic expression of VOCs. To this end, we propose a dual-stream fusion detection framework that handles both visible and IR features of VOCs. Additionally, a multimodal fusion induced attention (MFIA) module is designed to realize feature fusion across modalities. Specifically, MFIA uses a spatial attention fusion module (SAFM) to mine associations among modalities in terms of spatial location and generates fused features by spatial location weighting. Then, a modality adapter (MA) and an induced attention module (IAM) are proposed to weight latent VOCs regions in the IR features, alleviating the noise interference and degradation of VOCs characterization caused by fusion. Finally, comprehensive experiments are carried out on a challenging VOCs dataset: the mAP@0.5 and F1-score of the proposed model are 0.527 and 0.601, outperforming state-of-the-art methods by 3.3% and 3.4%, respectively.
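To illustrate the idea of spatial-location weighting across modalities described in the abstract, the sketch below shows a toy, parameter-free version of spatial attention fusion in NumPy. It is a hypothetical simplification, not the paper's SAFM: the real module is learned (e.g. with convolutional layers), whereas here a shared spatial map is derived from simple channel-wise pooling and used as a convex weight between the visible and IR feature maps.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def spatial_attention_fuse(vis_feat, ir_feat):
    """Toy spatial-attention fusion of two (C, H, W) feature maps.

    A shared per-location attention map in (0, 1) is computed from
    channel-pooled descriptors of both modalities, then used to blend
    the visible and IR features at each spatial position.
    """
    stacked = np.concatenate([vis_feat, ir_feat], axis=0)     # (2C, H, W)
    avg_pool = stacked.mean(axis=0, keepdims=True)            # (1, H, W)
    max_pool = stacked.max(axis=0, keepdims=True)             # (1, H, W)
    # Stand-in for a learned conv over the pooled descriptors:
    attn = sigmoid(avg_pool + max_pool)                       # (1, H, W)
    # Convex combination per spatial location.
    return attn * vis_feat + (1.0 - attn) * ir_feat


rng = np.random.default_rng(0)
vis = rng.standard_normal((8, 4, 4))
ir = rng.standard_normal((8, 4, 4))
fused = spatial_attention_fuse(vis, ir)
print(fused.shape)  # (8, 4, 4)
```

Because the attention map lies in (0, 1), each fused value stays between the corresponding visible and IR values; the learned SAFM in the paper would instead adapt these weights to highlight latent VOCs regions.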
ISSN: 2691-4581
DOI: 10.1109/TAI.2024.3436037