Beyond Fusion: Modality Hallucination-based Multispectral Fusion for Pedestrian Detection
Main Authors: | Xie, Qian; Cheng, Ta-Ying; Zhong, Jia-Xing; Zhou, Kaichen; Markham, Andrew; Trigoni, Niki |
---|---|
Format: | Conference Proceeding |
Language: | English |
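Subjects: | Algorithms; and algorithms; Boosting; Computer architecture; Computer vision; Feature extraction; formulations; Fuses; Image recognition and understanding; Machine learning architectures; Noise; Pedestrians |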
cited_by | |
---|---|
cites | |
container_end_page | 653 |
container_issue | |
container_start_page | 644 |
container_title | |
container_volume | |
creator | Xie, Qian; Cheng, Ta-Ying; Zhong, Jia-Xing; Zhou, Kaichen; Markham, Andrew; Trigoni, Niki |
description | Pedestrian detection is a fundamental task for many downstream applications. Visible and thermal images, as the two most important data types, are usually used to detect pedestrians under various environmental conditions. Many state-of-the-art works have been proposed to use two-stream (i.e., two-branch) architectures to combine visible and thermal information to improve detection performance. However, conventional visible-thermal fusion-based methods have no ability to obtain useful information from the visible branch under poor visibility conditions. The visible branch could even sometimes bring noise into the combined features. In this paper, we present a novel thermal and visible fusion architecture for pedestrian detection. Instead of simply using two branches to separately extract thermal and visible features and then fusing them, we introduce a hallucination branch to learn the mapping from the thermal to the visible domain, forming a novel three-branch feature extraction module. We then adaptively fuse feature maps from all three branches (i.e., thermal, visible, and hallucination). With this new integrated hallucination branch, our network can still get relatively good visible feature maps under challenging low-visibility conditions, thus boosting the overall detection performance. Finally, we experimentally demonstrate the superiority of the proposed architecture over conventional fusion methods. |
doi_str_mv | 10.1109/WACV57701.2024.00071 |
format | conference_proceeding |
fullrecord | Source record TN_cdi_ieee_primary_10483978 (source ID ieee_CHZPO; IEEE ID 10483978; source format XML). Type: conference_proceeding. Publisher: IEEE. Date: 2024-01-03. EISSN: 2642-9381. EISBN: 9798350318920. CODEN: IEEPAD. DOI: 10.1109/WACV57701.2024.00071. Pages: 644-653 (10 pages). Record in IEEE Xplore: https://ieeexplore.ieee.org/document/10483978. Collections: IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEL; IEEE Proceedings Order Plans (POP All) 1998-Present. |
fulltext | fulltext_linktorsrc |
identifier | EISSN: 2642-9381 |
ispartof | 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, p.644-653 |
issn | 2642-9381 |
language | eng |
recordid | cdi_ieee_primary_10483978 |
source | IEEE Xplore All Conference Series |
subjects | Algorithms; and algorithms; Boosting; Computer architecture; Computer vision; Feature extraction; formulations; Fuses; Image recognition and understanding; Machine learning architectures; Noise; Pedestrians |
title | Beyond Fusion: Modality Hallucination-based Multispectral Fusion for Pedestrian Detection |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T05%3A48%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Beyond%20Fusion:%20Modality%20Hallucination-based%20Multispectral%20Fusion%20for%20Pedestrian%20Detection&rft.btitle=2024%20IEEE/CVF%20Winter%20Conference%20on%20Applications%20of%20Computer%20Vision%20(WACV)&rft.au=Xie,%20Qian&rft.date=2024-01-03&rft.spage=644&rft.epage=653&rft.pages=644-653&rft.eissn=2642-9381&rft.coden=IEEPAD&rft_id=info:doi/10.1109/WACV57701.2024.00071&rft.eisbn=9798350318920&rft_dat=%3Cieee_CHZPO%3E10483978%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i119t-f3b9a5488574617ec87ec939fac27cfa90ceaff73b88bd621f9f7bb844ff7b003%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10483978&rfr_iscdi=true |
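
The description field above mentions adaptively fusing feature maps from three branches (thermal, visible, and hallucination) but gives no formula. The following is only a rough sketch of one way such a fusion could look, not the authors' method: it assumes a softmax-weighted sum, and the names `softmax`, `fuse_three_branches`, and the per-branch `scores` are hypothetical and introduced here for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_three_branches(thermal, visible, hallucinated, scores):
    """Return a weighted sum of three equal-length feature vectors.

    `scores` holds one hypothetical confidence score per branch, in the
    order (thermal, visible, hallucinated). This is an assumption: a real
    network would likely predict such scores from the features themselves.
    """
    branches = [thermal, visible, hallucinated]
    if len({len(b) for b in branches}) != 1:
        raise ValueError("all branches must have the same feature length")
    weights = softmax(scores)
    fused = [
        sum(w * b[i] for w, b in zip(weights, branches))
        for i in range(len(thermal))
    ]
    return fused, weights


# Illustrative low-visibility case: the visible branch gets a very low score,
# so the fused features come almost entirely from the other two branches.
fused, weights = fuse_three_branches([1.0], [100.0], [2.0], [0.0, -50.0, 0.0])
```

Under this assumed weighting, a noisy visible branch is suppressed rather than averaged in, which matches the motivation given in the description; the actual fusion module in the paper may differ.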