
Beyond Fusion: Modality Hallucination-based Multispectral Fusion for Pedestrian Detection

Pedestrian detection is a fundamental task for many downstream applications. Visible and thermal images, as the two most important data types, are usually used to detect pedestrians under various environmental conditions. Many state-of-the-art works have been proposed to use two-stream (i.e., two-br...


Bibliographic Details
Main Authors:	Xie, Qian; Cheng, Ta-Ying; Zhong, Jia-Xing; Zhou, Kaichen; Markham, Andrew; Trigoni, Niki
Format:	Conference Proceeding
Language:	English
Subjects:	Algorithms; and algorithms; Boosting; Computer architecture; Computer vision; Feature extraction; formulations; Fuses; Image recognition and understanding; Machine learning architectures; Noise; Pedestrians

cited_by
cites
container_end_page	653
container_issue
container_start_page	644
container_title
container_volume
creator	Xie, Qian; Cheng, Ta-Ying; Zhong, Jia-Xing; Zhou, Kaichen; Markham, Andrew; Trigoni, Niki
description	Pedestrian detection is a fundamental task for many downstream applications. Visible and thermal images, as the two most important data types, are usually used to detect pedestrians under various environmental conditions. Many state-of-the-art works have been proposed to use two-stream (i.e., two-branch) architectures to combine visible and thermal information to improve detection performance. However, conventional visible-thermal fusion-based methods have no ability to obtain useful information from the visible branch under poor visibility conditions. The visible branch could even sometimes bring noise into the combined features. In this paper, we present a novel thermal and visible fusion architecture for pedestrian detection. Instead of simply using two branches to separately extract thermal and visible features and then fusing them, we introduce a hallucination branch to learn the mapping from the thermal to the visible domain, forming a novel three-branch feature extraction module. We then adaptively fuse feature maps from all three branches (i.e., thermal, visible, and hallucination). With this new integrated hallucination branch, our network can still get relatively good visible feature maps under challenging low-visibility conditions, thus boosting the overall detection performance. Finally, we experimentally demonstrate the superiority of the proposed architecture over conventional fusion methods.
doi_str_mv	10.1109/WACV57701.2024.00071
format	conference_proceeding
coden	IEEPAD
date	2024-01-03
eisbn	9798350318920
ieee_id	10483978
linktohtml	https://ieeexplore.ieee.org/document/10483978
pub	IEEE
tpages	10
fulltext	fulltext_linktorsrc
identifier	EISSN: 2642-9381
ispartof	2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, p.644-653
issn	2642-9381
language	eng
recordid	cdi_ieee_primary_10483978
source	IEEE Xplore All Conference Series
subjects	Algorithms; and algorithms; Boosting; Computer architecture; Computer vision; Feature extraction; formulations; Fuses; Image recognition and understanding; Machine learning architectures; Noise; Pedestrians
title	Beyond Fusion: Modality Hallucination-based Multispectral Fusion for Pedestrian Detection
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T05%3A48%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Beyond%20Fusion:%20Modality%20Hallucination-based%20Multispectral%20Fusion%20for%20Pedestrian%20Detection&rft.btitle=2024%20IEEE/CVF%20Winter%20Conference%20on%20Applications%20of%20Computer%20Vision%20(WACV)&rft.au=Xie,%20Qian&rft.date=2024-01-03&rft.spage=644&rft.epage=653&rft.pages=644-653&rft.eissn=2642-9381&rft.coden=IEEPAD&rft_id=info:doi/10.1109/WACV57701.2024.00071&rft.eisbn=9798350318920&rft_dat=%3Cieee_CHZPO%3E10483978%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i119t-f3b9a5488574617ec87ec939fac27cfa90ceaff73b88bd621f9f7bb844ff7b003%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10483978&rfr_iscdi=true
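
The description above mentions adaptively fusing feature maps from three branches (thermal, visible, and hallucination). The sketch below is one way such a fusion step might look, not the authors' implementation: the record does not say how the fusion weights are computed, so the `branch_score` heuristic, the softmax weighting, the single-channel nested-list maps, and every function name here are assumptions made purely for illustration.

```python
import math
from typing import List, Sequence

# A single-channel H x W feature map as nested lists (illustrative only).
FeatureMap = List[List[float]]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of branch scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def branch_score(fmap: FeatureMap) -> float:
    """Hypothetical gating signal: mean absolute activation of the map.

    The abstract only says the fusion is adaptive; this fixed heuristic
    is an assumed stand-in for whatever the network actually uses.
    """
    values = [abs(v) for row in fmap for v in row]
    return sum(values) / len(values)


def fuse(thermal: FeatureMap, visible: FeatureMap,
         hallucinated: FeatureMap) -> FeatureMap:
    """Weighted sum of three same-shaped maps with softmax-normalised weights."""
    branches = [thermal, visible, hallucinated]
    weights = softmax([branch_score(b) for b in branches])
    height, width = len(thermal), len(thermal[0])
    return [
        [sum(w * b[i][j] for w, b in zip(weights, branches))
         for j in range(width)]
        for i in range(height)
    ]
```

Under this assumed heuristic, a visible-branch map that is nearly all zeros (as might happen at night) receives a smaller weight than the thermal and hallucinated maps, so it contributes less to the fused result. Because the weights sum to one, fusing three identical maps returns that same map.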