
Beyond Fusion: Modality Hallucination-based Multispectral Fusion for Pedestrian Detection

Pedestrian detection is a fundamental task for many downstream applications. Visible and thermal images, as the two most important data types, are usually used to detect pedestrians under various environmental conditions. Many state-of-the-art works have been proposed to use two-stream (i.e., two-br...

Bibliographic Details
Main Authors: Xie, Qian ; Cheng, Ta-Ying ; Zhong, Jia-Xing ; Zhou, Kaichen ; Markham, Andrew ; Trigoni, Niki
Format: Conference Proceeding
Language: English
cited_by
cites
container_end_page 653
container_issue
container_start_page 644
container_title
container_volume
creator Xie, Qian
Cheng, Ta-Ying
Zhong, Jia-Xing
Zhou, Kaichen
Markham, Andrew
Trigoni, Niki
description Pedestrian detection is a fundamental task for many downstream applications. Visible and thermal images, as the two most important data types, are usually used to detect pedestrians under various environmental conditions. Many state-of-the-art works have been proposed to use two-stream (i.e., two-branch) architectures to combine visible and thermal information to improve detection performance. However, conventional visible-thermal fusion-based methods have no ability to obtain useful information from the visible branch under poor visibility conditions. The visible branch could even sometimes bring noise into the combined features. In this paper, we present a novel thermal and visible fusion architecture for pedestrian detection. Instead of simply using two branches to separately extract thermal and visible features and then fusing them, we introduce a hallucination branch to learn the mapping from the thermal to the visible domain, forming a novel three-branch feature extraction module. We then adaptively fuse feature maps from all three branches (i.e., thermal, visible, and hallucination). With this new integrated hallucination branch, our network can still get relatively good visible feature maps under challenging low-visibility conditions, thus boosting the overall detection performance. Finally, we experimentally demonstrate the superiority of the proposed architecture over conventional fusion methods.
doi_str_mv 10.1109/WACV57701.2024.00071
format conference_proceeding
eisbn 9798350318920
coden IEEPAD
publisher IEEE
date 2024-01-03
genre proceeding
ristype CONF
stitle WACV
tpages 10
linktohtml https://ieeexplore.ieee.org/document/10483978
collection IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEL
IEEE Proceedings Order Plans (POP All) 1998-Present
fulltext fulltext_linktorsrc
identifier EISSN: 2642-9381
ispartof 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, p.644-653
issn 2642-9381
language eng
recordid cdi_ieee_primary_10483978
source IEEE Xplore All Conference Series
subjects Algorithms
and algorithms
Boosting
Computer architecture
Computer vision
Feature extraction
formulations
Fuses
Image recognition and understanding
Machine learning architectures
Noise
Pedestrians
title Beyond Fusion: Modality Hallucination-based Multispectral Fusion for Pedestrian Detection
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T05%3A48%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Beyond%20Fusion:%20Modality%20Hallucination-based%20Multispectral%20Fusion%20for%20Pedestrian%20Detection&rft.btitle=2024%20IEEE/CVF%20Winter%20Conference%20on%20Applications%20of%20Computer%20Vision%20(WACV)&rft.au=Xie,%20Qian&rft.date=2024-01-03&rft.spage=644&rft.epage=653&rft.pages=644-653&rft.eissn=2642-9381&rft.coden=IEEPAD&rft_id=info:doi/10.1109/WACV57701.2024.00071&rft.eisbn=9798350318920&rft_dat=%3Cieee_CHZPO%3E10483978%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i119t-f3b9a5488574617ec87ec939fac27cfa90ceaff73b88bd621f9f7bb844ff7b003%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10483978&rfr_iscdi=true