
Event-Free Moving Object Segmentation from Moving Ego Vehicle

Moving object segmentation (MOS) in dynamic scenes is an important, challenging, but under-explored research topic for autonomous driving, especially for sequences obtained from moving ego vehicles. Most segmentation methods leverage motion cues obtained from optical flow maps. However, since these methods are often based on optical flows that are pre-computed from successive RGB frames, this neglects the temporal consideration of events occurring within the inter-frame, consequently constraining its ability to discern objects exhibiting relative staticity but genuinely in motion. To address these limitations, we propose to exploit event cameras for better video understanding, which provide rich motion cues without relying on optical flow. To foster research in this area, we first introduce a novel large-scale dataset called DSEC-MOS for moving object segmentation from moving ego vehicles, which is the first of its kind. For benchmarking, we select various mainstream methods and rigorously evaluate them on our dataset. Subsequently, we devise EmoFormer, a novel network able to exploit the event data. For this purpose, we fuse the event temporal prior with spatial semantic maps to distinguish genuinely moving objects from the static background, adding another level of dense supervision around our object of interest. Our proposed network relies only on event data for training but does not require event input during inference, making it directly comparable to frame-only methods in terms of efficiency and more widely usable in many application cases. The exhaustive comparison highlights a significant performance improvement of our method over all other methods. The source code and dataset are publicly available at: https://github.com/ZZY-Zhou/DSEC-MOS.

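The core idea described in the abstract, namely that event data provides an extra level of dense supervision during training while inference requires only RGB frames, can be sketched as below. This is a minimal, hypothetical PyTorch-style illustration; the module, loss weighting, and tensor names are assumptions and do not reflect the authors' actual EmoFormer architecture.

import torch
import torch.nn as nn

class FramePriorFusion(nn.Module):
    """Hypothetical sketch: an RGB segmentation branch with an auxiliary
    event-prior head that is supervised only during training.
    This is NOT the authors' EmoFormer; it only illustrates train-time
    event supervision combined with frame-only inference."""

    def __init__(self, channels: int = 64, num_classes: int = 2):
        super().__init__()
        # Frame (RGB) encoder: the only branch needed at inference time.
        self.frame_encoder = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Main head: moving-object segmentation logits.
        self.seg_head = nn.Conv2d(channels, num_classes, 1)
        # Auxiliary head: predicts an event-like motion prior; it supplies
        # the additional dense supervision and is ignored at inference.
        self.prior_head = nn.Conv2d(channels, 1, 1)

    def forward(self, frames: torch.Tensor):
        feats = self.frame_encoder(frames)
        return self.seg_head(feats), self.prior_head(feats)


def training_step(model, frames, seg_labels, event_prior):
    """Events supervise the auxiliary head; only frames are model inputs."""
    seg_logits, prior_logits = model(frames)
    seg_loss = nn.functional.cross_entropy(seg_logits, seg_labels)
    prior_loss = nn.functional.binary_cross_entropy_with_logits(
        prior_logits, event_prior)
    return seg_loss + 0.5 * prior_loss  # loss weighting is an arbitrary choice


if __name__ == "__main__":
    model = FramePriorFusion()
    frames = torch.randn(2, 3, 64, 64)              # RGB frames
    seg_labels = torch.randint(0, 2, (2, 64, 64))   # moving / static mask
    event_prior = torch.rand(2, 1, 64, 64)          # event-derived motion map
    loss = training_step(model, frames, seg_labels, event_prior)
    loss.backward()
    # Inference uses frames only; no event input is required.
    with torch.no_grad():
        seg_logits, _ = model(frames)
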
Bibliographic Details
Published in: arXiv.org, 2024-09
Main Authors: Zhou, Zhuyun; Wu, Zongwei; Paudel, Danda Pani; Boutteau, Rémi; Yang, Fan; Van Gool, Luc; Timofte, Radu; Ginhac, Dominique
Format: Article
Language: English
Identifier: EISSN 2331-8422
Publisher: Cornell University Library, arXiv.org (Ithaca)
Rights: Published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Subjects: Annotations; Cameras; Computer vision; Datasets; Frames (data processing); Image segmentation; Object motion; Temporal resolution