
MASS: Multi-Attentional Semantic Segmentation of LiDAR Data for Dense Top-View Understanding

Bibliographic Details
Published in: IEEE Transactions on Intelligent Transportation Systems, 2022-09, Vol. 23 (9), p. 15824-15840
Main Authors: Peng, Kunyu; Fei, Juncong; Yang, Kailun; Roitberg, Alina; Zhang, Jiaming; Bieder, Frank; Heidenreich, Philipp; Stiller, Christoph; Stiefelhagen, Rainer
Format: Article
Language:English
Subjects: attention mechanism; automated driving; Autonomous cars; Datasets; Feature extraction; Image segmentation; intelligent vehicles; Laser radar; Lidar; LiDAR data; Object recognition; Point cloud compression; Semantic segmentation; Semantics; Task analysis; Three-dimensional displays
Description: At the heart of all automated driving systems is the ability to sense the surroundings, e.g., through semantic segmentation of LiDAR sequences, which has experienced remarkable progress due to the release of large datasets such as SemanticKITTI and nuScenes-LidarSeg. While most previous works focus on sparse segmentation of the LiDAR input, dense output masks provide self-driving cars with almost complete environment information. In this paper, we introduce MASS, a Multi-Attentional Semantic Segmentation model built specifically for dense top-view understanding of driving scenes. Our framework operates on pillar and occupancy features and comprises three attention-based building blocks: (1) a keypoint-driven graph attention, (2) an LSTM-based attention computed from a vector embedding of the spatial input, and (3) a pillar-based attention, together yielding a dense 360° segmentation mask. With extensive experiments on both SemanticKITTI and nuScenes-LidarSeg, we quantitatively demonstrate the effectiveness of our model, outperforming the state of the art by 19.0% on SemanticKITTI and reaching 30.4% mIoU on nuScenes-LidarSeg, where MASS is the first work addressing the dense segmentation task. Furthermore, our multi-attention model proves highly effective for 3D object detection, as validated on the KITTI-3D dataset, showcasing its generalizability to other tasks related to 3D vision.
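
The abstract names the model's core ingredients: bird's-eye-view (BEV) pillar and occupancy features, plus attention blocks that reweight them. The following PyTorch snippet is a minimal sketch of that general pattern only, not the authors' implementation: the grid size and extent, the names occupancy_bev and PillarAttention, and the 1x1-convolution scoring head are all assumptions made for illustration.

    # Illustrative sketch only: rasterize a LiDAR sweep into a BEV occupancy
    # grid and reweight derived pillar features with a simple per-cell
    # attention. All shapes and parameters are assumptions, not paper values.
    import torch
    import torch.nn as nn

    def occupancy_bev(points: torch.Tensor, grid: int = 200,
                      extent: float = 50.0) -> torch.Tensor:
        """Rasterize (N, 3) points into a (1, grid, grid) binary occupancy
        map covering [-extent, extent] metres around the ego vehicle."""
        cell = 2 * extent / grid
        ix = ((points[:, 0] + extent) / cell).long().clamp(0, grid - 1)
        iy = ((points[:, 1] + extent) / cell).long().clamp(0, grid - 1)
        occ = torch.zeros(1, grid, grid)
        occ[0, iy, ix] = 1.0  # mark every cell that contains a point
        return occ

    class PillarAttention(nn.Module):
        """Per-cell attention over a BEV feature map: a small 1x1-conv head
        scores each pillar in [0, 1]; the score is broadcast over channels."""
        def __init__(self, channels: int, reduction: int = 8):
            super().__init__()
            self.score = nn.Sequential(
                nn.Conv2d(channels, channels // reduction, kernel_size=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, 1, kernel_size=1),
                nn.Sigmoid(),
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return x * self.score(x)  # (B, C, H, W) * (B, 1, H, W)

    if __name__ == "__main__":
        pts = torch.randn(10_000, 3) * 30            # dummy LiDAR sweep
        occ = occupancy_bev(pts).unsqueeze(0)        # (1, 1, 200, 200)
        feats = nn.Conv2d(1, 64, 3, padding=1)(occ)  # toy pillar features
        out = PillarAttention(64)(feats)
        print(out.shape)                             # (1, 64, 200, 200)

The actual MASS framework combines three attention mechanisms (keypoint-driven graph attention, LSTM-based attention, and pillar-based attention) to produce the dense 360° mask; this sketch shows only the pillar-reweighting idea in isolation. The mIoU figures quoted above denote mean intersection-over-union, i.e. the per-class ratio TP / (TP + FP + FN) averaged over all classes.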
DOI: 10.1109/TITS.2022.3145588
ISSN: 1524-9050
EISSN: 1558-0016
Source: IEEE Xplore (Online service)