MASS: Multi-Attentional Semantic Segmentation of LiDAR Data for Dense Top-View Understanding
At the heart of all automated driving systems is the ability to sense the surroundings, e.g., through semantic segmentation of LiDAR sequences, which has experienced remarkable progress due to the release of large datasets such as SemanticKITTI and nuScenes-LidarSeg. While most previous works focus on sparse segmentation of the LiDAR input, dense output masks provide self-driving cars with almost complete environment information. In this paper, we introduce MASS, a Multi-Attentional Semantic Segmentation model built specifically for dense top-view understanding of driving scenes. Our framework operates on pillar and occupancy features and comprises three attention-based building blocks: (1) a keypoint-driven graph attention, (2) an LSTM-based attention computed from a vector embedding of the spatial input, and (3) a pillar-based attention, resulting in a dense 360° segmentation mask. With extensive experiments on both SemanticKITTI and nuScenes-LidarSeg, we quantitatively demonstrate the effectiveness of our model, outperforming the state of the art by 19.0% on SemanticKITTI and reaching 30.4% mIoU on nuScenes-LidarSeg, where MASS is the first work to address the dense segmentation task. Furthermore, our multi-attention model is shown to be very effective for 3D object detection, validated on the KITTI-3D dataset, showcasing its high generalizability to other tasks related to 3D vision.
Published in: | IEEE transactions on intelligent transportation systems 2022-09, Vol.23 (9), p.15824-15840 |
---|---|
Main Authors: | Peng, Kunyu; Fei, Juncong; Yang, Kailun; Roitberg, Alina; Zhang, Jiaming; Bieder, Frank; Heidenreich, Philipp; Stiller, Christoph; Stiefelhagen, Rainer |
Format: | Article |
Language: | English |
Subjects: | attention mechanism; automated driving; Autonomous cars; Datasets; Feature extraction; Image segmentation; intelligent vehicles; Laser radar; Lidar; LiDAR data; Object recognition; Point cloud compression; Semantic segmentation; Semantics; Task analysis; Three-dimensional displays |
creator | Peng, Kunyu; Fei, Juncong; Yang, Kailun; Roitberg, Alina; Zhang, Jiaming; Bieder, Frank; Heidenreich, Philipp; Stiller, Christoph; Stiefelhagen, Rainer |
description | At the heart of all automated driving systems is the ability to sense the surroundings, e.g., through semantic segmentation of LiDAR sequences, which has experienced remarkable progress due to the release of large datasets such as SemanticKITTI and nuScenes-LidarSeg. While most previous works focus on sparse segmentation of the LiDAR input, dense output masks provide self-driving cars with almost complete environment information. In this paper, we introduce MASS, a Multi-Attentional Semantic Segmentation model built specifically for dense top-view understanding of driving scenes. Our framework operates on pillar and occupancy features and comprises three attention-based building blocks: (1) a keypoint-driven graph attention, (2) an LSTM-based attention computed from a vector embedding of the spatial input, and (3) a pillar-based attention, resulting in a dense 360° segmentation mask. With extensive experiments on both SemanticKITTI and nuScenes-LidarSeg, we quantitatively demonstrate the effectiveness of our model, outperforming the state of the art by 19.0% on SemanticKITTI and reaching 30.4% mIoU on nuScenes-LidarSeg, where MASS is the first work to address the dense segmentation task. Furthermore, our multi-attention model is shown to be very effective for 3D object detection, validated on the KITTI-3D dataset, showcasing its high generalizability to other tasks related to 3D vision. |
doi_str_mv | 10.1109/TITS.2022.3145588 |
identifier | ISSN: 1524-9050 |
issn | 1524-9050 (ISSN); 1558-0016 (EISSN) |
source | IEEE Xplore (Online service) |
subjects | attention mechanism; automated driving; Autonomous cars; Datasets; Feature extraction; Image segmentation; intelligent vehicles; Laser radar; Lidar; LiDAR data; Object recognition; Point cloud compression; Semantic segmentation; Semantics; Task analysis; Three-dimensional displays |
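The abstract describes three attention-based building blocks over pillar and occupancy features. For orientation only, scaled dot-product self-attention over a set of pillar features can be sketched as follows; this is a generic illustration, not the paper's exact formulation, and the function name and feature shapes are assumptions:

```python
import numpy as np

def pillar_self_attention(pillars: np.ndarray) -> np.ndarray:
    """Scaled dot-product self-attention over a set of pillar features.

    pillars: (P, d) array, one d-dimensional feature per non-empty pillar.
    Returns attention-reweighted features of the same shape.
    Illustrative sketch only; MASS defines its own pillar-based attention.
    """
    d = pillars.shape[1]
    scores = pillars @ pillars.T / np.sqrt(d)      # (P, P) pairwise affinities
    scores -= scores.max(axis=1, keepdims=True)    # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return weights @ pillars                       # each output row is a weighted mix of all pillars

rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))                   # 8 pillars, 16-dim features
out = pillar_self_attention(feats)
print(out.shape)                                   # (8, 16)
```

In MASS this kind of attention is one of three blocks (alongside keypoint-driven graph attention and an LSTM-based attention) whose outputs feed the dense 360° top-view segmentation head.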