
Probabilistic Trajectory Prediction of Vulnerable Road User Using Multimodal Inputs

Accurately predicting the actions of vulnerable road users (VRUs) is crucial for improving traffic flow and enhancing VRU safety. The unpredictable nature of VRU trajectories poses a significant challenge. To address this, we introduce the Probabilistic Multimodal Trajectory Prediction Network (PMTP...

Bibliographic Details
Published in:	IEEE transactions on intelligent transportation systems 2024-12, p.1-11
Main Authors:	Hu, Chuan; Niu, Ruochen; Lin, Yiwei; Yang, Biao; Chen, Hao; Zhao, Baixuan; Zhang, Xi
Format:	Article
Language:	English
Subjects:	Accuracy; autonomous vehicle; Autonomous vehicles; Data mining; Decoding; Feature extraction; multi-modal prediction; multi-task learning; Pedestrians; Predictive models; Roads; Trajectory; Trajectory prediction; Transformers

container_end_page	11
container_issue
container_start_page	1
container_title	IEEE transactions on intelligent transportation systems
container_volume
creator	Hu, Chuan; Niu, Ruochen; Lin, Yiwei; Yang, Biao; Chen, Hao; Zhao, Baixuan; Zhang, Xi
description	Accurately predicting the actions of vulnerable road users (VRUs) is crucial for improving traffic flow and enhancing VRU safety. The unpredictable nature of VRU trajectories poses a significant challenge. To address this, we introduce the Probabilistic Multimodal Trajectory Prediction Network (PMTPN), which effectively forecasts multimodal trajectories and their corresponding probabilities by utilizing a multitask learning framework that integrates trajectory and probability predictions. The network processes diverse input modalities, including bounding boxes, pedestrian pose, and ego-vehicle motion information. We enhance prediction performance by employing specialized encoders to extract distinct features from these inputs and a fusion module to integrate the data efficiently. To manage the variability in pedestrian actions, our model incorporates learnable motion queries that serve as reference points for predicting various potential outcomes. These queries are iteratively refined through attention operations with historical context in a multi-layer decoder. Additionally, a multi-gate mixture-of-experts (MMoE) module within the decoder helps mitigate the challenges of multitask learning. Our method significantly enhances trajectory prediction accuracy and provides probabilities for each predicted trajectory, demonstrating state-of-the-art results on the JAAD and PIE datasets.
doi_str_mv	10.1109/TITS.2024.3503683
format	article
fulltext	fulltext
identifier	ISSN: 1524-9050
ispartof	IEEE transactions on intelligent transportation systems, 2024-12, p.1-11
issn	1524-9050; 1558-0016
language	eng
recordid	cdi_ieee_primary_10778100
source	IEEE Electronic Library (IEL) Journals
subjects	Accuracy; autonomous vehicle; Autonomous vehicles; Data mining; Decoding; Feature extraction; multi-modal prediction; multi-task learning; Pedestrians; Predictive models; Roads; Trajectory; Trajectory prediction; Transformers
title	Probabilistic Trajectory Prediction of Vulnerable Road User Using Multimodal Inputs
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-06T15%3A26%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Probabilistic%20Trajectory%20Prediction%20of%20Vulnerable%20Road%20User%20Using%20Multimodal%20Inputs&rft.jtitle=IEEE%20transactions%20on%20intelligent%20transportation%20systems&rft.au=Hu,%20Chuan&rft.date=2024-12-04&rft.spage=1&rft.epage=11&rft.pages=1-11&rft.issn=1524-9050&rft.eissn=1558-0016&rft.coden=ITISFG&rft_id=info:doi/10.1109/TITS.2024.3503683&rft_dat=%3Ccrossref_ieee_%3E10_1109_TITS_2024_3503683%3C/crossref_ieee_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c148t-19f24582a8a36f53f3c1091964ea3e7d664b872883cd7b63028606e5fc7a2a0b3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10778100&rfr_iscdi=true