Loading…

Parallel grid-based density peak clustering of big trajectory data

With the widespread adoption of data intensive applications such as navigation systems for mobile devices and unmanned vehicles, analyzing trajectory data has become a key research area. One of the main tasks is trajectory clustering, which consists of automatically grouping similar trajectories int...

Full description

Saved in:

Bibliographic Details
Published in:	Applied intelligence (Dordrecht, Netherlands) Netherlands), 2022-12, Vol.52 (15), p.17042-17057
Main Authors:	Niu, Xinzheng, Zheng, Yunhong, Fournier-Viger, Philippe, Wang, Bing
Format:	Article
Language:	English
Subjects:	Algorithms Artificial Intelligence Clustering Computer Science Datasets Density Electronic devices Emerging topics in Applied Intelligence selected from IEA/AIE2020 Euclidean geometry Euclidean space Machines Manufacturing Measurement methods Mechanical Engineering Navigation systems Processes Similarity Trajectory analysis Unmanned vehicles
Citations:	Items that this one cites Items that cite this one
Online Access:	Get full text
Tags:	Add Tag No Tags, Be the first to tag this record!

cited_by	cdi_FETCH-LOGICAL-c319t-9077ec3ce36317242bc1d47601d50db17108b09163ea96f906bfa31d6def83fa3
cites	cdi_FETCH-LOGICAL-c319t-9077ec3ce36317242bc1d47601d50db17108b09163ea96f906bfa31d6def83fa3
container_end_page	17057
container_issue	15
container_start_page	17042
container_title	Applied intelligence (Dordrecht, Netherlands)
container_volume	52
creator	Niu, Xinzheng Zheng, Yunhong Fournier-Viger, Philippe Wang, Bing
description	With the widespread adoption of data intensive applications such as navigation systems for mobile devices and unmanned vehicles, analyzing trajectory data has become a key research area. One of the main tasks is trajectory clustering, which consists of automatically grouping similar trajectories into clusters. To perform this task, Density Peak Clustering (DPC) is widely used due to its speed and small number of artificial parameters. However, a major problem is that its performance does not scale well for large datasets. To address this issue, this paper proposes an efficient parallel trajectory clustering algorithm, named Tra-PDPC (Trajectory-Parallel DPC). It is applied in three steps, namely trajectory division and partition, trajectory similarity calculation, and clustering. Those steps are all designed to run in a distributed fashion using the Spark programming model. For the first step, a scheme is proposed to divide sub-trajectories based on local grid area density. Then, a combined similarity measurement method based on Euclidean space and grid space is defined for sub-trajectories similarity calculation. Finally, a version of DPC is applied, which dramatically improves clustering speed. Experiments on multiple large realistic trajectory datasets have demonstrated that the proposed Tra-PDPC algorithm can considerably decrease runtime while providing a high accuracy.
doi_str_mv	10.1007/s10489-021-02757-w
format	article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2737808084</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2737808084</sourcerecordid><originalsourceid>FETCH-LOGICAL-c319t-9077ec3ce36317242bc1d47601d50db17108b09163ea96f906bfa31d6def83fa3</originalsourceid><addsrcrecordid>eNp9kEFLAzEQhYMoWKt_wFPAc3Rms002Ry1ahYIeFLyFbJItW9fumqSU_ntTV_AmwzBzeO8N8xFyiXCNAPImIpSVYlBgbjmTbHdEJjiTnMlSyWMyAVWUTAj1fkrOYlwDAOeAE3L3YoLpOt_RVWgdq030jjq_iW3a08GbD2q7bUw-tJsV7Rtatyuagll7m_qwp84kc05OGtNFf_E7p-Tt4f51_siWz4un-e2SWY4qMQVSesut54KjLMqituhKKQDdDFyNEqGqQaHg3ijRKBB1Yzg64XxT8bxOydWYO4T-a-tj0ut-Gzb5pC4klxXkKrOqGFU29DEG3-ghtJ8m7DWCPrDSIyudWekfVnqXTXw0xeHwqA9_0f-4vgEyMGx2</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2737808084</pqid></control><display><type>article</type><title>Parallel grid-based density peak clustering of big trajectory data</title><source>ABI/INFORM Collection</source><source>Springer Link</source><creator>Niu, Xinzheng ; Zheng, Yunhong ; Fournier-Viger, Philippe ; Wang, Bing</creator><creatorcontrib>Niu, Xinzheng ; Zheng, Yunhong ; Fournier-Viger, Philippe ; Wang, Bing</creatorcontrib><description>With the widespread adoption of data intensive applications such as navigation systems for mobile devices and unmanned vehicles, analyzing trajectory data has become a key research area. One of the main tasks is trajectory clustering, which consists of automatically grouping similar trajectories into clusters. To perform this task, Density Peak Clustering (DPC) is widely used due to its speed and small number of artificial parameters. However, a major problem is that its performance does not scale well for large datasets. To address this issue, this paper proposes an efficient parallel trajectory clustering algorithm, named Tra-PDPC (Trajectory-Parallel DPC). It is applied in three steps, namely trajectory division and partition, trajectory similarity calculation, and clustering. Those steps are all designed to run in a distributed fashion using the Spark programming model. For the first step, a scheme is proposed to divide sub-trajectories based on local grid area density. Then, a combined similarity measurement method based on Euclidean space and grid space is defined for sub-trajectories similarity calculation. Finally, a version of DPC is applied, which dramatically improves clustering speed. Experiments on multiple large realistic trajectory datasets have demonstrated that the proposed Tra-PDPC algorithm can considerably decrease runtime while providing a high accuracy.</description><identifier>ISSN: 0924-669X</identifier><identifier>EISSN: 1573-7497</identifier><identifier>DOI: 10.1007/s10489-021-02757-w</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Algorithms ; Artificial Intelligence ; Clustering ; Computer Science ; Datasets ; Density ; Electronic devices ; Emerging topics in Applied Intelligence selected from IEA/AIE2020 ; Euclidean geometry ; Euclidean space ; Machines ; Manufacturing ; Measurement methods ; Mechanical Engineering ; Navigation systems ; Processes ; Similarity ; Trajectory analysis ; Unmanned vehicles</subject><ispartof>Applied intelligence (Dordrecht, Netherlands), 2022-12, Vol.52 (15), p.17042-17057</ispartof><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021</rights><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c319t-9077ec3ce36317242bc1d47601d50db17108b09163ea96f906bfa31d6def83fa3</citedby><cites>FETCH-LOGICAL-c319t-9077ec3ce36317242bc1d47601d50db17108b09163ea96f906bfa31d6def83fa3</cites><orcidid>0000-0003-4075-5355</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.proquest.com/docview/2737808084/fulltextPDF?pq-origsite=primo$$EPDF$$P50$$Gproquest$$H</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/2737808084?pq-origsite=primo$$EHTML$$P50$$Gproquest$$H</linktohtml><link.rule.ids>314,780,784,11688,27924,27925,36060,44363,74767</link.rule.ids></links><search><creatorcontrib>Niu, Xinzheng</creatorcontrib><creatorcontrib>Zheng, Yunhong</creatorcontrib><creatorcontrib>Fournier-Viger, Philippe</creatorcontrib><creatorcontrib>Wang, Bing</creatorcontrib><title>Parallel grid-based density peak clustering of big trajectory data</title><title>Applied intelligence (Dordrecht, Netherlands)</title><addtitle>Appl Intell</addtitle><description>With the widespread adoption of data intensive applications such as navigation systems for mobile devices and unmanned vehicles, analyzing trajectory data has become a key research area. One of the main tasks is trajectory clustering, which consists of automatically grouping similar trajectories into clusters. To perform this task, Density Peak Clustering (DPC) is widely used due to its speed and small number of artificial parameters. However, a major problem is that its performance does not scale well for large datasets. To address this issue, this paper proposes an efficient parallel trajectory clustering algorithm, named Tra-PDPC (Trajectory-Parallel DPC). It is applied in three steps, namely trajectory division and partition, trajectory similarity calculation, and clustering. Those steps are all designed to run in a distributed fashion using the Spark programming model. For the first step, a scheme is proposed to divide sub-trajectories based on local grid area density. Then, a combined similarity measurement method based on Euclidean space and grid space is defined for sub-trajectories similarity calculation. Finally, a version of DPC is applied, which dramatically improves clustering speed. Experiments on multiple large realistic trajectory datasets have demonstrated that the proposed Tra-PDPC algorithm can considerably decrease runtime while providing a high accuracy.</description><subject>Algorithms</subject><subject>Artificial Intelligence</subject><subject>Clustering</subject><subject>Computer Science</subject><subject>Datasets</subject><subject>Density</subject><subject>Electronic devices</subject><subject>Emerging topics in Applied Intelligence selected from IEA/AIE2020</subject><subject>Euclidean geometry</subject><subject>Euclidean space</subject><subject>Machines</subject><subject>Manufacturing</subject><subject>Measurement methods</subject><subject>Mechanical Engineering</subject><subject>Navigation systems</subject><subject>Processes</subject><subject>Similarity</subject><subject>Trajectory analysis</subject><subject>Unmanned vehicles</subject><issn>0924-669X</issn><issn>1573-7497</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>M0C</sourceid><recordid>eNp9kEFLAzEQhYMoWKt_wFPAc3Rms002Ry1ahYIeFLyFbJItW9fumqSU_ntTV_AmwzBzeO8N8xFyiXCNAPImIpSVYlBgbjmTbHdEJjiTnMlSyWMyAVWUTAj1fkrOYlwDAOeAE3L3YoLpOt_RVWgdq030jjq_iW3a08GbD2q7bUw-tJsV7Rtatyuagll7m_qwp84kc05OGtNFf_E7p-Tt4f51_siWz4un-e2SWY4qMQVSesut54KjLMqituhKKQDdDFyNEqGqQaHg3ijRKBB1Yzg64XxT8bxOydWYO4T-a-tj0ut-Gzb5pC4klxXkKrOqGFU29DEG3-ghtJ8m7DWCPrDSIyudWekfVnqXTXw0xeHwqA9_0f-4vgEyMGx2</recordid><startdate>20221201</startdate><enddate>20221201</enddate><creator>Niu, Xinzheng</creator><creator>Zheng, Yunhong</creator><creator>Fournier-Viger, Philippe</creator><creator>Wang, Bing</creator><general>Springer US</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7SC</scope><scope>7WY</scope><scope>7WZ</scope><scope>7XB</scope><scope>87Z</scope><scope>8AL</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FK</scope><scope>8FL</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BEZIV</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FRNLG</scope><scope>F~G</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K60</scope><scope>K6~</scope><scope>K7-</scope><scope>L.-</scope><scope>L6V</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>M0C</scope><scope>M0N</scope><scope>M7S</scope><scope>P5Z</scope><scope>P62</scope><scope>PQBIZ</scope><scope>PQBZA</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PSYQQ</scope><scope>PTHSS</scope><scope>Q9U</scope><orcidid>https://orcid.org/0000-0003-4075-5355</orcidid></search><sort><creationdate>20221201</creationdate><title>Parallel grid-based density peak clustering of big trajectory data</title><author>Niu, Xinzheng ; Zheng, Yunhong ; Fournier-Viger, Philippe ; Wang, Bing</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c319t-9077ec3ce36317242bc1d47601d50db17108b09163ea96f906bfa31d6def83fa3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Algorithms</topic><topic>Artificial Intelligence</topic><topic>Clustering</topic><topic>Computer Science</topic><topic>Datasets</topic><topic>Density</topic><topic>Electronic devices</topic><topic>Emerging topics in Applied Intelligence selected from IEA/AIE2020</topic><topic>Euclidean geometry</topic><topic>Euclidean space</topic><topic>Machines</topic><topic>Manufacturing</topic><topic>Measurement methods</topic><topic>Mechanical Engineering</topic><topic>Navigation systems</topic><topic>Processes</topic><topic>Similarity</topic><topic>Trajectory analysis</topic><topic>Unmanned vehicles</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Niu, Xinzheng</creatorcontrib><creatorcontrib>Zheng, Yunhong</creatorcontrib><creatorcontrib>Fournier-Viger, Philippe</creatorcontrib><creatorcontrib>Wang, Bing</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Computer and Information Systems Abstracts</collection><collection>ABI/INFORM Collection</collection><collection>ABI/INFORM Global (PDF only)</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>ABI/INFORM Collection</collection><collection>Computing Database (Alumni Edition)</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ABI/INFORM Collection (Alumni Edition)</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Business Premium Collection</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>Business Premium Collection (Alumni)</collection><collection>ABI/INFORM Global (Corporate)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection (Proquest) (PQ_SDU_P3)</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Business Collection (Alumni Edition)</collection><collection>ProQuest Business Collection</collection><collection>Computer Science Database</collection><collection>ABI/INFORM Professional Advanced</collection><collection>ProQuest Engineering Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ABI/INFORM Collection</collection><collection>Computing Database</collection><collection>Engineering Database</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>One Business (ProQuest)</collection><collection>ProQuest One Business (Alumni)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest One Psychology</collection><collection>Engineering Collection</collection><collection>ProQuest Central Basic</collection><jtitle>Applied intelligence (Dordrecht, Netherlands)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Niu, Xinzheng</au><au>Zheng, Yunhong</au><au>Fournier-Viger, Philippe</au><au>Wang, Bing</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Parallel grid-based density peak clustering of big trajectory data</atitle><jtitle>Applied intelligence (Dordrecht, Netherlands)</jtitle><stitle>Appl Intell</stitle><date>2022-12-01</date><risdate>2022</risdate><volume>52</volume><issue>15</issue><spage>17042</spage><epage>17057</epage><pages>17042-17057</pages><issn>0924-669X</issn><eissn>1573-7497</eissn><abstract>With the widespread adoption of data intensive applications such as navigation systems for mobile devices and unmanned vehicles, analyzing trajectory data has become a key research area. One of the main tasks is trajectory clustering, which consists of automatically grouping similar trajectories into clusters. To perform this task, Density Peak Clustering (DPC) is widely used due to its speed and small number of artificial parameters. However, a major problem is that its performance does not scale well for large datasets. To address this issue, this paper proposes an efficient parallel trajectory clustering algorithm, named Tra-PDPC (Trajectory-Parallel DPC). It is applied in three steps, namely trajectory division and partition, trajectory similarity calculation, and clustering. Those steps are all designed to run in a distributed fashion using the Spark programming model. For the first step, a scheme is proposed to divide sub-trajectories based on local grid area density. Then, a combined similarity measurement method based on Euclidean space and grid space is defined for sub-trajectories similarity calculation. Finally, a version of DPC is applied, which dramatically improves clustering speed. Experiments on multiple large realistic trajectory datasets have demonstrated that the proposed Tra-PDPC algorithm can considerably decrease runtime while providing a high accuracy.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s10489-021-02757-w</doi><tpages>16</tpages><orcidid>https://orcid.org/0000-0003-4075-5355</orcidid></addata></record>
fulltext	fulltext
identifier	ISSN: 0924-669X
ispartof	Applied intelligence (Dordrecht, Netherlands), 2022-12, Vol.52 (15), p.17042-17057
issn	0924-669X 1573-7497
language	eng
recordid	cdi_proquest_journals_2737808084
source	ABI/INFORM Collection; Springer Link
subjects	Algorithms Artificial Intelligence Clustering Computer Science Datasets Density Electronic devices Emerging topics in Applied Intelligence selected from IEA/AIE2020 Euclidean geometry Euclidean space Machines Manufacturing Measurement methods Mechanical Engineering Navigation systems Processes Similarity Trajectory analysis Unmanned vehicles
title	Parallel grid-based density peak clustering of big trajectory data
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T19%3A29%3A31IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Parallel%20grid-based%20density%20peak%20clustering%20of%20big%20trajectory%20data&rft.jtitle=Applied%20intelligence%20(Dordrecht,%20Netherlands)&rft.au=Niu,%20Xinzheng&rft.date=2022-12-01&rft.volume=52&rft.issue=15&rft.spage=17042&rft.epage=17057&rft.pages=17042-17057&rft.issn=0924-669X&rft.eissn=1573-7497&rft_id=info:doi/10.1007/s10489-021-02757-w&rft_dat=%3Cproquest_cross%3E2737808084%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c319t-9077ec3ce36317242bc1d47601d50db17108b09163ea96f906bfa31d6def83fa3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2737808084&rft_id=info:pmid/&rfr_iscdi=true