Enhancing Multi-UAV Reconnaissance and Search Through Double Critic DDPG With Belief Probability Maps
Unmanned Aerial Vehicles (UAVs) have recently attracted significant attention due to their potential applications in reconnaissance and search. This paper aims to investigate the issue of multi-UAV cooperative reconnaissance and search (MCRS) to ensure ample coverage of the mission area and precise localization of static targets. The MCRS problem is modeled as a multi-objective optimization problem, taking into account the credibility of search results. To achieve this, we design a belief probability map based on the Dempster-Shafer (DS) evidence theory, comprising an uncertainty map and two target maps. This representation enables a clear depiction of both the presence of the target and the uncertainty within the map. Subsequently, we reformulate this multi-objective optimization problem within the framework of Decentralized Partially Observable Markov Decision Process (Dec-POMDP). To address this reformulation, a new deep reinforcement learning approach called Double Critic Deep Deterministic Policy Gradient (DCDDPG) is proposed. Specifically, we introduce both a centralized critic and a local critic for each UAV agent to estimate the action-value function. This approach helps balance the bias in the action-value function estimation and the variance in the policy updates, thereby improving the coordination effect. Extensive simulation results demonstrate that DCDDPG outperforms existing techniques in terms of search efficiency and coverage.
Published in: | IEEE transactions on intelligent vehicles 2024-02, Vol.9 (2), p.3827-3842 |
---|---|
Main Authors: | Zhang, Boquan; Lin, Xiang; Zhu, Yifan; Tian, Jing; Zhu, Zhi |
Format: | Article |
Language: | English |
Subjects: | Unmanned aerial vehicles; Reinforcement learning; Search problems; Multiple objective analysis; Optimization; Uncertainty |
container_end_page | 3842 |
container_issue | 2 |
container_start_page | 3827 |
container_title | IEEE transactions on intelligent vehicles |
container_volume | 9 |
creator | Zhang, Boquan; Lin, Xiang; Zhu, Yifan; Tian, Jing; Zhu, Zhi |
description | Unmanned Aerial Vehicles (UAVs) have recently attracted significant attention due to their potential applications in reconnaissance and search. This paper aims to investigate the issue of multi-UAV cooperative reconnaissance and search (MCRS) to ensure ample coverage of the mission area and precise localization of static targets. The MCRS problem is modeled as a multi-objective optimization problem, taking into account the credibility of search results. To achieve this, we design a belief probability map based on the Dempster-Shafer (DS) evidence theory, comprising an uncertainty map and two target maps. This representation enables a clear depiction of both the presence of the target and the uncertainty within the map. Subsequently, we reformulate this multi-objective optimization problem within the framework of Decentralized Partially Observable Markov Decision Process (Dec-POMDP). To address this reformulation, a new deep reinforcement learning approach called Double Critic Deep Deterministic Policy Gradient (DCDDPG) is proposed. Specifically, we introduce both a centralized critic and a local critic for each UAV agent to estimate the action-value function. This approach helps balance the bias in the action-value function estimation and the variance in the policy updates, thereby improving the coordination effect. Extensive simulation results demonstrate that DCDDPG outperforms existing techniques in terms of search efficiency and coverage. |
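The two mechanisms the abstract describes — a belief probability map whose cells carry Dempster-Shafer masses for "target present", "target absent", and "uncertain", and a double critic that combines a centralized and a local action-value estimate — can be illustrated with a minimal sketch. This is not the paper's actual formulation: the mass-function layout, Dempster's combination rule over the two-hypothesis frame, and the mixing weight `lam` are illustrative assumptions.

```python
def ds_combine(m1, m2):
    """Fuse two basic mass assignments over the frame {target, no-target}
    with Dempster's rule. Keys: "t" = target, "n" = no-target,
    "u" = uncertain (mass on the whole frame)."""
    # Conflict mass: the two sources directly contradict each other.
    k = m1["t"] * m2["n"] + m1["n"] * m2["t"]
    norm = 1.0 - k  # renormalize after discarding conflicting mass
    t = (m1["t"] * m2["t"] + m1["t"] * m2["u"] + m1["u"] * m2["t"]) / norm
    n = (m1["n"] * m2["n"] + m1["n"] * m2["u"] + m1["u"] * m2["n"]) / norm
    u = (m1["u"] * m2["u"]) / norm
    return {"t": t, "n": n, "u": u}

def double_critic_target(q_central, q_local, lam=0.5):
    """Blend a centralized and a local critic's action-value estimate.
    The convex combination and lam=0.5 are assumptions standing in for
    whatever bias/variance trade-off DCDDPG actually uses."""
    return lam * q_central + (1.0 - lam) * q_local
```

Fusing a prior cell belief such as `{"t": 0.5, "n": 0.2, "u": 0.3}` with a positive sensor reading raises the target mass and shrinks the uncertainty mass, which is the qualitative behavior the uncertainty map and target maps are meant to capture.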
doi_str_mv | 10.1109/TIV.2024.3352581 |
format | article |
publisher | Piscataway: IEEE |
coden | ITIVBL |
fulltext | fulltext |
identifier | ISSN: 2379-8858 |
ispartof | IEEE transactions on intelligent vehicles, 2024-02, Vol.9 (2), p.3827-3842 |
issn | 2379-8858 |
eissn | 2379-8904 |
language | eng |
source | IEEE Electronic Library (IEL) Journals |
subjects | Autonomous aerial vehicles; belief probability map; bias and variance; Deep learning; double critic deep deterministic policy gradient; Markov processes; Multi-UAV; Multiple objective analysis; Optimization; Reconnaissance; Reconnaissance aircraft; reconnaissance and search; Reinforcement learning; Search problems; Searching; Training; Uncertainty; Unmanned aerial vehicles |
title | Enhancing Multi-UAV Reconnaissance and Search Through Double Critic DDPG With Belief Probability Maps |