
Enhancing Multi-UAV Reconnaissance and Search Through Double Critic DDPG With Belief Probability Maps

Unmanned Aerial Vehicles (UAVs) have recently attracted significant attention due to their potential applications in reconnaissance and search. This paper aims to investigate the issue of multi-UAV cooperative reconnaissance and search (MCRS) to ensure ample coverage of the mission area and precise localization of static targets. The MCRS problem is modeled as a multi-objective optimization problem, taking into account the credibility of search results. To achieve this, we design a belief probability map based on the Dempster-Shafer (DS) evidence theory, comprising an uncertainty map and two target maps. This representation enables a clear depiction of both the presence of the target and the uncertainty within the map. Subsequently, we reformulate this multi-objective optimization problem within the framework of Decentralized Partially Observable Markov Decision Process (Dec-POMDP). To address this reformulation, a new deep reinforcement learning approach called Double Critic Deep Deterministic Policy Gradient (DCDDPG) is proposed. Specifically, we introduce both a centralized critic and a local critic for each UAV agent to estimate the action-value function. This approach helps balance the bias in the action-value function estimation and the variance in the policy updates, thereby improving the coordination effect. Extensive simulation results demonstrate that DCDDPG outperforms existing techniques in terms of search efficiency and coverage.
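As a rough illustration of the belief probability map idea, the sketch below maintains, for a single grid cell, a basic mass assignment over the frame {target, no target} and fuses successive sensor reports with Dempster's rule of combination. The `"T"`/`"N"`/`"TN"` encoding and the numeric masses are illustrative assumptions, not the paper's implementation:

```python
def combine_masses(m1, m2):
    """Dempster's rule of combination for one map cell over the frame
    {T: target present, N: no target}. 'TN' is the mass on the whole
    frame, i.e. this cell's entry in the uncertainty map."""
    # Conflicting evidence: one source supports T while the other supports N.
    k = m1["T"] * m2["N"] + m1["N"] * m2["T"]
    norm = 1.0 - k  # renormalize after discarding the conflicting mass
    return {
        "T": (m1["T"] * m2["T"] + m1["T"] * m2["TN"] + m1["TN"] * m2["T"]) / norm,
        "N": (m1["N"] * m2["N"] + m1["N"] * m2["TN"] + m1["TN"] * m2["N"]) / norm,
        "TN": (m1["TN"] * m2["TN"]) / norm,
    }

# Fusing a prior cell belief with one (hypothetical) sensor report:
prior = {"T": 0.6, "N": 0.1, "TN": 0.3}
report = {"T": 0.5, "N": 0.2, "TN": 0.3}
posterior = combine_masses(prior, report)
```

After fusion the target mass rises while the mass on the whole frame falls, which is the behavior a search map needs: repeated consistent observations shrink the uncertainty map while sharpening the target maps.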

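The double-critic estimate described in the abstract can be caricatured as blending a centralized critic (conditioned on the joint observations of all UAVs) with a per-UAV local critic. The convex combination and the weight `lam` below are assumptions for illustration only, since the abstract does not specify how the two estimates are merged:

```python
def blended_q(q_central: float, q_local: float, lam: float = 0.5) -> float:
    """Mix the centralized critic's action-value estimate with the local
    critic's. Leaning on the centralized critic reduces estimation bias
    (it sees the joint state); leaning on the local critic reduces the
    variance of each UAV's policy updates. `lam` is a hypothetical weight."""
    return lam * q_central + (1.0 - lam) * q_local

# With equal weight, disagreement between the critics is simply averaged:
q = blended_q(q_central=2.0, q_local=4.0)  # -> 3.0
```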
Bibliographic Details
Published in: IEEE Transactions on Intelligent Vehicles, 2024-02, Vol. 9 (2), p. 3827-3842
Main Authors: Zhang, Boquan; Lin, Xiang; Zhu, Yifan; Tian, Jing; Zhu, Zhi
Format: Article
Language: English
Subjects: Autonomous aerial vehicles; belief probability map; bias and variance; Deep learning; double critic deep deterministic policy gradient; Markov processes; Multi-UAV; Multiple objective analysis; Optimization; Reconnaissance; Reconnaissance aircraft; reconnaissance and search; Reinforcement learning; Search problems; Searching; Training; Uncertainty; Unmanned aerial vehicles
DOI: 10.1109/TIV.2024.3352581
ISSN: 2379-8858
EISSN: 2379-8904
Source: IEEE Electronic Library (IEL) Journals