An Adaptive Conversion Speed Q-Learning Algorithm for Search and Rescue UAV Path Planning in Unknown Environments

Bibliographic Details
Published in: IEEE Transactions on Vehicular Technology 2023-12, Vol.72 (12), p.1-14
Main Authors: Wu, Jiehong; Sun, Ya'nan; Li, Danyang; Shi, Junling; Li, Xianwei; Gao, Lijun; Yu, Lei; Han, Guangjie; Wu, Jinsong
Format: Article
Language: English
Description
Summary: With the wide application of unmanned aerial vehicles (UAVs), performing search and rescue missions autonomously in unknown environments has become an increasingly important problem. In this paper, we propose an adaptive conversion speed Q-Learning algorithm (ACSQL). Autonomous mission execution is divided into two stages: a rescue mission search stage and an optimal path search stage. In the first stage, the UAV finds task points as quickly as possible, and exploration efficiency is increased by adaptively adjusting the speed of the UAV. In the second stage, to obtain a safe and short path, we propose a subdomain search algorithm. Based on these two stages, we improve the state space and action space used in reinforcement learning and design a composite reward function, and finally obtain the path along which the UAV performs multiple search and rescue missions. To address slow training convergence and high uncertainty, we initialize the Q-table by incorporating detection information from the UAV's sensors in the first stage. Simulation results show that the ACSQL algorithm achieves autonomous navigation and path planning for a UAV in an unknown environment. Compared with the traditional action space, the learning process of the UAV converges faster and more stably, converging in about 30 episodes. Compared with the DDPG and IDWA algorithms in different scenarios, the ACSQL algorithm yields the shortest path length. Finally, the ACSQL algorithm is verified in the UAV simulator AirSim.
ISSN: 0018-9545
1939-9359
DOI: 10.1109/TVT.2023.3297837
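
The Q-table initialization mentioned in the summary can be illustrated with a minimal sketch. This record does not give the paper's actual initialization rule, state space, or reward function, so everything below is an assumption made for illustration: a 2-D grid world, four movement actions, a set of obstacle cells standing in for sensor detections, and the hypothetical names `init_q_table` and `q_update`. The update itself is the standard tabular Q-learning rule, not the ACSQL variant.

```python
import numpy as np

# Assumed action set: up, down, left, right on a 2-D grid.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def init_q_table(shape, obstacles, penalty=-1.0):
    """Hypothetical prior: start at zero, but pre-penalise any action that
    would leave the grid or enter a cell flagged as an obstacle."""
    rows, cols = shape
    q = np.zeros((rows, cols, len(ACTIONS)))
    for r in range(rows):
        for c in range(cols):
            for a, (dr, dc) in enumerate(ACTIONS):
                nr, nc = r + dr, c + dc
                off_grid = not (0 <= nr < rows and 0 <= nc < cols)
                if off_grid or (nr, nc) in obstacles:
                    q[r, c, a] = penalty
    return q


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update (in place)."""
    r, c = state
    nr, nc = next_state
    td_target = reward + gamma * q[nr, nc].max()
    q[r, c, action] += alpha * (td_target - q[r, c, action])
    return q


# Usage: a 3x3 grid with one detected obstacle at (1, 1).
q = init_q_table((3, 3), obstacles={(1, 1)})
q = q_update(q, state=(0, 0), action=1, reward=-0.1, next_state=(1, 0))
```

Seeding the table this way means the agent does not have to learn by trial and error that known obstacle cells are bad, which is one plausible reading of how prior sensor information could shorten training.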