Target tracking strategy using deep deterministic policy gradient

Bibliographic Details
Published in:	Applied soft computing 2020-10, Vol.95, p.106490, Article 106490
Main Authors:	You, Shixun; Diao, Ming; Gao, Lipeng; Zhang, Fulong; Wang, Huan
Format:	Article
Language:	English
Subjects:	Cognitive electronic warfare; Deep deterministic policy gradient; Motion planning; Reinforcement learning; Target tracking

Description
Summary:	To address the challenge of maintaining robust target tracking in a 3D dynamic high-altitude scenario, this paper presents a method for formulating continuous strategic maneuvers for unmanned combat air vehicles (UCAVs) based on deep deterministic policy gradient (DDPG). DDPG is an efficient reinforcement learning approach that helps a UCAV perform a variety of navigation tasks in real time in a dynamic and random electronic warfare environment, and therefore has clear advantages over other technologies. First, a target tracking simulator, Tracker, is created within the cognitive electronic warfare framework, and the maneuvering bias produced by environmental observation errors is analyzed theoretically. Tracker automatically correlates the maximum physical overload with the UCAV’s attitude angles and desired movement commands. Second, the agent’s behavior rewards are shaped, inspired by vector-based navigation, to ensure that the DDPG output is reliable. Finally, a deep reinforcement learning (DRL)-based navigation decision framework is employed in simulated target tracking tasks in different environments, where it yields excellent results. For behavior assessment, the agile maneuvers learned by the agent are analyzed by segmenting high-quality trajectories in time.
• Adaptive high-level control strategy of unmanned combat air vehicle (UCAV).
• Deep reinforcement learning to solve a continuous tracking problem.
• Novel target tracking simulator to test UCAV’s agile maneuvering capability.
ISSN:	1568-4946, 1872-9681
DOI:	10.1016/j.asoc.2020.106490