Missile guidance with assisted deep reinforcement learning for head-on interception of maneuvering target

Bibliographic Details
Published in:	Complex & intelligent systems 2022-04, Vol.8 (2), p.1205-1216
Main Authors:	Li, Weifan; Zhu, Yuanheng; Zhao, Dongbin
Format:	Article
Language:	English
Subjects:	Acceleration; Algorithms; Background noise; Cognitive tasks; Complexity; Computational Intelligence; Computer simulation; Data Structures and Information Theory; Deep learning; Engineering; Guidance systems; Interception; Machine learning; Maneuverability; Maneuvering targets; Missile control; Neural networks; Optimization; Original Article; Performance degradation; Target detection; Uncertainty

Description
Summary:	In missile guidance, pursuit performance is seriously degraded by uncertainty and randomness in target maneuverability, detection delay, and environmental noise. Many methods require an accurate estimate of the target’s acceleration or of the time-to-go to intercept a maneuvering target, and such estimates are hard to obtain in an uncertain environment. In this paper, we propose an assisted deep reinforcement learning (ARL) algorithm to optimize a neural network-based missile guidance controller for head-on interception. Based on the relative velocity, distance, and angle, ARL can control the missile to intercept the maneuvering target and achieve a large terminal intercept angle. To reduce the influence of environmental uncertainty, ARL predicts the target’s acceleration as an auxiliary supervised task. This supervised learning task improves the agent’s ability to extract information from observations. To exploit the agent’s good trajectories, ARL introduces Gaussian self-imitation learning, which moves the mean of the action distribution toward the agent’s good actions. Compared with vanilla self-imitation learning, Gaussian self-imitation learning improves exploration in continuous control. Simulation results show that ARL outperforms traditional methods and the proximal policy optimization algorithm, achieving a higher hit rate and a larger terminal intercept angle in a simulation environment with noise, delay, and a maneuverable target.
ISSN:	2199-4536, 2198-6053
DOI:	10.1007/s40747-021-00577-6
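
The two training signals named in the summary can be illustrated with a small, generic sketch. This is not the paper's formulation: the record gives no equations, so the squared-error form, the "return minus value estimate" weighting (borrowed from vanilla self-imitation learning), and all function and parameter names below are assumptions made only for illustration. The first function pulls the policy mean toward stored actions that turned out better than expected; the second is a plain mean-squared-error loss for an auxiliary prediction head.

```python
from typing import Sequence


def gaussian_sil_loss(
    means: Sequence[float],
    actions: Sequence[float],
    returns: Sequence[float],
    values: Sequence[float],
) -> float:
    """Hypothetical Gaussian self-imitation loss (assumed form).

    Penalizes the squared gap between the policy mean and a stored action,
    weighted by max(return - value, 0), so only actions that did better
    than the value estimate attract the mean.
    """
    if not means:
        raise ValueError("empty batch")
    total = 0.0
    for mu, a, ret, v in zip(means, actions, returns, values):
        weight = max(ret - v, 0.0)  # zero for actions that were not "good"
        total += weight * (mu - a) ** 2
    return total / len(means)


def auxiliary_mse_loss(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error for an auxiliary supervised prediction head."""
    if not predicted:
        raise ValueError("empty batch")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


if __name__ == "__main__":
    # Two stored samples: only the first has return > value, so only it counts.
    print(gaussian_sil_loss([0.0, 1.0], [1.0, 1.0], [2.0, 0.0], [1.0, 1.0]))
    print(auxiliary_mse_loss([1.0, 2.0], [0.0, 2.0]))
```

Because this loss acts only on the mean and leaves the variance of the action distribution untouched, it is at least consistent with the summary's claim that the Gaussian variant preserves exploration better than vanilla self-imitation learning. How the paper weights or combines these terms with the main policy objective is not stated in this record.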