Hamiltonian-Driven Adaptive Dynamic Programming With Efficient Experience Replay

Bibliographic Details
Published in:	IEEE Transactions on Neural Networks and Learning Systems, 2024-03, Vol. 35 (3), p. 1-13
Main Authors:	Yang, Yongliang; Pan, Yongping; Xu, Cheng-Zhong; Wunsch, Donald C.
Format:	Article
Language:	English
Subjects:	Adaptive control; Closed loops; Convergence; Dynamic programming; Dynamical systems; Feedback control; Hamiltonian-driven adaptive dynamic programming (ADP); Hamilton–Jacobi–Bellman (HJB) equation; Iterative algorithms; Learning systems; Mathematical models; Nonlinear systems; Optimal control; Optimization; pseudo-Hamiltonian; quasi-Hamiltonian; relaxed excitation condition; Theoretical analysis

Description
Summary:	This article presents a novel and efficient experience-replay-based adaptive dynamic programming (ADP) method for the optimal control of a class of nonlinear dynamical systems within the Hamiltonian-driven framework. A quasi-Hamiltonian is introduced for the policy evaluation problem under an admissible policy. Using the quasi-Hamiltonian, a novel composite critic learning mechanism is developed that combines instantaneous data with historical data. In addition, a pseudo-Hamiltonian is defined to address the performance optimization problem. Based on the pseudo-Hamiltonian, the conventional Hamilton–Jacobi–Bellman (HJB) equation can be represented in a filtered form that can be implemented online. The theoretical analysis establishes the convergence of the adaptive critic design and the stability of the closed-loop system, with parameter convergence achieved under a weakened excitation condition. Simulation studies verify the efficacy of the presented design scheme.
ISSN:	2162-237X 2162-2388
DOI:	10.1109/TNNLS.2022.3213566