
Hamiltonian-Driven Adaptive Dynamic Programming With Approximation Errors

In this article, we consider an iterative adaptive dynamic programming (ADP) algorithm within the Hamiltonian-driven framework to solve the Hamilton-Jacobi-Bellman (HJB) equation for the infinite-horizon optimal control problem in continuous time for nonlinear systems. First, a novel function, "min-Hamiltonian," is defined to capture the fundamental properties of the classical Hamiltonian. It is shown that both the HJB equation and the policy iteration (PI) algorithm can be formulated in terms of the min-Hamiltonian within the Hamiltonian-driven framework. Moreover, we develop an iterative ADP algorithm that takes into consideration the approximation errors during the policy evaluation step. We then derive a sufficient condition on the iterative value gradient to guarantee closed-loop stability of the equilibrium point as well as convergence to the optimal value. A model-free extension based on an off-policy reinforcement learning (RL) technique is also provided. Finally, numerical results illustrate the efficacy of the proposed framework.
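For orientation, the objects the abstract names admit a standard statement; the notation below is assumed for illustration and is not quoted from the paper, whose exact definitions may differ.

```latex
% Standard continuous-time optimal control setup (assumed notation):
%   dynamics  \dot{x} = f(x) + g(x)\,u,   cost  J = \int_0^\infty r(x,u)\,dt.
% Classical Hamiltonian and its pointwise minimum over the control
% (the "min-Hamiltonian"):
\[
  H(x,u,\nabla V) = r(x,u) + \nabla V(x)^{\top}\bigl(f(x) + g(x)u\bigr),
  \qquad
  \mathcal{H}(x,\nabla V) = \min_{u} H(x,u,\nabla V).
\]
% The HJB equation then says the min-Hamiltonian of the optimal value
% function vanishes everywhere:
\[
  \mathcal{H}\bigl(x,\nabla V^{*}\bigr) = 0 .
\]
```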

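Since the abstract turns on policy iteration for the HJB equation, a minimal runnable sketch of that iteration in its linear-quadratic special case (Kleinman's algorithm) may help orient readers. This is a generic textbook illustration with exact policy evaluation, not the paper's Hamiltonian-driven ADP with approximation errors; all function and variable names here are hypothetical.

```python
# Sketch: continuous-time policy iteration for the LQR special case of
# the HJB equation (Kleinman's algorithm). Assumes a stabilizing K0.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def policy_iteration(A, B, Q, R, K0, tol=1e-9, max_iter=50):
    """Alternate policy evaluation and policy improvement from gain K0."""
    K = K0
    P_prev = None
    for _ in range(max_iter):
        Ac = A - B @ K
        # Policy evaluation: solve Ac' P + P Ac = -(Q + K' R K).
        P = solve_continuous_lyapunov(Ac.T, -(Q + K.T @ R @ K))
        # Policy improvement: minimize the Hamiltonian over u, K = R^{-1} B' P.
        K = np.linalg.solve(R, B.T @ P)
        if P_prev is not None and np.max(np.abs(P - P_prev)) < tol:
            break
        P_prev = P
    return P, K

# Example: double integrator with a stabilizing initial gain.
A = np.array([[0., 1.], [0., 0.]])
B = np.array([[0.], [1.]])
Q, R = np.eye(2), np.eye(1)
K0 = np.array([[1., 1.]])   # A - B K0 is Hurwitz
P, K = policy_iteration(A, B, Q, R, K0)
```

The paper's contribution, per the abstract, is precisely to relax the exact policy-evaluation step above: it bounds the approximation error at that step and gives a condition on the iterative value gradient under which stability and convergence are preserved.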

Bibliographic Details
Published in: IEEE Transactions on Cybernetics, 2022-12, Vol. 52 (12), pp. 13762-13773
Main Authors: Yang, Yongliang, Modares, Hamidreza, Vamvoudakis, Kyriakos G., He, Wei, Xu, Cheng-Zhong, Wunsch, Donald C.
Format: Article
Language: English
Subjects: Algorithms; Approximation; Approximation error; Costs; Dynamic programming; Errors; Hamilton-Jacobi-Bellman (HJB) equation; Hamiltonian-driven framework; inexact adaptive dynamic programming (ADP); Iterative algorithms; Iterative methods; Mathematical analysis; Nonlinear systems; Optimal control; Stability analysis
ISSN: 2168-2267
EISSN: 2168-2275
DOI: 10.1109/TCYB.2021.3108034
PMID: 34495864
CODEN: ITCEB8
Publisher: IEEE, Piscataway
Source: IEEE Electronic Library (IEL) Journals