Online and offline learning of player objectives from partial observations in dynamic games
Robots deployed to the real world must be able to interact with other agents in their environment. Dynamic game theory provides a powerful mathematical framework for modeling scenarios in which agents have individual objectives and interactions evolve over time. However, a key limitation of such tec...
Published in: | The International journal of robotics research 2023-09, Vol.42 (10), p.917-937 |
---|---|
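Main Authors: | Peters, Lasse ; Rubies-Royo, Vicenç ; Tomlin, Claire J ; Ferranti, Laura ; Alonso-Mora, Javier ; Stachniss, Cyrill ; Fridovich-Keil, David |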
Format: | Article |
Language: | English |
Subjects: | Collision avoidance ; Coupling ; Distance learning ; Estimates ; Game theory ; Players |
cited_by | cdi_FETCH-LOGICAL-c355t-49b95a402cff2ff11e559f28f60923dae4b4e93d551ca5aa46df40e2d37eaea53 |
---|---|
cites | cdi_FETCH-LOGICAL-c355t-49b95a402cff2ff11e559f28f60923dae4b4e93d551ca5aa46df40e2d37eaea53 |
container_end_page | 937 |
container_issue | 10 |
container_start_page | 917 |
container_title | The International journal of robotics research |
container_volume | 42 |
creator | Peters, Lasse ; Rubies-Royo, Vicenç ; Tomlin, Claire J ; Ferranti, Laura ; Alonso-Mora, Javier ; Stachniss, Cyrill ; Fridovich-Keil, David |
description | Robots deployed to the real world must be able to interact with other agents in their environment. Dynamic game theory provides a powerful mathematical framework for modeling scenarios in which agents have individual objectives and interactions evolve over time. However, a key limitation of such techniques is that they require a priori knowledge of all players’ objectives. In this work, we address this issue by proposing a novel method for learning players’ objectives in continuous dynamic games from noise-corrupted, partial state observations. Our approach learns objectives by coupling the estimation of unknown cost parameters of each player with inference of unobserved states and inputs through Nash equilibrium constraints. By coupling past state estimates with future state predictions, our approach is amenable to simultaneous online learning and prediction in receding horizon fashion. We demonstrate our method in several simulated traffic scenarios in which we recover players’ preferences for, e.g., desired travel speed and collision-avoidance behavior. Results show that our method reliably estimates game-theoretic models from noise-corrupted data that closely match ground-truth objectives, consistently outperforming state-of-the-art approaches. |
doi_str_mv | 10.1177/02783649231182453 |
format | article |
publisher | London, England: SAGE Publications |
eissn | 1741-3176 |
rights | The Author(s) 2023 |
peer_reviewed | true |
oa | free_for_read |
tpages | 21 |
orcid | https://orcid.org/0000-0001-9008-7127 |
sourcetype | Aggregation Database |
collection | SAGE Open Access ; CrossRef ; Computer and Information Systems Abstracts ; Electronics & Communications Abstracts ; Mechanical & Transportation Engineering Abstracts ; Technology Research Database ; Engineering Research Database ; ProQuest Computer Science Collection ; Advanced Technologies Database with Aerospace ; Computer and Information Systems Abstracts Academic ; Computer and Information Systems Abstracts Professional |
fulltext | fulltext |
identifier | ISSN: 0278-3649 |
ispartof | The International journal of robotics research, 2023-09, Vol.42 (10), p.917-937 |
issn | 0278-3649 (ISSN) ; 1741-3176 (EISSN) |
language | eng |
recordid | cdi_proquest_journals_2883252607 |
source | Sage Journals Online |
subjects | Collision avoidance ; Coupling ; Distance learning ; Estimates ; Game theory ; Players |
title | Online and offline learning of player objectives from partial observations in dynamic games |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-22T16%3A56%3A08IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Online%20and%20offline%20learning%20of%20player%20objectives%20from%20partial%20observations%20in%20dynamic%20games&rft.jtitle=The%20International%20journal%20of%20robotics%20research&rft.au=Peters,%20Lasse&rft.date=2023-09&rft.volume=42&rft.issue=10&rft.spage=917&rft.epage=937&rft.pages=917-937&rft.issn=0278-3649&rft.eissn=1741-3176&rft_id=info:doi/10.1177/02783649231182453&rft_dat=%3Cproquest_cross%3E2883252607%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c355t-49b95a402cff2ff11e559f28f60923dae4b4e93d551ca5aa46df40e2d37eaea53%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2883252607&rft_id=info:pmid/&rft_sage_id=10.1177_02783649231182453&rfr_iscdi=true |