Online and offline learning of player objectives from partial observations in dynamic games

Robots deployed to the real world must be able to interact with other agents in their environment. Dynamic game theory provides a powerful mathematical framework for modeling scenarios in which agents have individual objectives and interactions evolve over time. However, a key limitation of such tec...

Bibliographic Details
Published in: The International journal of robotics research 2023-09, Vol.42 (10), p.917-937
Main Authors: Peters, Lasse, Rubies-Royo, Vicenç, Tomlin, Claire J, Ferranti, Laura, Alonso-Mora, Javier, Stachniss, Cyrill, Fridovich-Keil, David
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c355t-49b95a402cff2ff11e559f28f60923dae4b4e93d551ca5aa46df40e2d37eaea53
cites cdi_FETCH-LOGICAL-c355t-49b95a402cff2ff11e559f28f60923dae4b4e93d551ca5aa46df40e2d37eaea53
container_end_page 937
container_issue 10
container_start_page 917
container_title The International journal of robotics research
container_volume 42
creator Peters, Lasse
Rubies-Royo, Vicenç
Tomlin, Claire J
Ferranti, Laura
Alonso-Mora, Javier
Stachniss, Cyrill
Fridovich-Keil, David
description Robots deployed to the real world must be able to interact with other agents in their environment. Dynamic game theory provides a powerful mathematical framework for modeling scenarios in which agents have individual objectives and interactions evolve over time. However, a key limitation of such techniques is that they require a priori knowledge of all players’ objectives. In this work, we address this issue by proposing a novel method for learning players’ objectives in continuous dynamic games from noise-corrupted, partial state observations. Our approach learns objectives by coupling the estimation of unknown cost parameters of each player with inference of unobserved states and inputs through Nash equilibrium constraints. By coupling past state estimates with future state predictions, our approach is amenable to simultaneous online learning and prediction in receding horizon fashion. We demonstrate our method in several simulated traffic scenarios in which we recover players’ preferences for, e.g., desired travel speed and collision-avoidance behavior. Results show that our method reliably estimates game-theoretic models from noise-corrupted data that closely matches ground-truth objectives, consistently outperforming state-of-the-art approaches.
doi_str_mv 10.1177/02783649231182453
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2883252607</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sage_id>10.1177_02783649231182453</sage_id><sourcerecordid>2883252607</sourcerecordid><originalsourceid>FETCH-LOGICAL-c355t-49b95a402cff2ff11e559f28f60923dae4b4e93d551ca5aa46df40e2d37eaea53</originalsourceid><addsrcrecordid>eNp1UEtLw0AQXkTBWv0B3hY8p-4zmxyl-IJCL3ryEKbJbNmS7MbdtJB_b2oFD-Jphm--x_ARcsvZgnNj7pkwhcxVKSTnhVBanpEZN4pnkpv8nMyO9-xIuCRXKe0YYzJn5Yx8rH3rPFLwDQ3Wfu8tQvTObyeA9i2MGGnY7LAe3AETtTF0tIc4OGgnPGE8wOCCT9R52oweOlfTLXSYrsmFhTbhzc-ck_enx7flS7ZaP78uH1ZZLbUeMlVuSg2KidpaYS3nqHVpRWGnB4VsANVGYSkbrXkNGkDljVUMRSMNAoKWc3J38u1j-NxjGqpd2Ec_RVaiKKTQImdmYvETq44hpYi26qPrII4VZ9Wxw-pPh5NmcdIk2OKv6_-CL1XzckM</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2883252607</pqid></control><display><type>article</type><title>Online and offline learning of player objectives from partial observations in dynamic games</title><source>Sage Journals Online</source><creator>Peters, Lasse ; Rubies-Royo, Vicenç ; Tomlin, Claire J ; Ferranti, Laura ; Alonso-Mora, Javier ; Stachniss, Cyrill ; Fridovich-Keil, David</creator><creatorcontrib>Peters, Lasse ; Rubies-Royo, Vicenç ; Tomlin, Claire J ; Ferranti, Laura ; Alonso-Mora, Javier ; Stachniss, Cyrill ; Fridovich-Keil, David</creatorcontrib><description>Robots deployed to the real world must be able to interact with other agents in their environment. Dynamic game theory provides a powerful mathematical framework for modeling scenarios in which agents have individual objectives and interactions evolve over time. However, a key limitation of such techniques is that they require a priori knowledge of all players’ objectives. In this work, we address this issue by proposing a novel method for learning players’ objectives in continuous dynamic games from noise-corrupted, partial state observations. 
Our approach learns objectives by coupling the estimation of unknown cost parameters of each player with inference of unobserved states and inputs through Nash equilibrium constraints. By coupling past state estimates with future state predictions, our approach is amenable to simultaneous online learning and prediction in receding horizon fashion. We demonstrate our method in several simulated traffic scenarios in which we recover players’ preferences, for, e.g. desired travel speed and collision-avoidance behavior. Results show that our method reliably estimates game-theoretic models from noise-corrupted data that closely matches ground-truth objectives, consistently outperforming state-of-the-art approaches.</description><identifier>ISSN: 0278-3649</identifier><identifier>EISSN: 1741-3176</identifier><identifier>DOI: 10.1177/02783649231182453</identifier><language>eng</language><publisher>London, England: SAGE Publications</publisher><subject>Collision avoidance ; Coupling ; Distance learning ; Estimates ; Game theory ; Players</subject><ispartof>The International journal of robotics research, 2023-09, Vol.42 (10), p.917-937</ispartof><rights>The Author(s) 2023</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c355t-49b95a402cff2ff11e559f28f60923dae4b4e93d551ca5aa46df40e2d37eaea53</citedby><cites>FETCH-LOGICAL-c355t-49b95a402cff2ff11e559f28f60923dae4b4e93d551ca5aa46df40e2d37eaea53</cites><orcidid>0000-0001-9008-7127</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27903,27904,79111</link.rule.ids></links><search><creatorcontrib>Peters, Lasse</creatorcontrib><creatorcontrib>Rubies-Royo, Vicenç</creatorcontrib><creatorcontrib>Tomlin, Claire J</creatorcontrib><creatorcontrib>Ferranti, 
Laura</creatorcontrib><creatorcontrib>Alonso-Mora, Javier</creatorcontrib><creatorcontrib>Stachniss, Cyrill</creatorcontrib><creatorcontrib>Fridovich-Keil, David</creatorcontrib><title>Online and offline learning of player objectives from partial observations in dynamic games</title><title>The International journal of robotics research</title><description>Robots deployed to the real world must be able to interact with other agents in their environment. Dynamic game theory provides a powerful mathematical framework for modeling scenarios in which agents have individual objectives and interactions evolve over time. However, a key limitation of such techniques is that they require a priori knowledge of all players’ objectives. In this work, we address this issue by proposing a novel method for learning players’ objectives in continuous dynamic games from noise-corrupted, partial state observations. Our approach learns objectives by coupling the estimation of unknown cost parameters of each player with inference of unobserved states and inputs through Nash equilibrium constraints. By coupling past state estimates with future state predictions, our approach is amenable to simultaneous online learning and prediction in receding horizon fashion. We demonstrate our method in several simulated traffic scenarios in which we recover players’ preferences, for, e.g. desired travel speed and collision-avoidance behavior. 
Results show that our method reliably estimates game-theoretic models from noise-corrupted data that closely matches ground-truth objectives, consistently outperforming state-of-the-art approaches.</description><subject>Collision avoidance</subject><subject>Coupling</subject><subject>Distance learning</subject><subject>Estimates</subject><subject>Game theory</subject><subject>Players</subject><issn>0278-3649</issn><issn>1741-3176</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>AFRWT</sourceid><recordid>eNp1UEtLw0AQXkTBWv0B3hY8p-4zmxyl-IJCL3ryEKbJbNmS7MbdtJB_b2oFD-Jphm--x_ARcsvZgnNj7pkwhcxVKSTnhVBanpEZN4pnkpv8nMyO9-xIuCRXKe0YYzJn5Yx8rH3rPFLwDQ3Wfu8tQvTObyeA9i2MGGnY7LAe3AETtTF0tIc4OGgnPGE8wOCCT9R52oweOlfTLXSYrsmFhTbhzc-ck_enx7flS7ZaP78uH1ZZLbUeMlVuSg2KidpaYS3nqHVpRWGnB4VsANVGYSkbrXkNGkDljVUMRSMNAoKWc3J38u1j-NxjGqpd2Ec_RVaiKKTQImdmYvETq44hpYi26qPrII4VZ9Wxw-pPh5NmcdIk2OKv6_-CL1XzckM</recordid><startdate>202309</startdate><enddate>202309</enddate><creator>Peters, Lasse</creator><creator>Rubies-Royo, Vicenç</creator><creator>Tomlin, Claire J</creator><creator>Ferranti, Laura</creator><creator>Alonso-Mora, Javier</creator><creator>Stachniss, Cyrill</creator><creator>Fridovich-Keil, David</creator><general>SAGE Publications</general><general>SAGE PUBLICATIONS, INC</general><scope>AFRWT</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>7TB</scope><scope>8FD</scope><scope>FR3</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0001-9008-7127</orcidid></search><sort><creationdate>202309</creationdate><title>Online and offline learning of player objectives from partial observations in dynamic games</title><author>Peters, Lasse ; Rubies-Royo, Vicenç ; Tomlin, Claire J ; Ferranti, Laura ; Alonso-Mora, Javier ; Stachniss, Cyrill ; Fridovich-Keil, 
David</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c355t-49b95a402cff2ff11e559f28f60923dae4b4e93d551ca5aa46df40e2d37eaea53</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Collision avoidance</topic><topic>Coupling</topic><topic>Distance learning</topic><topic>Estimates</topic><topic>Game theory</topic><topic>Players</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Peters, Lasse</creatorcontrib><creatorcontrib>Rubies-Royo, Vicenç</creatorcontrib><creatorcontrib>Tomlin, Claire J</creatorcontrib><creatorcontrib>Ferranti, Laura</creatorcontrib><creatorcontrib>Alonso-Mora, Javier</creatorcontrib><creatorcontrib>Stachniss, Cyrill</creatorcontrib><creatorcontrib>Fridovich-Keil, David</creatorcontrib><collection>SAGE Open Access</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Mechanical &amp; Transportation Engineering Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>The International journal of robotics research</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Peters, Lasse</au><au>Rubies-Royo, Vicenç</au><au>Tomlin, Claire J</au><au>Ferranti, Laura</au><au>Alonso-Mora, Javier</au><au>Stachniss, Cyrill</au><au>Fridovich-Keil, David</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Online and 
offline learning of player objectives from partial observations in dynamic games</atitle><jtitle>The International journal of robotics research</jtitle><date>2023-09</date><risdate>2023</risdate><volume>42</volume><issue>10</issue><spage>917</spage><epage>937</epage><pages>917-937</pages><issn>0278-3649</issn><eissn>1741-3176</eissn><abstract>Robots deployed to the real world must be able to interact with other agents in their environment. Dynamic game theory provides a powerful mathematical framework for modeling scenarios in which agents have individual objectives and interactions evolve over time. However, a key limitation of such techniques is that they require a priori knowledge of all players’ objectives. In this work, we address this issue by proposing a novel method for learning players’ objectives in continuous dynamic games from noise-corrupted, partial state observations. Our approach learns objectives by coupling the estimation of unknown cost parameters of each player with inference of unobserved states and inputs through Nash equilibrium constraints. By coupling past state estimates with future state predictions, our approach is amenable to simultaneous online learning and prediction in receding horizon fashion. We demonstrate our method in several simulated traffic scenarios in which we recover players’ preferences, for, e.g. desired travel speed and collision-avoidance behavior. Results show that our method reliably estimates game-theoretic models from noise-corrupted data that closely matches ground-truth objectives, consistently outperforming state-of-the-art approaches.</abstract><cop>London, England</cop><pub>SAGE Publications</pub><doi>10.1177/02783649231182453</doi><tpages>21</tpages><orcidid>https://orcid.org/0000-0001-9008-7127</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0278-3649
ispartof The International journal of robotics research, 2023-09, Vol.42 (10), p.917-937
issn 0278-3649
1741-3176
language eng
recordid cdi_proquest_journals_2883252607
source Sage Journals Online
subjects Collision avoidance
Coupling
Distance learning
Estimates
Game theory
Players
title Online and offline learning of player objectives from partial observations in dynamic games
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-22T16%3A56%3A08IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Online%20and%20offline%20learning%20of%20player%20objectives%20from%20partial%20observations%20in%20dynamic%20games&rft.jtitle=The%20International%20journal%20of%20robotics%20research&rft.au=Peters,%20Lasse&rft.date=2023-09&rft.volume=42&rft.issue=10&rft.spage=917&rft.epage=937&rft.pages=917-937&rft.issn=0278-3649&rft.eissn=1741-3176&rft_id=info:doi/10.1177/02783649231182453&rft_dat=%3Cproquest_cross%3E2883252607%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c355t-49b95a402cff2ff11e559f28f60923dae4b4e93d551ca5aa46df40e2d37eaea53%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2883252607&rft_id=info:pmid/&rft_sage_id=10.1177_02783649231182453&rfr_iscdi=true