Uncertainty Aware System Identification with Universal Policies

Sim2real transfer is primarily concerned with transferring policies trained in simulation to potentially noisy real-world environments. A common problem in sim2real transfer is estimating the real-world environmental parameters to which the simulated environment should be grounded. Although existing m...


Bibliographic Details
Published in: arXiv.org 2022-02
Main Authors: Semage, Buddhika Laknath; Thommen, George Karimpanal; Rana, Santu; Venkatesh, Svetha
Format: Article
Language: English
container_title arXiv.org
creator Semage, Buddhika Laknath
Thommen, George Karimpanal
Rana, Santu
Venkatesh, Svetha
description Sim2real transfer is primarily concerned with transferring policies trained in simulation to potentially noisy real-world environments. A common problem in sim2real transfer is estimating the real-world environmental parameters to which the simulated environment should be grounded. Although existing methods such as Domain Randomisation (DR) can produce robust policies by sampling from a distribution of parameters during training, there is no established method for identifying the parameters of the corresponding distribution for a given real-world setting. In this work, we propose Uncertainty-aware policy search (UncAPS), where we use a Universal Policy Network (UPN) to store simulation-trained task-specific policies across the full range of environmental parameters and then employ robust Bayesian optimisation to craft robust policies for the given environment by combining relevant UPN policies in a DR-like fashion. Such policy-driven grounding is expected to be more efficient as it estimates only task-relevant sets of parameters. Further, we account for the estimation uncertainties in the search process to produce policies that are robust against both aleatoric and epistemic uncertainties. We empirically evaluate our approach in a range of noisy, continuous control environments and show its improved performance compared to competing baselines.
format article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2628909324</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2628909324</sourcerecordid><originalsourceid>FETCH-proquest_journals_26289093243</originalsourceid><addsrcrecordid>eNqNyrEKwjAUQNEgCBbtPwScC_Glre0kIopugnYuIb7iKzXRJLX073XwA5zucO6ERSDlKilSgBmLvW-FEJCvIctkxDaV0eiCIhNGvh2UQ34ZfcAHP93QBGpIq0DW8IHCnVeG3ui86vjZdqQJ_YJNG9V5jH-ds-Vhf90dk6ezrx59qFvbO_OlGnIoSlFKSOV_1wc9zjlA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2628909324</pqid></control><display><type>article</type><title>Uncertainty Aware System Identification with Universal Policies</title><source>Publicly Available Content (ProQuest)</source><creator>Semage, Buddhika Laknath ; Thommen, George Karimpanal ; Rana, Santu ; Venkatesh, Svetha</creator><creatorcontrib>Semage, Buddhika Laknath ; Thommen, George Karimpanal ; Rana, Santu ; Venkatesh, Svetha</creatorcontrib><description>Sim2real transfer is primarily concerned with transferring policies trained in simulation to potentially noisy real world environments. A common problem associated with sim2real transfer is estimating the real-world environmental parameters to ground the simulated environment to. Although existing methods such as Domain Randomisation (DR) can produce robust policies by sampling from a distribution of parameters during training, there is no established method for identifying the parameters of the corresponding distribution for a given real-world setting. In this work, we propose Uncertainty-aware policy search (UncAPS), where we use Universal Policy Network (UPN) to store simulation-trained task-specific policies across the full range of environmental parameters and then subsequently employ robust Bayesian optimisation to craft robust policies for the given environment by combining relevant UPN policies in a DR like fashion. 
Such policy-driven grounding is expected to be more efficient as it estimates only task-relevant sets of parameters. Further, we also account for the estimation uncertainties in the search process to produce policies that are robust against both aleatoric and epistemic uncertainties. We empirically evaluate our approach in a range of noisy, continuous control environments, and show its improved performance compared to competing baselines.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Estimation ; Optimization ; Parameter identification ; Policies ; Robustness ; Search process ; Simulation ; System identification ; Uncertainty</subject><ispartof>arXiv.org, 2022-02</ispartof><rights>2022. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.proquest.com/docview/2628909324?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>776,780,25731,36989,44566</link.rule.ids></links><search><creatorcontrib>Semage, Buddhika Laknath</creatorcontrib><creatorcontrib>Thommen, George Karimpanal</creatorcontrib><creatorcontrib>Rana, Santu</creatorcontrib><creatorcontrib>Venkatesh, Svetha</creatorcontrib><title>Uncertainty Aware System Identification with Universal Policies</title><title>arXiv.org</title><description>Sim2real transfer is primarily concerned with transferring policies trained in simulation to potentially noisy real world environments. 
A common problem associated with sim2real transfer is estimating the real-world environmental parameters to ground the simulated environment to. Although existing methods such as Domain Randomisation (DR) can produce robust policies by sampling from a distribution of parameters during training, there is no established method for identifying the parameters of the corresponding distribution for a given real-world setting. In this work, we propose Uncertainty-aware policy search (UncAPS), where we use Universal Policy Network (UPN) to store simulation-trained task-specific policies across the full range of environmental parameters and then subsequently employ robust Bayesian optimisation to craft robust policies for the given environment by combining relevant UPN policies in a DR like fashion. Such policy-driven grounding is expected to be more efficient as it estimates only task-relevant sets of parameters. Further, we also account for the estimation uncertainties in the search process to produce policies that are robust against both aleatoric and epistemic uncertainties. 
We empirically evaluate our approach in a range of noisy, continuous control environments, and show its improved performance compared to competing baselines.</description><subject>Estimation</subject><subject>Optimization</subject><subject>Parameter identification</subject><subject>Policies</subject><subject>Robustness</subject><subject>Search process</subject><subject>Simulation</subject><subject>System identification</subject><subject>Uncertainty</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><recordid>eNqNyrEKwjAUQNEgCBbtPwScC_Glre0kIopugnYuIb7iKzXRJLX073XwA5zucO6ERSDlKilSgBmLvW-FEJCvIctkxDaV0eiCIhNGvh2UQ34ZfcAHP93QBGpIq0DW8IHCnVeG3ui86vjZdqQJ_YJNG9V5jH-ds-Vhf90dk6ezrx59qFvbO_OlGnIoSlFKSOV_1wc9zjlA</recordid><startdate>20220211</startdate><enddate>20220211</enddate><creator>Semage, Buddhika Laknath</creator><creator>Thommen, George Karimpanal</creator><creator>Rana, Santu</creator><creator>Venkatesh, Svetha</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PHGZM</scope><scope>PHGZT</scope><scope>PIMPY</scope><scope>PKEHL</scope><scope>PQEST</scope><scope>PQGLB</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20220211</creationdate><title>Uncertainty Aware System Identification with Universal Policies</title><author>Semage, Buddhika Laknath ; Thommen, George Karimpanal ; Rana, Santu ; Venkatesh, 
Svetha</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_26289093243</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Estimation</topic><topic>Optimization</topic><topic>Parameter identification</topic><topic>Policies</topic><topic>Robustness</topic><topic>Search process</topic><topic>Simulation</topic><topic>System identification</topic><topic>Uncertainty</topic><toplevel>online_resources</toplevel><creatorcontrib>Semage, Buddhika Laknath</creatorcontrib><creatorcontrib>Thommen, George Karimpanal</creatorcontrib><creatorcontrib>Rana, Santu</creatorcontrib><creatorcontrib>Venkatesh, Svetha</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>ProQuest Central (New)</collection><collection>ProQuest One Academic (New)</collection><collection>Publicly Available Content (ProQuest)</collection><collection>ProQuest One Academic Middle East (New)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Applied &amp; Life Sciences</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering 
Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Semage, Buddhika Laknath</au><au>Thommen, George Karimpanal</au><au>Rana, Santu</au><au>Venkatesh, Svetha</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>Uncertainty Aware System Identification with Universal Policies</atitle><jtitle>arXiv.org</jtitle><date>2022-02-11</date><risdate>2022</risdate><eissn>2331-8422</eissn><abstract>Sim2real transfer is primarily concerned with transferring policies trained in simulation to potentially noisy real world environments. A common problem associated with sim2real transfer is estimating the real-world environmental parameters to ground the simulated environment to. Although existing methods such as Domain Randomisation (DR) can produce robust policies by sampling from a distribution of parameters during training, there is no established method for identifying the parameters of the corresponding distribution for a given real-world setting. In this work, we propose Uncertainty-aware policy search (UncAPS), where we use Universal Policy Network (UPN) to store simulation-trained task-specific policies across the full range of environmental parameters and then subsequently employ robust Bayesian optimisation to craft robust policies for the given environment by combining relevant UPN policies in a DR like fashion. Such policy-driven grounding is expected to be more efficient as it estimates only task-relevant sets of parameters. Further, we also account for the estimation uncertainties in the search process to produce policies that are robust against both aleatoric and epistemic uncertainties. 
We empirically evaluate our approach in a range of noisy, continuous control environments, and show its improved performance compared to competing baselines.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2022-02
issn 2331-8422
language eng
recordid cdi_proquest_journals_2628909324
source Publicly Available Content (ProQuest)
subjects Estimation
Optimization
Parameter identification
Policies
Robustness
Search process
Simulation
System identification
Uncertainty
title Uncertainty Aware System Identification with Universal Policies
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-23T18%3A05%3A02IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Uncertainty%20Aware%20System%20Identification%20with%20Universal%20Policies&rft.jtitle=arXiv.org&rft.au=Semage,%20Buddhika%20Laknath&rft.date=2022-02-11&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2628909324%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_26289093243%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2628909324&rft_id=info:pmid/&rfr_iscdi=true