Hardware Efficient Direct Policy Imitation Learning for Robotic Navigation in Resource-Constrained Settings
Direct policy learning (DPL) is a widely used approach in imitation learning for time-efficient and effective convergence when training mobile robots. However, using DPL in real-world applications is not sufficiently explored due to the inherent challenges of mobilizing direct human expertise and the difficulty of measuring comparative performance. Furthermore, autonomous systems are often resource-constrained, thereby limiting the potential application and implementation of highly effective deep learning models. In this work, we present a lightweight DPL-based approach to train mobile robots in navigational tasks. We integrated a safety policy alongside the navigational policy to safeguard the robot and the environment. The approach was evaluated in simulations and real-world settings and compared with recent work in this space. The results of these experiments and the efficient transfer from simulations to real-world settings demonstrate that our approach has improved performance compared to its hardware-intensive counterparts. We show that using the proposed methodology, the training agent achieves closer performance to the expert within the first 15 training iterations in simulation and real-world settings.
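The article itself publishes no code here, but the shape of the approach the abstract describes — a lightweight navigational policy trained by imitating an expert, composed with a separate safety policy that can override unsafe commands — can be sketched as follows. Everything in this sketch (the state layout, the linear model, the distance threshold, all function names) is an illustrative assumption, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def expert_policy(state):
    """Stand-in expert: steer toward the goal bearing encoded in the state."""
    goal_bearing, heading, _obstacle_dist = state
    return float(np.clip(goal_bearing - heading, -1.0, 1.0))

# Collect expert demonstrations; each state is [goal_bearing, heading, obstacle_dist].
states = rng.uniform(-1.0, 1.0, size=(500, 3))
states[:, 2] = rng.uniform(0.0, 1.0, size=500)  # distances are non-negative
actions = np.array([expert_policy(s) for s in states])

# "Lightweight" navigational policy: a plain least-squares linear model standing
# in for the hardware-efficient learner the abstract mentions.
W, *_ = np.linalg.lstsq(states, actions, rcond=None)

def safety_policy(state, action):
    """Override the learned action when an obstacle is too close (assumed rule)."""
    obstacle_dist = state[2]
    return 0.0 if obstacle_dist < 0.1 else action

def act(state):
    """Navigational policy with the safety policy layered on top."""
    steering = float(np.clip(state @ W, -1.0, 1.0))
    return safety_policy(state, steering)
```

In the paper, both policies run on a real resource-constrained robot; here the safety rule is a fixed distance threshold purely to show how an override would compose with the imitation-learned policy.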
Published in: | Sensors (Basel, Switzerland), 2024-01, Vol.24 (1), p.185 |
---|---|
Main Authors: | Sumanasena, Vidura; Fernando, Heshan; De Silva, Daswin; Thileepan, Beniel; Pasan, Amila; Samarawickrama, Jayathu; Osipov, Evgeny; Alahakoon, Damminda |
Format: | Article |
Language: | English |
Subjects: | Algorithms; autonomous navigation; Deep learning; direct policy learning; imitation learning; mobile robots; Neural networks; Robotics; Simulation |
Citations: | Items that this one cites |
Online Access: | Get full text |
cited_by | |
---|---|
cites | cdi_FETCH-LOGICAL-c474t-ca7443dfdfc6f2e311fc7501151b3d935a00b234af164f932efa600a4454e7f63 |
container_end_page | |
container_issue | 1 |
container_start_page | 185 |
container_title | Sensors (Basel, Switzerland) |
container_volume | 24 |
creator | Sumanasena, Vidura; Fernando, Heshan; De Silva, Daswin; Thileepan, Beniel; Pasan, Amila; Samarawickrama, Jayathu; Osipov, Evgeny; Alahakoon, Damminda |
description | Direct policy learning (DPL) is a widely used approach in imitation learning for time-efficient and effective convergence when training mobile robots. However, using DPL in real-world applications is not sufficiently explored due to the inherent challenges of mobilizing direct human expertise and the difficulty of measuring comparative performance. Furthermore, autonomous systems are often resource-constrained, thereby limiting the potential application and implementation of highly effective deep learning models. In this work, we present a lightweight DPL-based approach to train mobile robots in navigational tasks. We integrated a safety policy alongside the navigational policy to safeguard the robot and the environment. The approach was evaluated in simulations and real-world settings and compared with recent work in this space. The results of these experiments and the efficient transfer from simulations to real-world settings demonstrate that our approach has improved performance compared to its hardware-intensive counterparts. We show that using the proposed methodology, the training agent achieves closer performance to the expert within the first 15 training iterations in simulation and real-world settings. |
doi_str_mv | 10.3390/s24010185 |
format | article |
pmid | 38203047 |
publisher | Switzerland: MDPI AG |
rights | 2023 by the authors. Licensee MDPI, Basel, Switzerland. Open access under the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
orcidid | https://orcid.org/0009-0006-6891-3553; https://orcid.org/0000-0003-3878-5969 |
fulltext | fulltext |
identifier | ISSN: 1424-8220 |
ispartof | Sensors (Basel, Switzerland), 2024-01, Vol.24 (1), p.185 |
issn | 1424-8220 1424-8220 |
language | eng |
recordid | cdi_doaj_primary_oai_doaj_org_article_830b7d31c4ac4399be05fc3442e3df79 |
source | Open Access: PubMed Central; Publicly Available Content Database |
subjects | Algorithms; autonomous navigation; Behavior; Comparative analysis; Decision making; Deep learning; Dependable Communication and Computation Systems; direct policy learning; imitation learning; Kommunikations- och beräkningssystem; mobile robots; Neural networks; Robotics; Robots; Semantics; Simulation; Simulation methods; Vehicles |
title | Hardware Efficient Direct Policy Imitation Learning for Robotic Navigation in Resource-Constrained Settings |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T21%3A52%3A17IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Hardware%20Efficient%20Direct%20Policy%20Imitation%20Learning%20for%20Robotic%20Navigation%20in%20Resource-Constrained%20Settings&rft.jtitle=Sensors%20(Basel,%20Switzerland)&rft.au=Sumanasena,%20Vidura&rft.date=2024-01-01&rft.volume=24&rft.issue=1&rft.spage=185&rft.pages=185-&rft.issn=1424-8220&rft.eissn=1424-8220&rft_id=info:doi/10.3390/s24010185&rft_dat=%3Cgale_doaj_%3EA779351300%3C/gale_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c474t-ca7443dfdfc6f2e311fc7501151b3d935a00b234af164f932efa600a4454e7f63%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2912784900&rft_id=info:pmid/38203047&rft_galeid=A779351300&rfr_iscdi=true |