
More Than a Feeling: Learning to Grasp and Regrasp Using Vision and Touch

For humans, the process of grasping an object relies heavily on rich tactile feedback. Most recent robotic grasping work, however, has been based only on visual input, and thus cannot easily benefit from feedback after initiating contact. In this letter, we investigate how a robot can learn to use tactile information to iteratively and efficiently adjust its grasp. To this end, we propose an end-to-end action-conditional model that learns regrasping policies from raw visuo-tactile data. This model - a deep, multimodal convolutional network - predicts the outcome of a candidate grasp adjustment, and then executes a grasp by iteratively selecting the most promising actions. Our approach requires neither calibration of the tactile sensors nor any analytical modeling of contact forces, thus reducing the engineering effort required to obtain efficient grasping policies. We train our model with data from about 6450 grasping trials on a two-finger gripper equipped with GelSight high-resolution tactile sensors on each finger. Across extensive experiments, our approach outperforms a variety of baselines at 1) estimating grasp adjustment outcomes, 2) selecting efficient grasp adjustments for quick grasping, and 3) reducing the amount of force applied at the fingers, while maintaining competitive performance. Finally, we study the choices made by our model and show that it has successfully acquired useful and interpretable grasping behaviors.


Bibliographic Details
Published in: IEEE Robotics and Automation Letters, 2018-10, Vol. 3 (4), p. 3300-3307
Main Authors: Calandra, Roberto; Owens, Andrew; Jayaraman, Dinesh; Lin, Justin; Yuan, Wenzhen; Malik, Jitendra; Adelson, Edward H.; Levine, Sergey
Format: Article
Language:English
Subjects: Analytical models; Contact force; Deep learning in robotics and automation; Fingers; Force; Force and tactile sensing; Grasping; Grasping (robotics); Perception for grasping and manipulation; Policies; Sensory feedback; Tactile sensors; Tactile sensors (robotics)
DOI: 10.1109/LRA.2018.2852779
Publisher: IEEE (Piscataway)
ISSN: 2377-3766
Source: IEEE Electronic Library (IEL) Journals
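
The abstract above describes the regrasping loop only at a high level: a deep multimodal convolutional network scores candidate grasp adjustments from raw visual and tactile images, and the gripper repeatedly executes the most promising adjustment. The Python/PyTorch sketch below is a rough illustration of that idea, not the authors' implementation; the class names, network architecture, action dimensionality, and candidate-sampling scheme are all assumptions made for illustration.

# Illustrative sketch only: hypothetical names and architecture, not the paper's code.
import torch
import torch.nn as nn


class GraspOutcomePredictor(nn.Module):
    """Scores P(grasp success) for a candidate adjustment given vision and touch."""

    def __init__(self, action_dim: int = 4):
        super().__init__()

        def make_encoder() -> nn.Sequential:
            # Small convolutional encoder applied to each image modality:
            # the RGB camera image and one GelSight tactile image per finger.
            return nn.Sequential(
                nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
                nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )

        self.rgb_encoder = make_encoder()
        self.tactile_encoder = make_encoder()
        # Fuse image features with the candidate action (e.g. a small gripper
        # translation/rotation) and predict a grasp-success logit.
        self.head = nn.Sequential(
            nn.Linear(3 * 32 + action_dim, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, rgb, touch_left, touch_right, action):
        features = torch.cat(
            [
                self.rgb_encoder(rgb),
                self.tactile_encoder(touch_left),
                self.tactile_encoder(touch_right),
                action,
            ],
            dim=-1,
        )
        return torch.sigmoid(self.head(features))  # predicted success probability


def select_adjustment(model, rgb, touch_left, touch_right, num_candidates=64):
    """Sample candidate adjustments, score them, and return the most promising one.

    rgb, touch_left, touch_right are single observations shaped [1, 3, H, W].
    """
    actions = 0.01 * torch.randn(num_candidates, 4)  # small random adjustments
    with torch.no_grad():
        scores = model(
            rgb.expand(num_candidates, -1, -1, -1),
            touch_left.expand(num_candidates, -1, -1, -1),
            touch_right.expand(num_candidates, -1, -1, -1),
            actions,
        ).squeeze(-1)
    best = torch.argmax(scores)
    return actions[best], scores[best].item()

In use, a model of this kind would sit inside a closed loop: capture the current camera and GelSight images, score sampled candidate adjustments, apply the best one, and stop once the predicted success probability is high enough to attempt the lift.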