Learning by Watching via Keypoint Extraction and Imitation Learning
Published in: Machines (Basel) 2022-11, Vol. 10 (11), p. 1049
Main Authors: , , ,
Format: Article
Language: English
Summary: In recent years, the use of reinforcement learning and imitation learning to complete robot control tasks has become increasingly popular. Learning from expert demonstrations has long been a goal of researchers, but the lack of action data has been a significant limitation on learning from human demonstration. We propose an architecture that combines a new 3D keypoint tracking model with generative adversarial imitation learning to learn from expert demonstrations. 3D keypoint tracking compensates for the action data missing from plain images, and image-to-image translation converts human hand demonstrations into robot images, which allows the subsequent generative adversarial imitation learning to proceed smoothly. The estimation time of the 3D keypoint tracking model plus the computation time of the subsequent optimization algorithm was 30 ms, and under correct detection the coordinate errors between the model's projections and the real 3D keypoints were all within 1.8 cm. Keypoint tracking required no sensors on the body, and the operator needed no vision-related expertise to calibrate the camera. With only a generic depth camera tracking the changes in keypoint positions, the robot could, after behavior cloning training, learn human tasks by watching, including picking and placing an object and pouring water. We built an experimental environment in PyBullet to validate the concept with the simplest behavioral cloning imitation and to verify that the learning succeeded. The proposed method achieved satisfactory performance with high sample efficiency: 20 demonstration sets for pick-and-place and 30 sets for pouring water.
ISSN: 2075-1702
DOI: 10.3390/machines10111049
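To make the pipeline in the summary concrete, here is a minimal Python sketch of its two core ideas: recovering 3D keypoints from a generic depth camera by pinhole back-projection, and behavior cloning on the resulting keypoint trajectories. This is an illustration only, not the authors' implementation; the keypoint count (21), action dimension (7), network size, and camera intrinsics are all assumptions not taken from the paper.

```python
# Illustrative sketch only -- not the authors' code. Assumed: 21 tracked
# hand keypoints observed by a depth camera, and a 7-DoF robot action.
import numpy as np
import torch
import torch.nn as nn

def backproject(u, v, z, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) with depth z to camera XYZ."""
    return np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])

K, ACT_DIM = 21, 7              # assumed keypoint count and action dimension
policy = nn.Sequential(          # small MLP mapping keypoints to actions
    nn.Linear(3 * K, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, ACT_DIM),
)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

# Stand-in for the 20-30 demonstration sets mentioned in the abstract:
# flattened 3D keypoint observations paired with demonstrated actions.
obs = torch.randn(512, 3 * K)
act = torch.randn(512, ACT_DIM)

for _ in range(100):
    loss = nn.functional.mse_loss(policy(obs), act)  # clone expert actions
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In the paper's full architecture this supervised loss would be replaced or augmented by a generative adversarial imitation learning objective, with human-hand frames first translated into robot images; the behavior cloning loop above corresponds only to the "simplest behavioral cloning imitation" baseline the abstract describes.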