Stochastic Curiosity Maximizing Exploration
Main Authors: | , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
Summary: | Deep reinforcement learning (RL) is an emerging research trend in machine learning for autonomous systems. In real-world scenarios, the extrinsic rewards that the agent acquires from the environment during learning are usually missing or extremely sparse. Sparse rewards constrain the learning capability of the agent because the agent only updates its policy when the goal state is successfully attained. Implementing efficient exploration in RL algorithms is therefore always challenging. To tackle sparse rewards and inefficient exploration, the agent needs other helpful information to update its policy even when there is no interaction with the environment. This paper proposes stochastic curiosity maximizing exploration (SCME), a learning strategy intended to allow the agent to act as a human would. We cope with the sparse reward problem by encouraging the agent to explore future diversity. To do so, a latent dynamic system is developed to acquire the latent states and latent actions used to predict the variations in future conditions. The mutual information and the prediction error in the predicted states and actions are calculated as the intrinsic rewards. The SCME-based agent is then trained by maximizing these rewards to improve sample efficiency for exploration. Experiments on PyDial and Super Mario Bros show the benefits of the proposed SCME in a dialogue system and a computer game, respectively. |
---|---|
ISSN: | 2161-4407 |
DOI: | 10.1109/IJCNN48605.2020.9207295 |