
Acquisition of deterministic exploration and purposive memory through reinforcement learning with a recurrent neural network

Bibliographic Details
Main Authors: Goto, Kenta; Shibata, Katsunari
Format: Conference Proceeding
Language: English
Description
Summary: The authors have proposed that various functions emerge purposively and harmoniously through reinforcement learning with a neural network. This paper focuses on the emergence of deterministic "exploration" behavior, which differs from stochastic exploration and requires higher intelligence. To realize such intelligent exploration, the key question is whether the recurrent neural network memorizes the necessary information and uses it to generate appropriate actions. In a simulation of a 3 × 3 grid-world task with an invisible goal, introducing a recurrent neural network into Q-learning allows the agent to represent more accurate Q-values that take past experience into account, which is suggested to enable the learning of appropriate actions. The acquired knowledge also generalizes, to some extent, to unknown environments. In another task, a simple environment with a randomly located branch, it is shown that the recurrent neural network cleverly memorizes and retains the branch position so that it can represent accurate Q-values after learning.
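The record contains no code; the following is a minimal sketch of recurrent Q-learning on a 3 × 3 grid with an invisible goal, written with a PyTorch Elman-style RNN (nn.RNNCell). The environment layout, network sizes, reward values, and hyperparameters are illustrative assumptions rather than the authors' setup; ε-greedy is used only as a simple training-time action selector, and the gradient is truncated to one step for brevity (a fuller implementation would backpropagate through the whole episode so the hidden state can learn to carry the visit history).

import random
import torch
import torch.nn as nn

GRID, N_ACT, HIDDEN = 3, 4, 32        # 3x3 grid; actions: up, down, left, right (assumed sizes)
OBS = GRID * GRID                     # observation: one-hot of the agent's own cell only

class RecurrentQNet(nn.Module):
    """Elman-style RNN whose hidden state can carry the history of visited cells."""
    def __init__(self):
        super().__init__()
        self.cell = nn.RNNCell(OBS, HIDDEN)
        self.head = nn.Linear(HIDDEN, N_ACT)

    def forward(self, obs, h):
        h = self.cell(obs, h)
        return self.head(h), h        # Q-values for the 4 actions, new hidden state

def one_hot(cell):
    v = torch.zeros(1, OBS)
    v[0, cell] = 1.0
    return v

def step(cell, act):
    r, c = divmod(cell, GRID)
    dr, dc = [(-1, 0), (1, 0), (0, -1), (0, 1)][act]
    return min(max(r + dr, 0), GRID - 1) * GRID + min(max(c + dc, 0), GRID - 1)

net = RecurrentQNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
gamma, eps = 0.9, 0.1                 # illustrative hyperparameters

for episode in range(2000):
    goal = random.randrange(OBS)      # invisible goal: never part of the observation
    cell = random.choice([c for c in range(OBS) if c != goal])
    h = torch.zeros(1, HIDDEN)
    for t in range(20):
        q, h_next = net(one_hot(cell), h)
        act = random.randrange(N_ACT) if random.random() < eps else int(q.argmax())
        nxt = step(cell, act)
        reward, done = (1.0, True) if nxt == goal else (0.0, False)
        with torch.no_grad():         # one-step Q-learning target
            q_next, _ = net(one_hot(nxt), h_next)
            target = reward + (0.0 if done else gamma * float(q_next.max()))
        loss = (q[0, act] - target) ** 2
        opt.zero_grad()
        loss.backward()
        opt.step()
        cell, h = nxt, h_next.detach()  # hidden state carried forward; gradient truncated here
        if done:
            break

Because the goal never appears in the observation, any improvement over random wandering must come from the hidden state remembering which cells have already been visited, which mirrors the memory-based deterministic exploration the abstract describes.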