Acquisition of deterministic exploration and purposive memory through reinforcement learning with a recurrent neural network
Format: Conference Proceeding
Language: English
Summary: The authors have proposed that various functions emerge purposively and harmoniously through reinforcement learning with a neural network. This paper focuses on the emergence of deterministic "exploration" behavior, which differs from stochastic exploration and requires higher intelligence. To realize such intelligent exploration, the key point is whether the recurrent neural network memorizes the necessary information and uses it to generate appropriate actions. In a simulation of a 3 × 3 grid world with an invisible-goal task, introducing a recurrent neural network into Q-learning lets the agent represent more accurate Q-values that take past experiences into account, which is suggested to enable it to learn appropriate actions. The acquired knowledge can be generalized to unknown environments to some extent. In another task, a simple environment with a randomly located branch, it is also shown that the recurrent neural network cleverly memorizes and retains the branch position so as to represent accurate Q-values after learning.
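The mechanism the abstract relies on, a recurrent network whose hidden state carries past observations so that Q-values can depend on history rather than on the current observation alone, can be illustrated with a minimal sketch. This is a generic Elman-style recurrent Q-network written for clarity, not the authors' actual architecture; the layer sizes, weight ranges, and class name are illustrative assumptions.

```python
import math
import random

class RecurrentQNet:
    """Minimal Elman-style recurrent Q-network (illustrative sketch).

    The hidden state h_t = tanh(W_in x_t + W_rec h_{t-1}) accumulates past
    observations, so the Q-values read out from h_t can differ even when the
    current observation is identical -- the property needed in partially
    observable tasks such as the invisible-goal grid world.
    """

    def __init__(self, n_obs, n_hidden, n_actions, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.W_in = mat(n_hidden, n_obs)      # observation -> hidden
        self.W_rec = mat(n_hidden, n_hidden)  # hidden -> hidden (memory)
        self.W_out = mat(n_actions, n_hidden) # hidden -> Q-values
        self.h = [0.0] * n_hidden

    def reset(self):
        """Clear the memory at the start of an episode."""
        self.h = [0.0] * len(self.h)

    def step(self, obs):
        """Consume one observation, update the hidden state, return Q-values."""
        pre = []
        for i in range(len(self.h)):
            s = sum(self.W_in[i][j] * obs[j] for j in range(len(obs)))
            s += sum(self.W_rec[i][j] * self.h[j] for j in range(len(self.h)))
            pre.append(s)
        self.h = [math.tanh(x) for x in pre]
        return [sum(self.W_out[a][j] * self.h[j] for j in range(len(self.h)))
                for a in range(len(self.W_out))]


net = RecurrentQNet(n_obs=4, n_hidden=8, n_actions=4)
net.reset()
q1 = net.step([1.0, 0.0, 0.0, 0.0])  # a distinctive early observation
q2 = net.step([0.0, 0.0, 0.0, 0.0])  # two identical "blank" observations...
q3 = net.step([0.0, 0.0, 0.0, 0.0])
# q2 and q3 generally differ: the hidden state still reflects the earlier
# observation, so identical inputs need not map to identical Q-values.
```

In tabular or feed-forward Q-learning the same observation always yields the same Q-values, which is exactly what fails when the goal or branch position is not visible; the recurrence above is the minimal ingredient that removes that limitation.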