
CAGE: A Curiosity-Driven Graph-Based Explore-Exploit Algorithm for Solving Deterministic Environment MDPs With Limited Episode Problem

Bibliographic Details
Published in: IEEE Access, 2024, Vol. 12, p. 144106-144121
Main Authors: Yu, Yide; Liu, Yue; Wong, Dennis; Li, Huijie; Egas-Lopez, Jose Vicente; Ma, Yan
Format: Article
Language: English
Description
Summary: The explore-exploit dilemma in Markov Decision Processes (MDPs) is a fundamental challenge, especially in deterministic environments akin to real-world scenarios. Balancing exploration and exploitation within a limited number of episodes is crucial to optimizing decision-making. Despite existing research, challenges such as parameter sensitivity, lack of global optimality, and inefficient exploration of low-value regions remain. We introduce the Curiosity-driven Algorithm based on Graph for Exploration (CAGE), which addresses these issues through a graph-based framework. CAGE includes two variants: CAGE-greedy, which ensures optimal solutions given ample episodes, and CAGE-centrality, which prioritizes significant states when episodes are limited. Key contributions include eliminating parameter sensitivity, guaranteeing global optimality, and enhancing exploration efficiency. To validate the performance of the CAGE algorithm series, we design a grid world experiment. The results show that CAGE outperforms a comparative algorithm, indicating its feasibility for industrial implementation and its high level of explainability, and validate its effectiveness in complex environments.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3468027