Model-Based Transfer Reinforcement Learning Based on Graphical Model Representations
Published in: | IEEE Transactions on Neural Networks and Learning Systems, 2023-02, Vol. 34 (2), p. 1035-1048 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Reinforcement learning (RL) plays an essential role in artificial intelligence but suffers from data inefficiency and model-shift issues. One possible solution is to exploit transfer learning. However, without explainable models, interpretability problems and negative transfer may occur. In this article, we define Relation Transfer as explainable and transferable learning based on graphical model representations, inferring the skeleton and relations among variables from a causal view and generalizing them to the target domain. The proposed algorithm consists of three steps. First, we leverage a suitable causal discovery method to identify the causal graph from the augmented source-domain data. Next, we make inferences on the target model based on this prior causal knowledge. Finally, offline RL training on the target model is used as prior knowledge to improve policy training in the target domain. The proposed method can answer the question of what to transfer and realize zero-shot transfer across related domains in a principled way. To demonstrate the robustness of the proposed framework, we conduct experiments on four classical control problems as well as one simulation-to-real-world application. Experimental results on both continuous and discrete cases demonstrate the efficacy of the proposed method. |
---|---|
ISSN: | 2162-237X 2162-2388 |
DOI: | 10.1109/TNNLS.2021.3107375 |