Research on Knowledge Graph Completion Model Combining Temporal Convolutional Network and Monte Carlo Tree Search

Bibliographic Details
Published in:	Mathematical problems in engineering 2022-03, Vol.2022, p.1-13
Main Authors:	Wang, Ying; Sun, Mingchen; Wang, Hongji; Sun, Yudong
Format:	Article
Language:	English
Subjects:	Algorithms; Decision making; Deep learning; History; Knowledge; Knowledge representation; Machine learning; Markov analysis; Monte Carlo simulation; Neural networks; Q values; Searching; Teaching methods

Description
Summary:	In knowledge graph completion (KGC) and other applications, learning how to walk from a source node to a target node for a given query is an important problem. It can be formulated as a reinforcement learning (RL) problem with a given state transition model. To overcome the challenges of sparse rewards and history-state encoding, we develop a deep agent network (graph-agent, GA) that combines a temporal convolutional network (TCN) with Monte Carlo Tree Search (MCTS). First, we combine MCTS with a neural network to generate more positive-reward trajectories, which effectively addresses the problem of sparse rewards. A TCN encodes the history state, and this encoding is used for both the policy and the Q-value. Second, we apply Q-learning to these trajectories to improve the network, and use parameter sharing to enhance the TCN policy. These steps are repeated to train the model. Third, in the prediction stage, MCTS combined with the Q-value is used to predict the target node. Experimental results on several graph-walking benchmarks show that GA outperforms other RL methods based on policy gradients. GA also outperforms traditional KGC baselines.
ISSN:	1024-123X 1563-5147
DOI:	10.1155/2022/2290540
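
The summary describes improving the agent with Q-learning on positive-reward trajectories and then using Q-values to pick the target node. The following is a minimal sketch of that idea only, not the authors' implementation: it assumes a tabular Q function in place of the TCN-based network, a hand-written trajectory in place of MCTS-generated ones, and a hypothetical four-node toy graph. The names GRAPH, q_learning_update, and greedy_walk are illustrative.

```python
from collections import defaultdict

# Hypothetical toy graph: node -> list of (relation, next_node) edges.
GRAPH = {
    "A": [("r1", "B"), ("r2", "C")],
    "B": [("r3", "D")],
    "C": [("r4", "A")],
    "D": [],
}


def q_learning_update(q, trajectory, target, alpha=0.5, gamma=0.9):
    """Apply one-step Q-learning updates along a trajectory.

    trajectory is a list of (node, edge) steps, where edge = (relation, next_node).
    The reward is sparse: 1.0 only when a step lands on the target node.
    """
    for node, edge in trajectory:
        next_node = edge[1]
        if next_node == target:
            reward, best_next = 1.0, 0.0  # terminal step: no bootstrapping
        else:
            reward = 0.0
            # Bootstrap from the best edge value available at the next node.
            best_next = max((q[(next_node, e)] for e in GRAPH[next_node]), default=0.0)
        td_target = reward + gamma * best_next
        q[(node, edge)] += alpha * (td_target - q[(node, edge)])
    return q


def greedy_walk(q, source, max_steps=3):
    """Follow the highest-Q edge from source for up to max_steps steps."""
    node = source
    for _ in range(max_steps):
        edges = GRAPH[node]
        if not edges:
            break
        node = max(edges, key=lambda e: q[(node, e)])[1]
    return node


if __name__ == "__main__":
    q = defaultdict(float)
    # Assumed positive-reward trajectory A -> B -> D (in the paper these come from MCTS).
    positive_trajectory = [("A", ("r1", "B")), ("B", ("r3", "D"))]
    for _ in range(2):
        q_learning_update(q, positive_trajectory, target="D")
    print(greedy_walk(q, "A"))
```

Two passes are needed here because the updates run in forward order: the first pass assigns value only to the final step, and the second propagates it back to the first step.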