Large-Scale Interactive Recommendation With Tree-Structured Reinforcement Learning

Bibliographic Details
Published in: IEEE Transactions on Knowledge and Data Engineering, 2023-04, Vol. 35 (4), p. 4018-4032
Main Authors: Chen, Haokun, Zhu, Chenxu, Tang, Ruiming, Zhang, Weinan, He, Xiuqiang, Yu, Yong
Format: Article
Language: English
Description
Summary: Although reinforcement learning (RL) techniques are regarded as promising solutions for interactive recommender systems (IRS), such solutions still face three main challenges: i) time inefficiency when handling the large discrete action spaces in IRS, ii) inability to deal with cold-start scenarios in IRS, and iii) data inefficiency when training RL-based methods. To tackle these challenges, we propose a generic tree-structured RL framework that takes both policy-based and value-based approaches into consideration. We propose to construct a balanced tree over representations of the items, such that picking an item is formulated as seeking a suitable path from the root to a leaf node in the balanced tree, which dramatically reduces the time complexity of item recommendation. Further, for cold-start scenarios where prior information about the items is unavailable, we initialize a random balanced tree as the starting point and then refine the tree structure based on the learned item representations. We also incorporate a user modeling component to explicitly model the environment, which can be utilized in the training phase to improve data efficiency. Extensive experiments on two real-world datasets demonstrate that our framework achieves superior recommendation performance and improves time and data efficiency over state-of-the-art methods in both warm-start and cold-start IRS scenarios.
ISSN: 1041-4347, 1558-2191
DOI: 10.1109/TKDE.2021.3137310
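
Illustrative note: the summary describes item selection as a walk from the root of a balanced tree to a leaf. The sketch below is not the authors' implementation; it only shows the general idea under simplifying assumptions. The names `tree_depth`, `pick_item`, and `choose_child` are hypothetical, items are identified by integer indices, and the per-node decision (a learned policy or value function in the article) is replaced by a caller-supplied function.

```python
import random


def tree_depth(num_items, branching):
    """Smallest depth d such that branching**d >= num_items."""
    depth, capacity = 0, 1
    while capacity < num_items:
        capacity *= branching
        depth += 1
    return depth


def pick_item(num_items, branching, choose_child):
    """Walk from the root to a leaf of an implicit balanced tree.

    Each node covers a half-open range [lo, hi) of item indices and
    splits it into up to `branching` near-equal child ranges.
    `choose_child(path, k)` stands in for a learned per-node decision
    (an assumption of this sketch) and must return an int in [0, k).
    """
    lo, hi = 0, num_items
    path = []
    while hi - lo > 1:
        size = hi - lo
        k = min(branching, size)  # a node never has more children than items
        bounds = [lo + (size * i) // k for i in range(k + 1)]
        child = choose_child(path, k)
        path.append(child)
        lo, hi = bounds[child], bounds[child + 1]
    return lo, path


# Placeholder decision rule: pick a child uniformly at random.
rng = random.Random(0)
item, path = pick_item(1000, 10, lambda path, k: rng.randrange(k))
```

With 1000 items and a branching factor of 10, this sketch makes at most 3 decisions among at most 10 children each, so it scores at most 30 candidates per recommendation instead of all 1000 in a flat action space.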