Loading…

From Clicks to Conversions: Recommendation for long-term reward

Recommender systems are often optimised for short-term reward: a recommendation is considered successful if a reward (e.g. a click) can be observed immediately after the recommendation. The advantage of this framework is that with some reasonable (although questionable) assumptions, it allows famili...

Full description

Saved in:
Bibliographic Details
Published in:arXiv.org 2020-09
Main Authors: Chagniot, Philomène, Vasile, Flavian, Rohde, David
Format: Article
Language:English
Subjects:
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Recommender systems are often optimised for short-term reward: a recommendation is considered successful if a reward (e.g. a click) can be observed immediately after the recommendation. The advantage of this framework is that with some reasonable (although questionable) assumptions, it allows familiar supervised learning tools to be used for the recommendation task. However, it means that long-term business metrics, e.g. sales or retention are ignored. In this paper we introduce a framework for modeling long-term rewards in the RecoGym simulation environment. We use this newly introduced functionality to showcase problems introduced by the last-click attribution scheme in the case of conversion-optimized recommendations and propose a simple extension that leads to state-of-the-art results.
ISSN:2331-8422