
Adaptive Risk-Aware Bidding with Budget Constraint in Display Advertising


Bibliographic Details
Published in: SIGKDD explorations 2023-06, Vol.25 (1), p.73-82
Main Authors: Jiang, Zhimeng; Zhou, Kaixiong; Zhang, Mi; Chen, Rui; Hu, Xia; Choi, SooHyun
Format: Article
Language: English
Description
Summary: Real-time bidding (RTB) has become a major paradigm of display advertising. Each ad impression generated from a user visit is auctioned in real time, where demand-side platform (DSP) automatically provides bid price usually relying on the ad impression value estimation and the optimal bid price determination. However, the current bid strategy overlooks the randomness of the user behaviors (e.g., click) and the cost uncertainty caused by the auction competition. In this work, we propose a novel adaptive risk-aware bidding algorithm with budget constraint via reinforcement learning, which is the first to simultaneously consider estimation uncertainty and the dynamic risk tendency of a DSP. Specifically, we explicitly factor in the uncertainty of estimated ad impression values and model the risk preference of a DSP under a specific state and market environment via a sequential decision process. Additionally, we theoretically unveil the intrinsic relation between the uncertainty and the risk tendency based on value at risk (VaR). Consequently, we propose two instantiations to model risk tendency, including an expert knowledge-based formulation embracing three essential properties and an adaptive learning method based on self-supervised reinforcement learning. We conduct experiments on public datasets and show that the proposed framework achieves better performance in terms of the number of clicks under different budget constraints.
ISSN: 1931-0145, 1931-0153
DOI: 10.1145/3606274.3606281