SS4CTR: a semi-supervised framework for enhancing click-through rate prediction in sparse and imbalanced data

Bibliographic Details
Published in: World Wide Web (Bussum), 2024-11, Vol. 27 (6), p. 72, Article 72
Main Authors: Zhou, Junming; Chang, Chao; Li, Weisheng; Lin, Ronghua; Wu, Zhengyang; Tang, Yong
Format: Article
Language: English
Description
Summary: Click-through rate (CTR) prediction, which estimates the probability that a user will click on a particular item, is a pivotal component of both online advertising and recommender systems. However, the problems of sparse and imbalanced data remain unresolved. To address these challenges, this paper proposes a semi-supervised framework called SS4CTR, which has two distinctive features. First, it uses an interpretable approach to select negative samples based on the global popularity of items, ensuring a balanced ratio of positive to negative samples in the input dataset. Second, by integrating both labeled and unlabeled data into the training process, the model tackles data sparsity and significantly improves the accuracy of CTR predictions; a confidence-threshold mechanism for pseudo-labeling ensures that unlabeled data are used safely. To the best of our knowledge, this is the first study to address sparse and imbalanced data simultaneously in the context of CTR prediction. Extensive experiments on four real-world sparse datasets confirm the effectiveness and applicability of SS4CTR in scenarios characterized by sparse and imbalanced data.
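The two mechanisms named in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not say how popularity enters the selection, so weighting unclicked items by their global click count is an assumption, as are the function names `sample_negatives` and `pseudo_label`, the symmetric threshold rule, and the default threshold of 0.9.

```python
import random
from collections import Counter


def sample_negatives(clicks, rng, per_positive=1):
    """Draw negatives for each user from items that user has not clicked.

    clicks: list of (user, item) pairs observed as positives.
    Assumption: an unclicked item is drawn with probability proportional
    to its global click count (its "popularity").
    """
    popularity = Counter(item for _, item in clicks)
    clicked = {}
    for user, item in clicks:
        clicked.setdefault(user, set()).add(item)

    negatives = []
    for user, items in clicked.items():
        candidates = sorted(i for i in popularity if i not in items)
        if not candidates:
            continue  # this user has clicked every known item
        weights = [popularity[i] for i in candidates]
        # One draw per positive keeps the classes balanced; draws are made
        # with replacement, so a negative pair may repeat.
        k = per_positive * len(items)
        drawn = rng.choices(candidates, weights=weights, k=k)
        negatives.extend((user, i) for i in drawn)
    return negatives


def pseudo_label(probabilities, threshold=0.9):
    """Keep only confident predictions on unlabeled samples.

    Assumption: a sample becomes a pseudo-positive when p >= threshold,
    a pseudo-negative when p <= 1 - threshold, and is skipped otherwise.
    Returns a list of (index, label) pairs.
    """
    labelled = []
    for idx, p in enumerate(probabilities):
        if p >= threshold:
            labelled.append((idx, 1))
        elif p <= 1.0 - threshold:
            labelled.append((idx, 0))
    return labelled
```

In a training loop, the pairs returned by `pseudo_label` would be added to the labeled set for the next round, while samples with predictions between the two cut-offs stay unlabeled.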
ISSN: 1386-145X, 1573-1413
DOI: 10.1007/s11280-024-01310-2