Imbalanced customer churn classification using a new multi-strategy collaborative processing method

Bibliographic Details
Published in:Expert systems with applications 2024-08, Vol.247, p.123251, Article 123251
Main Authors: Rao, Congjun, Xu, Yaling, Xiao, Xinping, Hu, Fuyan, Goh, Mark
Format: Article
Language:English
Description
Summary:
• Addresses the problem of imbalanced customer churn classification.
• Proposes a new multi-strategy collaborative processing method, IADASYN-FLCatBoost.
• Improves the traditional ADASYN algorithm to obtain the IADASYN algorithm.
• Embeds the Focal Loss function into CatBoost to form a new FLCatBoost.
• Empirical analysis shows that the proposed method performs better than similar methods.

The rapid advancement of big data and artificial intelligence brings both opportunities and challenges to the banking sector. Improving a model's ability to classify imbalanced datasets accurately is a critical challenge in customer churn prediction (CCP). To address imbalanced customer classification, this paper proposes a new multi-strategy collaborative processing method, IADASYN-FLCatBoost, that works at two levels: data and algorithm. At the data level, the traditional Adaptive Synthetic (ADASYN) sampling is improved: the Local Outlier Factor (LOF) algorithm is introduced to eliminate outliers, and the classification features are specially processed when synthesizing new minority class samples, which yields an improved ADASYN (IADASYN) algorithm. At the algorithm level, the Focal Loss is embedded into the CatBoost ensemble learning framework to form a new Focal Loss-CatBoost (FLCatBoost), a focal-aware, cost-sensitive model for imbalanced customer churn prediction. The empirical analysis uses a credit card customer dataset obtained from the Kaggle platform. The staged comparison experiments show that IADASYN-FLCatBoost achieves the best prediction performance.
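As a rough illustration of the focal-loss idea mentioned in the summary, the sketch below computes a binary focal loss on a raw (pre-sigmoid) score, plus numerical first and second derivatives of the kind a gradient-boosting custom objective usually needs. It is not the authors' FLCatBoost implementation: the function names, the default `gamma` and `alpha` values, and the finite-difference derivatives are all assumptions made for this example, and the sign convention expected by a given boosting library should be checked against that library's documentation.

```python
import math


def sigmoid(z):
    """Map a raw score to a probability (fine for moderate |z|)."""
    return 1.0 / (1.0 + math.exp(-z))


def focal_loss(raw_score, label, gamma=2.0, alpha=0.25):
    """Binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t).

    gamma and alpha defaults are illustrative assumptions, not values
    taken from the paper.
    """
    p = sigmoid(raw_score)
    # p_t is the predicted probability of the true class.
    p_t = p if label == 1 else 1.0 - p
    alpha_t = alpha if label == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def focal_grad_hess(raw_score, label, gamma=2.0, alpha=0.25, eps=1e-4):
    """Approximate d(loss)/d(score) and d2(loss)/d(score)2.

    Central finite differences keep the sketch short; a production
    objective would normally use closed-form derivatives.
    """
    def f(z):
        return focal_loss(z, label, gamma, alpha)

    grad = (f(raw_score + eps) - f(raw_score - eps)) / (2.0 * eps)
    hess = (f(raw_score + eps) - 2.0 * f(raw_score) + f(raw_score - eps)) / eps ** 2
    return grad, hess


if __name__ == "__main__":
    # A confidently correct churner contributes far less loss than a missed one.
    print(focal_loss(3.0, 1), focal_loss(-3.0, 1))
    print(focal_grad_hess(0.0, 1))
```

With `gamma=0` and `alpha=0.5` the function reduces to half of the ordinary log loss, which is a convenient sanity check; larger `gamma` shrinks the contribution of examples the model already classifies well.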
Compared with 5 other imbalanced classification algorithms and 20 classifiers that combine classical sampling methods with ensemble learning algorithms, the proposed method gives the best classification results, with significant improvements in Recall, F1 score, G-mean, and area under the precision-recall curve (AUPRC). Further verification indicates that the method has a degree of generalizability and remains valid for other banks and for customer churn datasets from other industries.
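For readers unfamiliar with the evaluation measures named in the summary, the following sketch shows one common way to compute them. The definitions are assumptions for illustration (G-mean as the square root of recall times specificity, and AUPRC estimated by average precision over a score ranking, ignoring ties); the record does not state which exact formulas the paper uses, and the helper names are hypothetical.

```python
import math


def imbalance_metrics(y_true, y_pred):
    """Recall, precision, F1 and G-mean for binary labels (1 = churn)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # Assumed definition: geometric mean of recall and specificity.
    g_mean = math.sqrt(recall * specificity)
    return {"recall": recall, "precision": precision,
            "f1": f1, "g_mean": g_mean}


def average_precision(y_true, scores):
    """Average precision, a step-wise estimate of the area under the PR curve."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(1 for _, t in ranked if t == 1)
    if n_pos == 0:
        return 0.0
    hits = 0
    total = 0.0
    for rank, (_, t) in enumerate(ranked, start=1):
        if t == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


if __name__ == "__main__":
    y_true = [1, 1, 0, 0, 0, 0, 0, 1]
    y_pred = [1, 0, 0, 0, 1, 0, 0, 1]
    print(imbalance_metrics(y_true, y_pred))
    print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

Unlike plain accuracy, these measures do not reward a model for predicting the majority class everywhere, which is why they are commonly reported for imbalanced churn data.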
ISSN:0957-4174
1873-6793
DOI:10.1016/j.eswa.2024.123251