
Applying hybrid machine learning algorithms to assess customer risk-adjusted revenue in the financial industry

Bibliographic Details
Published in: Electronic commerce research and applications, 2022-11, Vol. 56, p. 101202, Article 101202
Main Authors: Machado, Marcos R.; Karray, Salma
Format: Article
Language: English
Description
Summary: A Peer-to-Peer (P2P) service is a decentralized platform that directly connects individuals, such as buyers and sellers or, in lending, borrowers and lenders (investors), without the intermediation of a third party. In the P2P lending market, customer cash flows are closely linked to their financial risk of default. Thus, forecasting customers’ Risk-Adjusted Revenue (RAR) is one of the most critical issues in financial decision-making. With the emergence of big data, traditional forecasting methods cannot provide the high predictive power needed for such metrics. We propose a hybrid method that integrates supervised and unsupervised Machine Learning (ML) algorithms to improve the accuracy of predicting customers’ risk-adjusted metrics. Using a real P2P dataset from the Lending Club containing over two million cases, we apply ML algorithms, for the first time, to forecast customers’ RAR. These include individual methods, such as gradient boosting and decision trees, and hybrid frameworks that first group customers with a clustering algorithm (k-Means or Density-Based Spatial Clustering of Applications with Noise (DBSCAN)) and then apply the individual methods within each group. We compare the efficiency (processing time and accuracy) of this hybrid approach with that of individual regressor-based models for predicting RAR. Our results indicate high predictive power for many individual ML algorithms (R² score over 90%). Further, in most cases, hybrid models outperform the individual ones in both predictive performance and processing time. Finally, feature importance analysis in the best predictive frameworks helps identify the most influential factors in predicting customers’ RAR in the P2P lending market.
• Machine Learning methods are applied to predict customers’ RAR in P2P lending.
• Hybrid ML tools of supervised and unsupervised learning are used to predict RAR.
• Hybrid ML models present better accuracy and processing time than individual methods.
• Hybrid tools help identify relevant features for predicting RAR at the cluster level.
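
The cluster-then-regress idea described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the data are synthetic (not Lending Club records), there is a single hypothetical feature `x` and target `y` standing in for a customer attribute and RAR, a small hand-written 1-D k-Means is used for grouping, and ordinary least squares is swapped in for the gradient boosting and decision tree regressors named in the summary. Scores are computed in-sample for brevity.

```python
import random
from statistics import mean

def assign(v, centroids):
    """Index of the centroid nearest to scalar v."""
    return min(range(len(centroids)), key=lambda j: abs(v - centroids[j]))

def kmeans_1d(values, k, iters=50):
    """Lloyd's algorithm on scalars with a deterministic quantile-based start."""
    srt = sorted(values)
    centroids = [srt[(2 * i + 1) * len(srt) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[assign(v, centroids)].append(v)
        new = [mean(g) if g else c for g, c in zip(groups, centroids)]
        if new == centroids:  # assignments stopped changing
            break
        centroids = new
    return centroids

def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx

def r2(ys, preds):
    """Coefficient of determination."""
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Synthetic data (assumption): two customer segments whose target depends
# on the feature in different ways.
rng = random.Random(0)
xs, ys = [], []
for _ in range(200):
    x = rng.uniform(0, 1)
    xs.append(x)
    ys.append(2 * x + rng.gauss(0, 0.1))
for _ in range(200):
    x = rng.uniform(3, 4)
    xs.append(x)
    ys.append(10 - 3 * x + rng.gauss(0, 0.1))

# Individual model: one regressor fitted on all customers.
slope, intercept = fit_line(xs, ys)
r2_single = r2(ys, [slope * x + intercept for x in xs])

# Hybrid model: cluster first, then fit one regressor per cluster.
centroids = kmeans_1d(xs, k=2)
labels = [assign(x, centroids) for x in xs]
models = {}
for c in set(labels):
    cx = [x for x, l in zip(xs, labels) if l == c]
    cy = [y for y, l in zip(ys, labels) if l == c]
    models[c] = fit_line(cx, cy)
hybrid_preds = [models[l][0] * x + models[l][1] for x, l in zip(xs, labels)]
r2_hybrid = r2(ys, hybrid_preds)

print(f"single model R2: {r2_single:.3f}")
print(f"hybrid model R2: {r2_hybrid:.3f}")
```

On this constructed example the per-cluster models fit better because each segment follows its own relationship. Whether the same holds on real lending data depends on how well the clusters separate customers with different risk and revenue behaviour.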
ISSN: 1567-4223, 1873-7846
DOI: 10.1016/j.elerap.2022.101202