Integration of unsupervised and supervised machine learning algorithms for credit risk assessment

Bibliographic Details
Published in: Expert systems with applications 2019-08, Vol.128, p.301-315
Main Authors: Bao, Wang; Lianju, Ning; Yue, Kong
Format: Article
Language: English
Description
Summary:
• The model helps improve the efficiency and accuracy of credit risk assessment.
• The proposed ensemble strategy combines SOM and seven supervised learning methods.
• The ensemble strategy helps improve the performance of credit scoring models.
• The proposed ensemble strategy was tested on three real-world datasets.

In credit risk assessment, credit scoring has become a critical tool for financial institutions to discriminate “bad” applicants from “good” applicants. Accordingly, a wide range of supervised machine learning algorithms have been successfully applied to credit scoring; however, the integration of unsupervised learning with supervised learning in this field has drawn little attention. In this work, we propose a combination strategy that integrates unsupervised learning with supervised learning for credit risk assessment. Our work differs from previous work on unsupervised integration in that we apply unsupervised learning techniques at two different stages: the consensus stage and the dataset clustering stage. Model performance is compared on three credit datasets in four groups: individual models, individual models + consensus model, clustering + individual models, and clustering + individual models + consensus model. The results show that integration at either the consensus stage or the dataset clustering stage is effective in improving the performance of credit scoring models. Moreover, the combination of the two stages achieves the best performance, confirming the superiority of the proposed integration of unsupervised and supervised machine learning algorithms and boosting our confidence that this strategy can be extended to many other credit datasets from financial institutions.
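The "clustering + individual models + consensus model" group described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation, and every component is a stand-in chosen to keep the example dependency-free: plain k-means replaces the SOM clustering, one-feature threshold rules replace the seven supervised learners, and a simple majority vote replaces the unsupervised consensus step. All function names and the label convention (1 = "bad", 0 = "good") are assumptions made for illustration.

```python
import random
from collections import Counter

def nearest(p, centers):
    """Index of the center closest to point p (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])))

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means. ASSUMPTION: a stand-in for the SOM clustering stage."""
    centers = random.Random(seed).sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Keep the old center if a cluster received no points.
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g else c
                   for g, c in zip(groups, centers)]
    return centers

def fit_stump(X, y, feature):
    """One-feature threshold rule. ASSUMPTION: toy stand-in for a base learner."""
    best = None
    for t in sorted({x[feature] for x in X}):
        for sign in (1, -1):
            preds = [1 if sign * (x[feature] - t) >= 0 else 0 for x in X]
            err = sum(p != label for p, label in zip(preds, y))
            if best is None or err < best[0]:
                best = (err, t, sign)
    _, t, sign = best
    return lambda x: 1 if sign * (x[feature] - t) >= 0 else 0

def fit_stumps(X, y):
    """Train one stump per feature; together they act as the 'individual models'."""
    return [fit_stump(X, y, f) for f in range(len(X[0]))]

def fit(X, y, k=2):
    """Cluster the data, then train a separate set of base models per cluster."""
    centers = kmeans(X, k)
    models = {None: fit_stumps(X, y)}  # global fallback for empty clusters
    for c in range(k):
        rows = [(x, label) for x, label in zip(X, y) if nearest(x, centers) == c]
        if rows:
            Xc, yc = zip(*rows)
            models[c] = fit_stumps(Xc, yc)
    return centers, models

def predict(x, centers, models):
    """Route x to its cluster and combine that cluster's models.
    ASSUMPTION: majority vote stands in for the unsupervised consensus step."""
    members = models.get(nearest(x, centers), models[None])
    return Counter(m(x) for m in members).most_common(1)[0][0]

# Toy data: the two groups of applicants follow opposite rules, so models
# trained per cluster can fit what a single global threshold cannot.
X = [(0.0, 0.1), (0.2, 0.0), (1.0, 0.9), (1.2, 1.1),
     (10.0, 10.1), (10.2, 10.0), (11.0, 10.9), (11.2, 11.1)]
y = [0, 0, 1, 1, 1, 1, 0, 0]

centers, models = fit(X, y, k=2)
predictions = [predict(x, centers, models) for x in X]
```

Dropping the clustering step (using only `models[None]`) or the voting step (using a single stump) gives rough analogues of the other three comparison groups named in the summary.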
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2019.02.033