An investigation of bankruptcy prediction in imbalanced datasets

Bibliographic Details
Published in: Decision Support Systems, 2018-08, Vol. 112, p. 111-124
Main Authors: Veganzones, David; Séverin, Eric
Format: Article
Language: English
Description
Summary: Previous studies of bankruptcy prediction in imbalanced datasets analyze either the loss of prediction performance due to data imbalance or the treatment methods for dealing with it. The current article presents a combined investigation of the degree of imbalance, loss of performance, and treatment methods. It determines which imbalanced class distributions jeopardize the performance of bankruptcy prediction methods and identifies the recovery capacities of treatment methods. The results show that an imbalanced distribution, in which the minority class represents 20%, significantly disturbs prediction performance. Furthermore, the support vector machine method is less sensitive than other prediction methods to imbalanced distributions, and sampling methods can recover a satisfactory portion of performance losses. Accordingly, this study provides a better understanding of the data imbalance issue in the field of corporate failure and serves as a methodological guide for designing bankruptcy prediction methods in imbalanced datasets.
• An investigation of bankruptcy prediction in imbalanced datasets is proposed.
• The prediction losses increase as the imbalanced proportion grows more severe.
• The support vector machine method is less affected by imbalanced datasets than other prediction methods.
• SMOTE outperforms other sampling techniques for all types of prediction models and different training set sizes.
ISSN: 0167-9236, 1873-5797
DOI: 10.1016/j.dss.2018.06.011
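
The summary names SMOTE as the best-performing sampling technique. As a rough illustration of the general idea behind SMOTE (creating synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbors), here is a minimal pure-Python sketch. It is not the authors' implementation: the function name `smote`, its parameters, and the toy data (two made-up financial ratios per firm) are assumptions for illustration only.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority neighbors.

    Illustrative sketch only; names and defaults are hypothetical.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbors than other points
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbors of the base point, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbor
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


# Hypothetical toy data: 4 bankrupt firms against 16 healthy ones,
# i.e. a minority class of 20% before oversampling.
bankrupt = [(0.10, 0.80), (0.15, 0.90), (0.05, 0.85), (0.12, 0.95)]
healthy_count = 16
new_points = smote(bankrupt, n_synthetic=healthy_count - len(bankrupt), k=3)
balanced_minority = bankrupt + new_points
```

After oversampling, `balanced_minority` holds as many samples as the healthy class, and every synthetic point lies on a segment between two real bankrupt firms. In practice a maintained library implementation would normally be preferred over a hand-written one.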