Enhancing accuracy and interpretability of ensemble strategies in credit risk assessment. A correlated-adjusted decision forest proposal

Bibliographic Details
Published in: Expert Systems with Applications, 2015-08, Vol. 42 (13), p. 5737-5753
Main Authors: Florez-Lopez, Raquel; Ramon-Jeronimo, Juan Manuel
Format: Article
Language: English
Description
Summary:
• CADF deals with multiple diversity sources to balance accuracy and interpretability.
• To enhance diversity, CADF merges different correlated-adjusted decision trees.
• Results suggest CADF can compete in accuracy with much more complex ensemble models.
• The superior accuracy of CADF is tested through different measures and statistical tests.
• Unlike ‘black-box’ models, CADF produces logical, human-understandable rules.

Credit risk assessment is a critical topic for finance activity and bankruptcy prediction that has been broadly explored using statistical models and machine learning methods. Recently, studies have suggested using ensemble strategies to enhance credit modelling performance. However, this accuracy is obtained at the expense of interpretability, which makes the financial industry reluctant to adopt ensemble models and leads it to favour simpler ones. In this work we introduce an ensemble approach based on merged decision trees, the correlated-adjusted decision forest (CADF), to produce models that are both accurate and comprehensible. As its main innovation, our proposal explores the combination of complementary sources of diversity as mechanisms to optimise the model’s structure, which leads to a manageable number of comprehensive decision rules without sacrificing performance. We evaluate our approach in comparison to individual classifiers and alternative ensemble strategies (gradient boosting, random forests). Empirical results suggest CADF is an encouraging solution for credit risk problems: it can compete in accuracy with much more complex proposals while producing a rule-based structure directly useful for managerial decisions.
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2015.02.042