An applicability index for reliable and applicable decision trees in water quality modelling

Bibliographic Details
Published in: Ecological Informatics, 2016-03, Vol. 32, p. 1-6
Main Authors: Everaert, G., Bennetsen, E., Goethals, P.L.M.
Format: Article
Language: English
Description
Summary: Data-driven environmental models are mainly assessed on the basis of their model fit and only limited attention is given to their applicability for end-users. In this paper, we present the applicability index (API) that scores decision trees in terms of their interpretability and applicability for end-users. The API integrates two criteria, viz. the simplicity of the model and its ability to predict the classes of the response variable. We developed 10,000 decision trees with different parameterizations and assessed the use of API for model selection with two different datasets. The API reduced the number of decision trees that were retained only based on statistical criteria from 2,806 to 173 and from 1,117 to 784, respectively. The models that were retained were more easily interpretable, equally statistically reliable but less complex. Conventional statistical criteria such as Cohen’s kappa and the number of correctly classified instances were only moderately correlated with the API (r=0.26 and r=0.49, respectively). This indicates that the API is a useful complement to the existing statistical criteria available for model selection. The API was tested for two datasets consisting of water quality data in lowland rivers in Belgium and the Netherlands, hence its validity needs to be tested for other types of data and modelling domains.

• The applicability index (API) quantifies the applicability of decision trees.
• To do so, the API integrates the simplicity submetric and the binary submetric.
• The value of the API is illustrated on the basis of two water quality datasets.
• The API is a complement to existing statistical criteria for model selection.
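
The summary names Cohen's kappa and the number of correctly classified instances as the conventional statistical criteria that the API complements. As a minimal sketch, both can be computed from a confusion matrix as shown below. The function name `cci_and_kappa` and the example matrix are assumptions made for illustration; they are not taken from the paper, and the API formula itself is not given in this record, so it is not reproduced here.

```python
def cci_and_kappa(confusion):
    """Return (fraction correctly classified, Cohen's kappa).

    `confusion` is a square list of lists: rows are observed classes,
    columns are predicted classes. Hypothetical helper for illustration.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    if n == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement: share of instances on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n

    # Agreement expected by chance, from the row and column totals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)

    if expected == 1.0:
        # Kappa is undefined here; returning 0.0 is a choice made in this sketch.
        return observed, 0.0

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Made-up two-class example (not data from the paper).
cci, kappa = cci_and_kappa([[20, 5], [10, 15]])
print(f"CCI = {cci:.2f}, kappa = {kappa:.2f}")
```

Kappa corrects the share of correctly classified instances for agreement expected by chance, which is why the two criteria can rank the same set of trees differently.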
ISSN: 1574-9541
DOI: 10.1016/j.ecoinf.2015.12.004