An applicability index for reliable and applicable decision trees in water quality modelling
Published in: | Ecological informatics 2016-03, Vol.32, p.1-6 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Summary: | Data-driven environmental models are mainly assessed on the basis of their model fit, and only limited attention is given to their applicability for end-users. In this paper, we present the applicability index (API), which scores decision trees in terms of their interpretability and applicability for end-users. The API integrates two criteria, viz. the simplicity of the model and its ability to predict the classes of the response variable. We developed 10,000 decision trees with different parameterizations and assessed the use of the API for model selection with two different datasets. The API reduced the number of decision trees retained on the basis of statistical criteria alone from 2,806 to 173 and from 1,117 to 784, respectively. The retained models were more easily interpretable and less complex, yet equally statistically reliable. Conventional statistical criteria such as Cohen’s kappa and the number of correctly classified instances were only moderately correlated with the API (r=0.26 and r=0.49, respectively). This indicates that the API is a useful complement to the existing statistical criteria available for model selection. The API was tested on two datasets consisting of water quality data from lowland rivers in Belgium and the Netherlands; hence, its validity needs to be tested for other types of data and modelling domains.
• The applicability index (API) quantifies the applicability of decision trees.
• To do so, the API integrates the simplicity submetric and the binary submetric.
• The value of the API is illustrated on the basis of two water quality datasets.
• The API is a complement to existing statistical criteria for model selection. |
---|---|
ISSN: | 1574-9541 |
DOI: | 10.1016/j.ecoinf.2015.12.004 |
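
The summary describes a two-stage selection: decision trees are first retained on conventional statistical criteria (Cohen's kappa, correctly classified instances) and then filtered further with the API. The sketch below illustrates that idea only. The kappa and correctly-classified-instances formulas are the standard ones; everything else is an assumption made for illustration: the function names, the dictionary keys `confusion` and `applicability`, and both thresholds are hypothetical, and the applicability score is taken as a precomputed number because this record does not give the API formula.

```python
def correctly_classified(confusion):
    """Fraction of correctly classified instances from a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def cohen_kappa(confusion):
    """Cohen's kappa: agreement beyond chance, from a square confusion matrix."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = correctly_classified(confusion)
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(n)
    ) / total**2
    if expected == 1.0:
        # Degenerate case (all instances in a single class): kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


def two_stage_selection(trees, min_kappa, min_applicability):
    """Keep trees that pass the statistical threshold, then the applicability one.

    Each tree is a dict with a hypothetical 'confusion' matrix and a
    hypothetical, precomputed 'applicability' score (not the paper's formula).
    """
    reliable = [t for t in trees if cohen_kappa(t["confusion"]) >= min_kappa]
    return [t for t in reliable if t["applicability"] >= min_applicability]


# Made-up candidates, for illustration only.
candidates = [
    {"name": "tree_a", "confusion": [[40, 10], [5, 45]], "applicability": 0.8},
    {"name": "tree_b", "confusion": [[40, 10], [5, 45]], "applicability": 0.2},
    {"name": "tree_c", "confusion": [[25, 25], [25, 25]], "applicability": 0.9},
]
selected = two_stage_selection(candidates, min_kappa=0.4, min_applicability=0.5)
print([t["name"] for t in selected])
```

In this made-up example, `tree_c` fails the statistical stage (its kappa is 0) and `tree_b` passes it but fails the applicability stage, which mirrors how a second criterion can shrink a set of statistically acceptable trees.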