Unsupervised stratification of cross-validation for accuracy estimation

Bibliographic Details
Published in: Artificial Intelligence, 2000, Vol. 116 (1), p. 1-16
Main Authors: Diamantidis, N.A., Karlis, D., Giakoumakis, E.A.
Format: Article
Language: English
Description
Summary: The rapid development of new learning algorithms increases the need for improved accuracy estimation methods. Methods that allow several learning algorithms to be compared are also important for evaluating the performance of new ones. In this paper we propose new accuracy estimation methods that extend k-fold cross-validation. The proposed methods construct the cross-validation folds deterministically rather than by random sampling. The folds are built by unsupervised stratification, which exploits the distribution of instances in the instance space, using either a one-center approach or clustering procedures. The methods attempt to construct more representative folds, thereby reducing the bias of the resulting estimator. Because no randomness is involved, they also allow direct comparison of the performance of learning algorithms across different experiments. A simulation experiment examining the proposed methods is reported, showing their behavior in a variety of situations. The new methods mainly reduce the bias of the estimator.
ISSN: 0004-3702; 1872-7921
DOI: 10.1016/S0004-3702(99)00094-6
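
Illustration: the summary describes building cross-validation folds deterministically from the distribution of instances rather than by random sampling. The Python sketch below shows one way such a construction could look. It is not the authors' procedure, whose details this record does not give. As an assumption made only for illustration, it ranks instances by Euclidean distance to the overall centroid and deals them round-robin into k folds. The function names `centroid` and `deterministic_folds` are hypothetical.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length numeric tuples."""
    n = len(points)
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dim))


def deterministic_folds(points, k):
    """Assign every instance index to one of k folds without randomness.

    Illustrative assumption (not taken from the paper): instances are ranked
    by Euclidean distance to the centroid, ties broken by original index,
    and then dealt round-robin so that each fold receives instances from
    the centre of the data through to its periphery.
    """
    if not 1 < k <= len(points):
        raise ValueError("k must satisfy 1 < k <= number of instances")
    c = centroid(points)
    # Sorting on (distance, index) makes the ordering fully reproducible.
    order = sorted(range(len(points)),
                   key=lambda i: (math.dist(points[i], c), i))
    folds = [[] for _ in range(k)]
    for rank, idx in enumerate(order):
        folds[rank % k].append(idx)
    return folds


if __name__ == "__main__":
    data = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
            (5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (3.0, 3.0)]
    for fold_no, members in enumerate(deterministic_folds(data, 3)):
        print(fold_no, members)
```

Because the assignment depends only on the data and on k, repeated runs produce identical folds, which is the property the summary relies on for comparing learning algorithms across experiments. Class labels are never consulted, so the stratification is unsupervised.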