A predictive deviance criterion for selecting a generative model in semi-supervised classification
Published in: Computational Statistics & Data Analysis, 2013-08, Vol. 64, pp. 220–236
Format: Article
Language: English
Summary: Semi-supervised classification can help to improve generative classifiers by taking into account the information provided by the unlabeled data points, especially when there are far more unlabeled data than labeled data. The aim is to select a generative classification model using both unlabeled and labeled data. A predictive deviance criterion, AICcond, aiming to select a parsimonious and relevant generative classifier in the semi-supervised context is proposed. In contrast to standard information criteria such as AIC and BIC, AICcond is focused on the classification task, since it attempts to measure the predictive power of a generative model by approximating its predictive deviance. However, it avoids the computational cost of cross-validation criteria, which make repeated use of the EM algorithm. AICcond is proved to have consistency properties that ensure its parsimony when compared with the Bayesian Entropy Criterion (BEC), whose focus is similar to that of AICcond. Numerical experiments on both simulated and real data sets show that the behavior of AICcond with regard to variable and model selection is encouraging when compared to the competing criteria.
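To illustrate the setting the abstract describes, here is a minimal, hypothetical sketch in Python: two candidate generative classifiers (shared vs. class-specific variances) are fitted by EM on a mix of labeled and unlabeled points, and each is scored by a schematic conditional-log-likelihood-minus-penalty criterion. All data, names, and parameter counts are invented for illustration, and the score is only in the spirit of a predictive deviance criterion such as AICcond, not the paper's actual formula.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D two-class data: few labeled points, many unlabeled ones.
n_lab, n_unlab = 20, 200
y_lab = rng.integers(0, 2, n_lab)
x_lab = rng.normal(loc=2.0 * y_lab, scale=1.0)
y_hidden = rng.integers(0, 2, n_unlab)          # never shown to the fitter
x_unlab = rng.normal(loc=2.0 * y_hidden, scale=1.0)

def fit_semi_supervised(x_lab, y_lab, x_unlab, shared_var, n_iter=50):
    """EM for a two-component Gaussian mixture classifier; labeled points
    keep their responsibilities fixed at the known class (one-hot)."""
    pi = np.array([0.5, 0.5])
    mu = np.array([x_lab[y_lab == 0].mean(), x_lab[y_lab == 1].mean()])
    var = np.array([1.0, 1.0])
    x_all = np.concatenate([x_lab, x_unlab])
    r_lab = np.eye(2)[y_lab]                    # fixed responsibilities
    for _ in range(n_iter):
        # E-step: posterior class probabilities for unlabeled data only.
        dens = (np.exp(-(x_unlab[:, None] - mu) ** 2 / (2 * var))
                / np.sqrt(2 * np.pi * var))
        resp = pi * dens
        resp /= resp.sum(axis=1, keepdims=True)
        r_all = np.vstack([r_lab, resp])
        # M-step: weighted updates of proportions, means, variances.
        nk = r_all.sum(axis=0)
        pi = nk / nk.sum()
        mu = (r_all * x_all[:, None]).sum(axis=0) / nk
        sq = r_all * (x_all[:, None] - mu) ** 2
        var = (np.full(2, sq.sum() / nk.sum()) if shared_var
               else sq.sum(axis=0) / nk)
    return pi, mu, var

def conditional_score(params, x, y, n_params):
    """Schematic criterion: conditional log-likelihood log p(y | x) on the
    labeled data minus an AIC-style penalty (NOT the exact AICcond)."""
    pi, mu, var = params
    dens = (np.exp(-(x[:, None] - mu) ** 2 / (2 * var))
            / np.sqrt(2 * np.pi * var))
    post = pi * dens
    post /= post.sum(axis=1, keepdims=True)
    cond_ll = np.log(post[np.arange(len(y)), y]).sum()
    return cond_ll - n_params

# Compare the two candidate models; parameter counts are for the 1-D case.
for name, shared, d in [("shared variance", True, 4),
                        ("class variances", False, 5)]:
    params = fit_semi_supervised(x_lab, y_lab, x_unlab, shared)
    print(name, round(conditional_score(params, x_lab, y_lab, d), 2))
```

Because the score is computed from p(y | x) rather than the joint likelihood p(x, y), it targets classification performance directly, which is the motivation the abstract gives for AICcond over AIC and BIC.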
ISSN: 0167-9473, 1872-7352
DOI: 10.1016/j.csda.2013.02.010