AutoEval: Are Labels Always Necessary for Classifier Accuracy Evaluation?
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024-03, Vol. 46 (3), pp. 1868-1880
Main Authors: Weijian Deng, Liang Zheng
Format: Article
Language: English
Summary: Understanding model decisions under novel test scenarios is central to the community. A common practice is evaluating models on labeled test sets. However, many real-world scenarios involve unlabeled test data, rendering the common supervised evaluation protocols infeasible. In this paper, we investigate such an important but under-explored problem, named Automatic model Evaluation (AutoEval). Specifically, given a trained classifier, we aim to estimate its accuracy on various unlabeled test datasets. We construct a meta-dataset: a dataset comprised of datasets (sample sets) created from original images via various transformations such as rotation and background substitution. Correlation studies on the meta-dataset show that classifier accuracy exhibits a strong negative linear relationship with distribution shift (Pearson's Correlation r ...
ISSN: 0162-8828, 1939-3539, 2160-9292
DOI: 10.1109/TPAMI.2021.3136244
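
The summary above describes the core AutoEval recipe: measure the distribution shift between the original (source) data and a test set, then predict classifier accuracy from that shift using a regressor fitted on the meta-dataset, where each transformed sample set contributes one (shift, accuracy) pair. Below is a minimal sketch of that pipeline, assuming classifier features have already been extracted and using a Fréchet distance between feature Gaussians as the shift measure; the function names and the choice of a plain linear regressor are illustrative, not the paper's exact implementation.

```python
import numpy as np
from scipy import linalg
from sklearn.linear_model import LinearRegression

def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two feature sets (n x d)."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    covmean, _ = linalg.sqrtm(cov_a @ cov_b, disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # drop tiny imaginary parts from numerical error
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean))

def fit_accuracy_regressor(train_feats, meta_feats, meta_accuracies):
    """Fit accuracy ~ distribution shift on the labeled meta-dataset sample sets."""
    shifts = np.array([[frechet_distance(train_feats, f)] for f in meta_feats])
    return LinearRegression().fit(shifts, np.asarray(meta_accuracies))

def estimate_accuracy(regressor, train_feats, unlabeled_test_feats):
    """Predict classifier accuracy on an unlabeled test set from its shift alone."""
    shift = np.array([[frechet_distance(train_feats, unlabeled_test_feats)]])
    return float(regressor.predict(shift)[0])
```

At deployment time, only the features of the unlabeled test set are needed: no ground-truth labels enter the accuracy estimate, which is the point of AutoEval.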