Loading…

Generalization Performance of Pure Accuracy and its Application in Selective Ensemble Learning

The pure accuracy measure is used to eliminate random consistency from the accuracy measure. Biases to both majority and minority classes in the pure accuracy are lower than that in the accuracy measure. In this paper, we demonstrate that compared with the accuracy measure and F-measure, the pure ac...

Full description

Saved in:
Bibliographic Details
Published in:IEEE transactions on pattern analysis and machine intelligence 2023-02, Vol.45 (2), p.1798-1816
Main Authors: Wang, Jieting, Qian, Yuhua, Li, Feijiang, Liang, Jiye, Zhang, Qingfu
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:The pure accuracy measure is used to eliminate random consistency from the accuracy measure. Biases to both majority and minority classes in the pure accuracy are lower than that in the accuracy measure. In this paper, we demonstrate that compared with the accuracy measure and F-measure, the pure accuracy measure is class distribution insensitive and discriminative for good classifiers. The advantages make the pure accuracy measure suitable for traditional classification. Further, we mainly focus on two points: exploring a tighter generalization bound on pure accuracy based learning paradigm and designing a learning algorithm based on the pure accuracy measure. Particularly, with the self-bounding property, we build an algorithm-independent generalization bound on the pure accuracy measure, which is tighter than the existing bound of an order O(1/\sqrt{N}) O(1/N) (N is the number of instances). The proposed bound is free from making a smoothness or convex assumption on the hypothesis functions. In addition, we design a learning algorithm optimizing the pure accuracy measure and use it in the selective ensemble learning setting. The experiments on sixteen benchmark data sets and four image data sets demonstrate that the proposed method statistically performs better than the other eight representative benchmark algorithms.
ISSN:0162-8828
1939-3539
2160-9292
DOI:10.1109/TPAMI.2022.3171436