
Combination of Information in Labeled and Unlabeled Data via Evidence Theory

Bibliographic Details
Published in: IEEE transactions on artificial intelligence 2024-05, Vol.5 (5), p.2179-2192
Main Author: Huang, Linqing
Format: Article
Language: English
Description
Summary: For classification with few labeled and many unlabeled patterns, co-training, which uses information in both labeled and unlabeled data to classify query patterns, is often employed to train classifiers in two distinct views. The classifiers teach each other by adding high-confidence unlabeled patterns to the training dataset of the other view. However, adding these patterns directly often harms the retrained classifiers, because some patterns with wrong predictions enter the training dataset. These wrong predictions must be accounted for to improve performance. To this end, we present a method called Combination of Information in Labeled and Unlabeled (CILU) data, based on evidence theory, to effectively extract and fuse complementary knowledge in labeled and unlabeled data. In CILU, patterns are characterized by two distinct views, and the unlabeled patterns with high-confidence predictions are first added into the other view. In each view, we can train two classifiers using the few labeled training data and the high-confidence unlabeled patterns. The classifiers are fused by evidence theory, and their weights, which aim to reduce the harmful influence of wrong predictions, are learned by constructing an objective function on labeled data. Because complementary information exists between the two distinct views, the fused classifiers in the two views are also combined. To extract more useful information from unlabeled data, a semi-supervised fuzzy C-means clustering paradigm is also employed to yield clustering results. For a query pattern, the classification results and clustering results obtained by the combined classifiers and the clustering partition are integrated to make the final class decision.
ISSN:2691-4581
DOI:10.1109/TAI.2023.3316194
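
The summary states that classifiers are fused by evidence theory with weights that limit the influence of wrong predictions. As a rough illustration of that general idea only, the sketch below applies Shafer discounting followed by Dempster's rule of combination to two classifier outputs. It is not the CILU procedure: the function names, the `OMEGA` label, and the hand-picked weights are assumptions made for this example, whereas the summary says the weights are learned from an objective function on labeled data and does not specify how the mass functions are built.

```python
from typing import Dict

# Hypothetical label for the whole frame of discernment (total ignorance).
OMEGA = "Omega"


def discount(probs: Dict[str, float], weight: float) -> Dict[str, float]:
    """Shafer discounting: scale each class mass by `weight` (in [0, 1])
    and move the remaining mass to OMEGA."""
    masses = {label: weight * p for label, p in probs.items()}
    masses[OMEGA] = 1.0 - weight * sum(probs.values())
    return masses


def dempster(m1: Dict[str, float], m2: Dict[str, float]) -> Dict[str, float]:
    """Dempster's rule for mass functions whose focal elements are
    single classes plus OMEGA."""
    classes = (set(m1) | set(m2)) - {OMEGA}
    combined = {}
    for label in classes:
        a, b = m1.get(label, 0.0), m2.get(label, 0.0)
        # A class keeps mass when both sources agree on it, or when one
        # source supports it and the other is ignorant.
        combined[label] = a * b + a * m2[OMEGA] + m1[OMEGA] * b
    combined[OMEGA] = m1[OMEGA] * m2[OMEGA]
    total = sum(combined.values())  # equals 1 minus the conflicting mass
    if total == 0.0:
        raise ValueError("sources are in total conflict")
    return {label: mass / total for label, mass in combined.items()}


# Two classifier outputs for one query pattern, with assumed weights.
m_a = discount({"a": 0.9, "b": 0.1}, weight=0.8)
m_b = discount({"a": 0.3, "b": 0.7}, weight=0.5)
fused = dempster(m_a, m_b)
decision = max((k for k in fused if k != OMEGA), key=fused.get)
```

A lower weight pushes more of a classifier's mass onto `OMEGA`, so an unreliable classifier contributes less to the fused decision instead of being dropped outright.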