Loading…
A semi-supervised multi-label classification framework with feature reduction and enrichment
Multi-label classification (MLC) has drawn much attention, thanks to its usefulness and omnipresence in real-world applications in which objects may be characterized by more than one label as in the traditional approach. Getting multi-label examples is costly and time-consuming; therefore, semi-supe...
Saved in:
Published in: | Journal of information and telecommunication (Print) 2017-04, Vol.1 (2), p.141-154 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Multi-label classification (MLC) has drawn much attention, thanks to its usefulness and omnipresence in real-world applications in which objects may be characterized by more than one label as in the traditional approach. Getting multi-label examples is costly and time-consuming; therefore, semi-supervised learning approach should be considered to take advantages of both labelled and unlabelled data. In this work, we propose a semi-supervised MLC algorithm exploiting the specific features of the prominent class label(s) chosen by a greedy approach as an extension of the LIFT algorithm, and unlabelled data consumption mechanism from the TExt classification using Semi-supervised Clustering algorithm. We also make a semi-supervised MLC application framework for Vietnamese texts with several feature enrichment steps including (a) a stage of enriching features by adding hidden topic features and (b) a stage of dimensional reduction for subtracting irrelevant features. Experimental results on a data set of hotel reviews (for tourism) indicate that a reasonable amount of unlabelled data helps to increase the F1 score. Interestingly, with a small amount of labelled data, our algorithm can reach a comparative performance to the case of using a larger amount of labelled data. |
---|---|
ISSN: | 2475-1839 2475-1847 |
DOI: | 10.1080/24751839.2017.1323486 |