
Combining supervised and unsupervised models via unconstrained probabilistic embedding

Bibliographic Details
Published in: Information Sciences 2014-02, Vol. 257, p. 101-114
Main Authors: Ao, Xiang; Luo, Ping; Ma, Xudong; Zhuang, Fuzhen; He, Qing; Shi, Zhongzhi; Shen, Zhiyong
Format: Article
Language: English
Description
Summary: In this study, we consider an ensemble problem in which we combine the outputs of models developed in supervised and unsupervised modes. By jointly considering the grouping results of the unsupervised models, we aim to improve the classification accuracy of the supervised model ensemble. We formulate the ensemble task as an Unconstrained Probabilistic Embedding (UPE) problem. Specifically, we assume that both objects and classes/clusters have unconstrained latent coordinates in a D-dimensional Euclidean space, and we treat the mapping from the embedded space into the space of model results as a probabilistic generative process. A solution to this embedding can be obtained using the quasi-Newton method, which embeds objects and classes/clusters with high co-occurrence weights close to one another. The prediction for an object is then determined by its distances to the classes in the embedded space. We demonstrate the benefits of this unconstrained embedding method through extensive and systematic experiments on real-world datasets. Furthermore, we conduct experiments to investigate how the quality and the number of clustering models affect the performance of this ensemble method. We also show the robustness of the proposed model.
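Note: the summary does not give the exact form of the generative process, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that the probability of a class or cluster given an object is a softmax over negative squared Euclidean distances in the embedded space. All function names (`sq_dist`, `class_probabilities`, `predict`, `neg_log_likelihood`) and the toy coordinates are hypothetical.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two coordinate tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def class_probabilities(obj, class_coords):
    """ASSUMED form: p(class | object) proportional to exp(-squared distance).
    The record's summary does not specify the actual generative model."""
    scores = [-sq_dist(obj, c) for c in class_coords]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(obj, class_coords):
    """Predict the index of the class nearest to the object in the embedded space."""
    return min(range(len(class_coords)),
               key=lambda k: sq_dist(obj, class_coords[k]))

def neg_log_likelihood(obj_coords, class_coords, weights):
    """Objective a fitting step could minimize under the assumed model.
    weights[i][k] is the co-occurrence weight between object i and class/cluster k."""
    nll = 0.0
    for i, obj in enumerate(obj_coords):
        probs = class_probabilities(obj, class_coords)
        for k, w in enumerate(weights[i]):
            if w > 0:
                nll -= w * math.log(probs[k])
    return nll

# Toy example: two classes embedded in a 2-dimensional space (D = 2).
classes = [(0.0, 0.0), (3.0, 0.0)]
label = predict((0.5, 0.2), classes)  # index of the nearest class
```

The fitting step itself is omitted. According to the summary it is carried out with a quasi-Newton method; under the assumption above, that would amount to minimizing an objective such as `neg_log_likelihood` over the object and class/cluster coordinates, so that pairs with high co-occurrence weights end up close together.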
ISSN: 0020-0255, 1872-6291
DOI: 10.1016/j.ins.2013.08.048