Neighborhood preserving D-optimal design for active learning and its application to terrain classification

Bibliographic Details
Published in:	Neural computing & applications 2013-12, Vol.23 (7-8), p.2085-2092
Main Authors:	Gu, Yingjie; Jin, Zhong
Format:	Article
Language:	English
Subjects:	Applied sciences; Artificial Intelligence; Computational Biology/Bioinformatics; Computational Science and Engineering; Computer Science; Computer science control theory systems; Data Mining and Knowledge Discovery; Data processing. List processing. Character string processing; Earth sciences; Earth, ocean, space; Exact sciences and technology; Geophysics: general, magnetic, electric and thermic methods and properties; Image Processing and Computer Vision; Internal geophysics; Memory organisation. Data processing; Original Article; Pattern recognition. Digital image processing. Computational geometry; Probability and Statistics in Computer Science; Software

Description
Summary:	In many real-world applications, labeled data are expensive to obtain, while unlabeled data are often plentiful. To reduce the labeling cost, active learning attempts to discover the most informative data points for labeling. The challenge is deciding which unlabeled samples to label so as to improve the classifier the most. Classical optimal experimental design algorithms are based on least-squares errors over the labeled samples only, while the unlabeled points are ignored. In this paper, we propose a novel active learning algorithm called neighborhood preserving D-optimal design. Our algorithm is based on a neighborhood preserving regression model which simultaneously minimizes the least-squares error on the measured samples and preserves the neighborhood structure of the data space. It selects the most informative samples, namely those which minimize the variance of the regression parameter. We also extend our algorithm to the nonlinear case by using the kernel trick. Experimental results on terrain classification show the effectiveness of the proposed approach.
ISSN:	0941-0643, 1433-3058
DOI:	10.1007/s00521-012-1155-3
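
Illustration:	The summary gives no formulas, so the following is only a minimal sketch of the general idea and not the authors' algorithm. It assumes a linear model, a uniform k-nearest-neighbor weight matrix W as the neighborhood model, and a greedy search that maximizes the determinant of the regularized information matrix (the D-optimality criterion). The function names and the parameters k, lam_np and lam_ridge are hypothetical.

```python
import numpy as np

def knn_weights(X, k):
    """Uniform k-nearest-neighbor weights (an assumed neighborhood model)."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)                      # exclude self-matches
    idx = np.argsort(d2, axis=1)[:, :k]               # k closest points per row
    W = np.zeros((n, n))
    W[np.arange(n)[:, None], idx] = 1.0 / k
    return W

def select_samples(X, m, k=5, lam_np=0.1, lam_ridge=0.01):
    """Greedily pick m rows of X that maximize the determinant of
    sum_i z_i z_i^T + lam_np * X^T M X + lam_ridge * I,
    where M = (I - W)^T (I - W) penalizes breaking neighborhood structure."""
    n, d = X.shape
    R = np.eye(n) - knn_weights(X, k)
    A = lam_np * (X.T @ R.T @ R @ X) + lam_ridge * np.eye(d)
    A_inv = np.linalg.inv(A)
    available = np.ones(n, dtype=bool)
    chosen = []
    for _ in range(m):
        # Matrix determinant lemma: det(A + x x^T) = det(A) * (1 + x^T A^-1 x),
        # so the best next point has the largest x^T A^-1 x.
        gains = np.einsum("ij,jk,ik->i", X, A_inv, X)
        gains[~available] = -np.inf
        i = int(np.argmax(gains))
        chosen.append(i)
        available[i] = False
        # Sherman-Morrison rank-one update of the inverse.
        v = A_inv @ X[i]
        A_inv = A_inv - np.outer(v, v) / (1.0 + X[i] @ v)
    return chosen

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 3))   # synthetic data, for illustration only
picked = select_samples(X_demo, m=5)
```

Here picked holds the indices of the five rows that would be sent for labeling. Maximizing the determinant of the information matrix is equivalent to shrinking the volume of the confidence region of the regression parameter, which is how the sketch relates to the variance criterion mentioned in the summary.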