K nearest neighbours with mutual information for simultaneous classification and missing data imputation

Bibliographic Details
Published in: Neurocomputing (Amsterdam), 2009-03, Vol. 72 (7), p. 1483-1493
Main Authors: García-Laencina, Pedro J., Sancho-Gómez, José-Luis, Figueiras-Vidal, Aníbal R., Verleysen, Michel
Format: Article
Language: English
Description
Summary: Missing data is a common drawback in many real-life pattern classification scenarios. One of the most popular solutions is missing data imputation by the K nearest neighbours (KNN) algorithm. In this article, we propose a novel KNN imputation procedure using a feature-weighted distance metric based on mutual information (MI). This method provides a missing data estimation aimed at solving the classification task, i.e., it provides an imputed dataset which is directed toward improving the classification performance. The MI-based distance metric is also used to implement an effective KNN classifier. Experimental results on both artificial and real classification datasets are provided to illustrate the efficiency and the robustness of the proposed algorithm.
ISSN: 0925-2312, 1872-8286
DOI: 10.1016/j.neucom.2008.11.026
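
Illustration: the summary describes KNN imputation under a distance metric whose features are weighted by their mutual information with the class label. The following is a minimal sketch of that general idea, not the procedure from the article. Everything in it is an assumption made for illustration: the function names (`mutual_information`, `mi_weights`, `weighted_distance`, `knn_impute`), the equal-width binning used to estimate MI, the averaging of the weighted squared differences over shared features, the plain mean over donors, and the toy data.

```python
import math
from collections import Counter


def mutual_information(values, labels, bins=4):
    """Plug-in MI estimate (in nats) between one numeric feature and the labels.

    Assumption: equal-width binning; rows where the feature is None are skipped.
    """
    pairs = [(v, c) for v, c in zip(values, labels) if v is not None]
    lo = min(v for v, _ in pairs)
    hi = max(v for v, _ in pairs)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant features
    binned = [(min(int((v - lo) / width), bins - 1), c) for v, c in pairs]
    n = len(binned)
    joint = Counter(binned)
    px = Counter(b for b, _ in binned)
    py = Counter(c for _, c in binned)
    return sum((nxy / n) * math.log(nxy * n / (px[b] * py[c]))
               for (b, c), nxy in joint.items())


def mi_weights(rows, labels):
    """One MI-based weight per feature column."""
    return [mutual_information([r[j] for r in rows], labels)
            for j in range(len(rows[0]))]


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance over the features observed in both rows."""
    shared = [j for j in range(len(a)) if a[j] is not None and b[j] is not None]
    if not shared:
        return math.inf
    total = sum(weights[j] * (a[j] - b[j]) ** 2 for j in shared)
    return math.sqrt(total / len(shared))


def knn_impute(rows, labels, k=3):
    """Fill each missing entry with the mean over its k nearest donor rows."""
    weights = mi_weights(rows, labels)
    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are the other rows in which feature j is observed.
            donors = sorted(
                (weighted_distance(row, other, weights), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )[:k]
            if donors:  # leave the gap in place if no donor exists
                imputed[i][j] = sum(v for _, v in donors) / len(donors)
    return imputed


# Toy data: feature 0 separates the classes, feature 1 is noise,
# and feature 2 is missing in the last row.
rows = [
    [1.0, 1.0, 100.0],
    [1.2, 9.0, 110.0],
    [0.8, 5.0, 90.0],
    [5.0, 1.2, 500.0],
    [5.2, 8.8, 510.0],
    [4.8, 5.2, 490.0],
    [1.1, 5.1, None],
]
labels = [0, 0, 0, 1, 1, 1, 0]

weights = mi_weights(rows, labels)
imputed = knn_impute(rows, labels, k=3)
print(weights, imputed[6][2])
```

In this toy case the noise feature receives a much smaller weight than the class-separating feature, so the three donors chosen for the last row all come from its own class. An unweighted distance on the same data would pull in a donor from the other class.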