Information Loss of the Mahalanobis Distance in High Dimensions: Application to Feature Selection

Bibliographic Details
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009-12, Vol. 31 (12), pp. 2275-2281
Main Authors: Ververidis, D., Kotropoulos, C.
Format: Article
Language:English
Subjects:
Description
Summary:When an infinite training set is used, the Mahalanobis distance between a pattern measurement vector of dimensionality D and the center of the class it belongs to is distributed as a chi-squared (χ²) variable with D degrees of freedom. However, for finite training sets the distribution of the Mahalanobis distance becomes either a Fisher or a Beta distribution, depending on whether cross-validation or resubstitution is used for parameter estimation. The total variation between the chi-squared and Fisher distributions, as well as between the chi-squared and Beta distributions, allows us to measure the information loss in high dimensions. The information loss is then exploited to set a lower limit on the correct classification rate achieved by the Bayes classifier used in subset feature selection.
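As a rough numerical illustration of the infinite-training-set case described in the summary (a sketch using NumPy/SciPy, not a reproduction of the paper's experiments): when the true class mean and covariance are known, the squared Mahalanobis distance of a Gaussian sample to its class center follows a χ² distribution with D degrees of freedom, which a Kolmogorov-Smirnov test against the χ²_D CDF is consistent with.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
D = 5       # dimensionality of the measurement vector
n = 20000   # number of samples drawn from the class

# Arbitrary positive-definite covariance and mean for the class
# (hypothetical values chosen only for this illustration).
A = rng.standard_normal((D, D))
cov = A @ A.T + D * np.eye(D)
mean = rng.standard_normal(D)

# Draw samples from the class distribution.
x = rng.multivariate_normal(mean, cov, size=n)

# Squared Mahalanobis distance to the true class center,
# using the true (population) covariance -- the "infinite
# training set" setting, with no parameter-estimation error.
inv_cov = np.linalg.inv(cov)
diff = x - mean
d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)

# Under these conditions d2 should follow chi-squared with D dof.
ks = stats.kstest(d2, stats.chi2(df=D).cdf)
print(f"sample mean of d2: {d2.mean():.3f} (chi2_D mean is {D})")
print(f"KS statistic vs chi2_{D}: {ks.statistic:.4f}")
```

With finite training sets, replacing `mean` and `cov` by sample estimates (via resubstitution or cross-validation) is what shifts the distribution toward the Beta or Fisher forms analyzed in the article.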
ISSN:0162-8828
1939-3539
DOI:10.1109/TPAMI.2009.84