On high dimensional two-sample tests based on nearest neighbors


Bibliographic Details
Published in: Journal of Multivariate Analysis, 2015-10, Vol. 141, p. 168-178
Main Authors: Mondal, Pronoy K.; Biswas, Munmun; Ghosh, Anil K.
Format: Article
Language: English
Description
Summary: In this article, we propose new multivariate two-sample tests based on nearest neighbor type coincidences. Several existing tests for the multivariate two-sample problem perform poorly for high dimensional data, and many of them are not applicable when the dimension exceeds the sample size. The proposed tests, by contrast, can be conveniently used in high dimension, low sample size (HDLSS) situations. Unlike the nearest neighbor tests of Schilling (1986) [26] and Henze (1988), these new tests are found, under fairly general conditions, to be consistent in the HDLSS asymptotic regime, where the sample size remains fixed and the dimension grows to infinity. Several high dimensional simulated and real data sets are analyzed to compare the empirical performance of the proposed tests with that of some popular two-sample tests available in the literature. We further investigate the behavior of the proposed tests in the classical asymptotic regime, where the dimension of the data remains fixed and the sample size tends to infinity. In such cases, they turn out to be asymptotically distribution-free and consistent under general alternatives.
ISSN: 0047-259X, 1095-7243
DOI: 10.1016/j.jmva.2015.07.002
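
For orientation, the classical nearest neighbor coincidence statistic that the summary attributes to Schilling (1986) and Henze (1988) can be sketched as follows. This is not the new tests proposed in the article, whose construction the summary does not give. The statistic is the proportion of k-nearest-neighbor comparisons in the pooled sample where a point and its neighbor come from the same sample. The function names, the default k = 3, the use of Euclidean distance, and the permutation calibration of the p-value are assumptions made for illustration only.

```python
import numpy as np


def knn_indices(z, k):
    """Indices of the k nearest neighbors (Euclidean) of each row of z."""
    sq = np.sum(z ** 2, axis=1)
    # Pairwise squared distances within the pooled sample.
    d2 = sq[:, None] + sq[None, :] - 2.0 * (z @ z.T)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbor
    return np.argsort(d2, axis=1)[:, :k]


def coincidence_stat(nn, labels):
    """Proportion of (point, neighbor) pairs that share a sample label."""
    return float(np.mean(labels[nn] == labels[:, None]))


def nn_permutation_test(x, y, k=3, n_perm=999, seed=0):
    """Classical nearest neighbor two-sample statistic with a permutation
    p-value (the permutation calibration is an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    z = np.vstack([x, y])
    labels = np.r_[np.zeros(len(x), dtype=int), np.ones(len(y), dtype=int)]
    nn = knn_indices(z, k)  # neighbors do not depend on the labels
    observed = coincidence_stat(nn, labels)
    perm = np.array(
        [coincidence_stat(nn, rng.permutation(labels)) for _ in range(n_perm)]
    )
    # Large values indicate that same-sample neighbors are over-represented.
    p_value = (1 + np.sum(perm >= observed)) / (n_perm + 1)
    return observed, float(p_value)
```

Calling `nn_permutation_test(x, y)` on two arrays of shape (n, d) and (m, d) returns the observed statistic and a permutation p-value. Large values of the statistic suggest that the two samples come from different distributions. As the summary notes, tests of this classical form can lose consistency when the dimension grows with the sample size fixed, which is the situation the article's proposals address.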