ON SIMULTANEOUS CALIBRATION OF TWO-SAMPLE t-TESTS FOR HIGH-DIMENSION LOW-SAMPLE-SIZE DATA

Bibliographic Details
Published in: Statistica Sinica, 2021-07, Vol. 31 (3), pp. 1189-1214
Main Authors: Zhang, Chunming; Jia, Shengji; Wu, Yongfeng
Format: Article
Language: English
Description
Summary: The exact distribution of a two-sample t-statistic in a single test for equal population means is typically unavailable if we have non-Gaussian samples, unequal population variances, or unequal sample sizes $n_1$ and $n_2$. In this case, a calibration method using a reference distribution offers a practically feasible substitute. This study simultaneously calibrates a diverging number $m$ of two-sample t-statistics for inferences of significance in high-dimensional data from a small sample. For the Gaussian calibration method, we demonstrate the following. First, the simultaneous “general” two-sample t-statistics achieve the overall significance level, as long as $\log(m)$ increases at a strictly slower rate than $(n_1 + n_2)^{1/3}$ as $n_1 + n_2$ diverges. Second, directly applying the same calibration method to simultaneous “pooled” two-sample t-statistics may substantially lose the overall level accuracy. The proposed “adaptively pooled” two-sample t-statistics overcome such incoherence, while operating as simply and performing as well as the “general” two-sample t-statistics. Third, we propose a “two-stage” t-test procedure to effectively alleviate the skewness commonly encountered in various two-sample t-statistics in practice, thus increasing the calibration accuracy. Lastly, we discuss the implications of these results using simulation studies and real-data applications.
ISSN: 1017-0405, 1996-8507
DOI: 10.5705/ss.202018.0467
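
Illustration: the summary contrasts "general" and "pooled" two-sample t-statistics calibrated against a Gaussian reference distribution. The sketch below shows what those two textbook statistics and a standard normal calibration might look like in code. It is not the authors' procedure: the "adaptively pooled" and "two-stage" methods are not specified in the summary and are not reproduced here. The function names are hypothetical, and the Bonferroni rule used to control the overall level across the m tests is an assumption made for illustration only.

```python
import math
import random
from statistics import NormalDist, fmean, variance


def general_t(x, y):
    """Unpooled ("general", Welch-type) two-sample t-statistic."""
    n1, n2 = len(x), len(y)
    se = math.sqrt(variance(x) / n1 + variance(y) / n2)
    return (fmean(x) - fmean(y)) / se


def pooled_t(x, y):
    """Classical pooled-variance two-sample t-statistic."""
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (fmean(x) - fmean(y)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))


def gaussian_pvalue(t):
    """Two-sided p-value using N(0, 1) as the reference distribution."""
    return 2.0 * NormalDist().cdf(-abs(t))


def bonferroni_reject(t_stats, alpha=0.05):
    """Assumed multiplicity rule: reject test j if p_j <= alpha / m."""
    m = len(t_stats)
    return [gaussian_pvalue(t) <= alpha / m for t in t_stats]


def demo(m=200, n1=8, n2=12, seed=0):
    """Simulate m null features with unequal variances and sample sizes."""
    rng = random.Random(seed)
    general, pooled = [], []
    for _ in range(m):
        x = [rng.gauss(0.0, 3.0) for _ in range(n1)]  # larger variance
        y = [rng.gauss(0.0, 1.0) for _ in range(n2)]  # smaller variance
        general.append(general_t(x, y))
        pooled.append(pooled_t(x, y))
    print("false rejections (general):", sum(bonferroni_reject(general)))
    print("false rejections (pooled): ", sum(bonferroni_reject(pooled)))


if __name__ == "__main__":
    demo()
```

When $n_1 = n_2$ the two statistics coincide algebraically, so any difference between them under Gaussian calibration can only appear with unequal sample sizes, which is why the demo uses $n_1 \ne n_2$ together with unequal variances.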