Variance Variation Criterion and Consistency in Estimating the Number of Significant Signals of High-dimensional PCA
Published in: | Acta Mathematicae Applicatae Sinica 2022-07, Vol.38 (3), p.513-531 |
---|---|
Format: | Article |
Language: | English |
Summary: | In this paper, we propose a criterion based on the variance variation of the sample eigenvalues to correctly estimate the number of significant components in high-dimensional principal component analysis (PCA), which corresponds to the number of significant eigenvalues of the covariance matrix for p-dimensional variables. Using random matrix theory, we establish the consistency of the proposed criterion both when the significant eigenvalues tend to infinity and when the significant population eigenvalues are bounded. Numerical simulations show that, under the finite fourth moment condition and as the dominant population eigenvalues tend to infinity, the probability that the estimator is correct converges to 1 faster under our variance variation criterion than under the criterion of Passemier and Yao [Estimation of the number of spikes, possibly equal, in the high-dimensional case. J. Multivariate Anal., (2014)] (PYC), AIC and BIC. Moreover, when the maximum eigenvalue is bounded and the gap condition is satisfied, the rate of convergence to 1 is faster than that of PYC and AIC, and the improvement over AIC is most pronounced when the sample size is small. It is worth noting that the variance variation criterion significantly improves the accuracy of model selection compared with PYC and AIC when the random variable follows a heavy-tailed distribution or its fourth moment is not finite. |
---|---|
ISSN: | 0168-9673 1618-3932 |
DOI: | 10.1007/s10255-022-1094-4 |