Tests for high-dimensional covariance matrices

2019 ◽  
Vol 09 (03) ◽  
pp. 2050009
Author(s):  
Jing Chen ◽  
Xiaoyi Wang ◽  
Shurong Zheng ◽  
Baisen Liu ◽  
Ning-Zhong Shi

In this paper, we propose some new tests for high-dimensional covariance matrices that are applicable to generally distributed populations with finite fourth moments. The proposed test statistics are the maximum of the likelihood ratio test statistic and the statistic based on the Frobenius norm. The advantage of the new tests is the good performance in terms of power for both the traditional case, in which the dimension is much smaller than the sample size, and the high-dimensional case, in which the dimension is large compared to the sample size. In the one-sample case, the new test is proposed for testing the hypothesis that the high-dimensional covariance matrix equals an identity matrix. In the two-sample case, the new test is developed for testing the equality of two high-dimensional covariance matrices. By using the random matrix theory, the asymptotic distributions of the proposed new tests are derived under the assumption that the dimension and the sample size proportionally tend toward infinity. Finally, numerical studies are conducted to investigate the finite sample performance of the proposed new tests.
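
As a rough illustration of the two ingredients named in the abstract, the following Python sketch computes an unnormalized likelihood-ratio-type statistic and a Frobenius-norm-type statistic for the hypothesis that the covariance matrix is the identity. It is only a sketch under stated assumptions: the function name `identity_test_statistics` is hypothetical, and the centering, scaling, and maximum step that the authors use to combine the two statistics are not reproduced here.

```python
import numpy as np

def identity_test_statistics(X):
    """Return (LRT-type, Frobenius-type) statistics for H0: Sigma = I.

    X is an (n, p) data matrix with observations in rows. The LRT-type
    statistic needs p < n so that the sample covariance is nonsingular.
    Both quantities are unnormalized illustrations, not the paper's tests.
    """
    n, p = X.shape
    S = np.cov(X, rowvar=False)            # p x p sample covariance
    _, logdet = np.linalg.slogdet(S)       # numerically stable log-determinant
    lrt = np.trace(S) - logdet - p         # nonnegative, zero only when S = I
    frob = np.sum((S - np.eye(p)) ** 2)    # squared Frobenius norm of S - I
    return lrt, frob
```

Both values are close to zero when the sample covariance is near the identity and grow as it moves away, which is the behavior a test of this hypothesis relies on.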

Symmetry ◽  
2021 ◽  
Vol 13 (6) ◽  
pp. 936
Author(s):  
Dan Wang

In this paper, a ratio test based on bootstrap approximation is proposed to detect persistence change in heavy-tailed observations. This paper focuses on the symmetry testing problems of I(1)-to-I(0) and I(0)-to-I(1). On the basis of residual CUSUM, the test statistic is constructed in a ratio form. I derive the null distribution of the test statistic and discuss its consistency under the alternative hypothesis. However, the null distribution contains an unknown tail index. To address this challenge, I present a bootstrap approximation method for determining the rejection region of the test. Simulation studies on artificial data assess the finite sample performance and show that the proposed method outperforms the kernel method in all listed cases. The analysis of real data also demonstrates the excellent performance of this method.


2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Xue Ding

In this paper, we consider the limit properties of the largest entries of sample covariance matrices and sample correlation matrices. To make statistics based on these largest entries more applicable in high-dimensional tests, the assumption that the population is identically distributed is removed. Under some moment assumptions on the underlying distribution, we obtain the almost sure limit and the asymptotic distribution of the extreme statistics as both the dimension p and the sample size n tend to infinity.
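
To make the "largest entry" statistic concrete, here is a minimal Python sketch that computes the largest absolute off-diagonal entry of a sample correlation matrix. The function name `largest_offdiag_correlation` is hypothetical, and the normalization needed to reach the limiting distribution discussed in the abstract is not included.

```python
import numpy as np

def largest_offdiag_correlation(X):
    """Largest absolute off-diagonal entry of the sample correlation matrix.

    X is an (n, p) data matrix with observations in rows and p >= 2.
    """
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)   # p x p sample correlation matrix
    iu = np.triu_indices(p, k=1)       # indices strictly above the diagonal
    return float(np.max(np.abs(R[iu])))
```

For independent coordinates this value shrinks as n grows with p fixed, whereas a strongly correlated pair of coordinates pushes it toward 1.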


2014 ◽  
Vol 31 (5) ◽  
pp. 953-980 ◽  
Author(s):  
Zongwu Cai ◽  
Yunfei Wang ◽  
Yonggang Wang

It is well known that allowing the coefficients to be time-varying in a predictive model with possibly nonstationary regressors can help to deal with instability in predictability associated with linear predictive models. In this paper, an L2-type test statistic is proposed to test the stability of the coefficient vector, and the asymptotic distributions of the proposed test statistic are developed under both null and alternative hypotheses. A Monte Carlo experiment is conducted to evaluate the finite sample performance of the proposed test statistic and an empirical example is examined to demonstrate the practical application of the proposed testing method.


Biometrika ◽  
2019 ◽  
Vol 106 (3) ◽  
pp. 619-634 ◽  
Author(s):  
Ping-Shou Zhong ◽  
Runze Li ◽  
Shawn Santo

This paper deals with the detection and identification of changepoints among covariances of high-dimensional longitudinal data, where the number of features is greater than both the sample size and the number of repeated measurements. The proposed methods are applicable under general temporal-spatial dependence. A new test statistic is introduced for changepoint detection, and its asymptotic distribution is established. If a changepoint is detected, an estimate of the location is provided. The rate of convergence of the estimator is shown to depend on the data dimension, sample size, and signal-to-noise ratio. Binary segmentation is used to estimate the locations of possibly multiple changepoints, and the corresponding estimator is shown to be consistent under mild conditions. Simulation studies provide the empirical size and power of the proposed test and the accuracy of the changepoint estimator. An application to a time-course microarray dataset identifies gene sets with significant gene interaction changes over time.
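
The binary segmentation step mentioned above can be sketched in a much simpler setting. The Python example below applies it to mean shifts in a univariate sequence using a scaled CUSUM statistic. This is a deliberately swapped-in toy setting, not the covariance changepoint statistic of the paper, and the names `cusum_split`, `binary_segmentation`, `threshold`, and `min_len` are hypothetical.

```python
import numpy as np

def cusum_split(x):
    """Return (split index k, scaled CUSUM value) for the best single split."""
    n = len(x)
    k = np.arange(1, n)                          # candidate split points
    partial = np.cumsum(x)[:-1]                  # sums of the first k values
    stat = np.abs(partial - k * x.mean()) / np.sqrt(k * (n - k) / n)
    j = int(np.argmax(stat))
    return j + 1, float(stat[j])

def binary_segmentation(x, threshold, min_len=5, offset=0):
    """Recursively split x wherever the CUSUM statistic exceeds threshold."""
    x = np.asarray(x, dtype=float)
    if len(x) < 2 * min_len:
        return []
    k, stat = cusum_split(x)
    if stat < threshold or k < min_len or len(x) - k < min_len:
        return []
    # Search each side of the detected changepoint independently.
    left = binary_segmentation(x[:k], threshold, min_len, offset)
    right = binary_segmentation(x[k:], threshold, min_len, offset + k)
    return left + [offset + k] + right
```

Each returned index marks the first observation of a new segment. In practice the threshold would be calibrated from the null distribution of the chosen statistic, which this sketch does not attempt.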


2015 ◽  
Vol 04 (04) ◽  
pp. 1550019 ◽  
Author(s):  
Edgar Dobriban

Models from random matrix theory (RMT) are increasingly used to gain insights into the behavior of statistical methods under high-dimensional asymptotics. However, the applicability of the framework is limited by numerical problems. Consider the usual model of multivariate statistics where the data is a sample from a multivariate distribution with a given covariance matrix. Under high-dimensional asymptotics, there is a deterministic map from the distribution of eigenvalues of the population covariance matrix (the population spectral distribution or PSD) to the empirical spectral distribution (ESD). The current methods for computing this map are inefficient, and this limits the applicability of the theory. We propose a new method to compute numerically the ESD from an arbitrary input PSD. Our method, called SPECTRODE, finds the support and the density of the ESD to high precision; we prove this for finite discrete distributions. In computational experiments SPECTRODE outperforms existing methods by orders of magnitude in speed and accuracy. We apply it to compute expectations and contour integrals of the ESD, which are often central in applications. We also illustrate that SPECTRODE is directly useful in statistical problems, such as estimation and hypothesis testing for covariance matrices. Our proposal, implemented in open source software, may broaden the use of RMT in high-dimensional data analysis.
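
In one special case the PSD-to-ESD map has a closed form: when the population covariance is the identity (the PSD is a point mass at 1), the limiting ESD is the Marchenko-Pastur law. The Python sketch below evaluates that density for an aspect ratio gamma = p/n in (0, 1]. It illustrates the map only in this special case and is not the SPECTRODE algorithm; the function name `marchenko_pastur_density` is hypothetical.

```python
import numpy as np

def marchenko_pastur_density(x, gamma):
    """Marchenko-Pastur density for identity covariance, 0 < gamma <= 1.

    The support is [(1 - sqrt(gamma))^2, (1 + sqrt(gamma))^2]; the density
    is zero outside that interval.
    """
    a = (1.0 - np.sqrt(gamma)) ** 2    # lower edge of the support
    b = (1.0 + np.sqrt(gamma)) ** 2    # upper edge of the support
    x = np.atleast_1d(np.asarray(x, dtype=float))
    dens = np.zeros_like(x)
    inside = (x > a) & (x < b)
    xi = x[inside]
    dens[inside] = np.sqrt((b - xi) * (xi - a)) / (2.0 * np.pi * gamma * xi)
    return dens
```

A histogram of the eigenvalues of a sample covariance matrix built from independent standard normal data with p/n close to gamma should track this curve. For a general PSD no such formula is available, which is the numerical problem the abstract addresses.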


2018 ◽  
Vol 8 (2) ◽  
pp. 289-312
Author(s):  
Dane Taylor ◽  
Juan G Restrepo ◽  
François G Meyer

Covariance matrices are fundamental to the analysis and forecast of economic, physical and biological systems. Although the eigenvalues $\{\lambda _i\}$ and eigenvectors $\{\boldsymbol{u}_i\}$ of a covariance matrix are central to such endeavours, in practice one must inevitably approximate the covariance matrix based on data with finite sample size $n$ to obtain empirical eigenvalues $\{\tilde{\lambda }_i\}$ and eigenvectors $\{\tilde{\boldsymbol{u}}_i\}$, and therefore understanding the error so introduced is of central importance. We analyse eigenvector error $\|\boldsymbol{u}_i - \tilde{\boldsymbol{u}}_i \|^2$ while leveraging the assumption that the true covariance matrix having size $p$ is drawn from a matrix ensemble with known spectral properties. In particular, we assume the distribution of population eigenvalues weakly converges as $p\to \infty $ to a spectral density $\rho (\lambda )$ and that the spacing between population eigenvalues is similar to that for the Gaussian orthogonal ensemble. Our approach complements previous analyses of eigenvector error that require the full set of eigenvalues to be known, which can be computationally infeasible when $p$ is large. To provide a scalable approach for uncertainty quantification of eigenvector error, we consider a fixed eigenvalue $\lambda $ and approximate the distribution of the expected square error $r= \mathbb{E}\left [\| \boldsymbol{u}_i - \tilde{\boldsymbol{u}}_i \|^2\right ]$ across the matrix ensemble for all $\boldsymbol{u}_i$ associated with $\lambda _i=\lambda $. We find, for example, that for sufficiently large matrix size $p$ and sample size $n> p$, the probability density of $r$ scales as $1/nr^2$. This power-law scaling implies that the eigenvector error is extremely heterogeneous: even if $r$ is very small for most eigenvectors, it can be large for others with non-negligible probability. We support this and further results with numerical experiments.


2015 ◽  
Vol 32 (4) ◽  
pp. 988-1022 ◽  
Author(s):  
Yiguo Sun ◽  
Zongwu Cai ◽  
Qi Li

In this paper, we propose a simple nonparametric test for testing the null hypothesis of constant coefficients against nonparametric smooth coefficients in a semiparametric varying coefficient model with integrated time series. We establish the asymptotic distributions of the proposed test statistic under both null and alternative hypotheses. Moreover, we derive a central limit theorem for a degenerate second order U-statistic, which contains a mixture of stationary and nonstationary variables and is weighted locally on a stationary variable. This result is of independent interest and useful in other applications. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed test.


PLoS ONE ◽  
2021 ◽  
Vol 16 (6) ◽  
pp. e0253349
Author(s):  
Ana C. Guedes ◽  
Francisco Cribari-Neto ◽  
Patrícia L. Espinheira

Beta regressions are commonly used with responses that assume values in the standard unit interval, such as rates, proportions and concentration indices. Hypothesis testing inferences on the model parameters are typically performed using the likelihood ratio test. It delivers accurate inferences when the sample size is large, but can otherwise lead to unreliable conclusions. It is thus important to develop alternative tests with superior finite sample behavior. We derive the Bartlett correction to the likelihood ratio test under the more general formulation of the beta regression model, i.e. under varying precision. The model contains two submodels, one for the mean response and a separate one for the precision parameter. Our interest lies in performing testing inferences on the parameters that index both submodels. We use three Bartlett-corrected likelihood ratio test statistics that are expected to yield superior performance when the sample size is small. We present Monte Carlo simulation evidence on the finite sample behavior of the Bartlett-corrected tests relative to the standard likelihood ratio test and to two improved tests that are based on an alternative approach. The numerical evidence shows that one of the Bartlett-corrected tests typically delivers accurate inferences even when the sample is quite small. An empirical application related to behavioral biometrics is presented and discussed.

