Interpoint distance tests for high-dimensional comparison studies

2019 ◽  
Vol 47 (4) ◽  
pp. 653-665 ◽  
Author(s):  
Marco Marozzi ◽  
Amitava Mukherjee ◽  
Jan Kalina
2016 ◽  
Vol 25 (6) ◽  
pp. 2593-2610 ◽  
Author(s):  
Marco Marozzi

The multivariate location problem is addressed. The most familiar method for this problem is the Hotelling test, which is optimal when the hypothesis of normal distributions holds. Unfortunately, in practice the distributions underlying the samples are generally unknown, and without assuming normality the finite-sample unbiasedness of the Hotelling test is not guaranteed. Moreover, high-dimensional data are increasingly encountered when analyzing medical and biological problems, and in these situations the Hotelling test performs poorly or cannot be computed. A test that is unbiased for non-normal data, for small sample sizes and for two-sided alternatives, and that can be computed for high-dimensional data, has recently been proposed; it is based on the ranks of the interpoint Euclidean distances between observations. Five modifications of this test are proposed and compared to the original test and the Hotelling test. Unbiasedness and consistency of the tests are proven and the problem of power computation is addressed. It is shown that two of the modified interpoint distance-based tests are always more powerful than the original test. In particular, the modified test based on the Tippett criterion is suggested when the assumption of normality is not tenable and/or in the case of high-dimensional data with a complex dependence structure, as is typical in molecular biology and medical imaging. A practical application to a case-control study in which functional magnetic resonance imaging is used is discussed.
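The general idea of a rank test on interpoint distances combined with the Tippett criterion can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure from the paper: the choice of every pooled observation as a reference point, the mean-rank-difference statistic, the permutation scheme, and all function names (`distance_ranks`, `partial_stats`, `tippett_interpoint_test`) are hypothetical. Each sample needs at least two observations, and the dimension may exceed the sample sizes because only pairwise Euclidean distances are used.

```python
import math
import random


def distance_ranks(pooled):
    """ranks[k][i] = rank of dist(pooled[i], pooled[k]) among all i != k (1 = closest)."""
    n = len(pooled)
    ranks = []
    for k in range(n):
        order = sorted((i for i in range(n) if i != k),
                       key=lambda i: math.dist(pooled[i], pooled[k]))
        ranks.append({i: r for r, i in enumerate(order, start=1)})
    return ranks


def partial_stats(ranks, in_first):
    """One two-sided statistic per reference point: |mean rank group 1 - mean rank group 2|."""
    stats = []
    for row in ranks:
        g1 = [r for i, r in row.items() if in_first[i]]
        g2 = [r for i, r in row.items() if not in_first[i]]
        stats.append(abs(sum(g1) / len(g1) - sum(g2) / len(g2)))
    return stats


def tippett_interpoint_test(x, y, n_perm=499, seed=0):
    """Permutation p-value; partial tests are combined by their smallest p-value (Tippett)."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n, n1 = len(pooled), len(x)
    ranks = distance_ranks(pooled)  # distances do not change under relabeling
    labels = [True] * n1 + [False] * (n - n1)

    # Row 0 holds the observed partial statistics, the other rows the permuted ones.
    table = [partial_stats(ranks, labels)]
    for _ in range(n_perm):
        perm = labels[:]
        rng.shuffle(perm)
        table.append(partial_stats(ranks, perm))
    total = len(table)

    def column_pvalues(col):
        # Permutation p-value of every row's statistic within one partial test.
        return [sum(t >= s for t in col) / total for s in col]

    p_cols = [column_pvalues([row[k] for row in table]) for k in range(n)]
    # Tippett combination: small values of the minimum partial p-value are significant.
    min_p = [min(p_cols[k][b] for k in range(n)) for b in range(total)]
    return sum(m <= min_p[0] for m in min_p) / total


if __name__ == "__main__":
    rng = random.Random(1)
    x = [[rng.gauss(0.0, 1.0) for _ in range(50)] for _ in range(8)]
    y = [[rng.gauss(0.5, 1.0) for _ in range(50)] for _ in range(8)]
    print(tippett_interpoint_test(x, y))
```

In this sketch the same set of label permutations is reused for all partial tests, so the dependence between them is carried into the reference distribution of the combined statistic rather than assumed away.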


2011 ◽  
Vol 11 (3) ◽  
pp. 272
Author(s):  
Ivan Gavrilyuk ◽  
Boris Khoromskij ◽  
Eugene Tyrtyshnikov

In recent years, multidimensional numerical simulations with tensor-structured data formats have been recognized as the basic concept for breaking the "curse of dimensionality". Modern applications of tensor methods include challenging high-dimensional problems in materials science, bioscience, stochastic modeling, signal processing, machine learning, data mining, financial mathematics, etc. The guiding principle of tensor methods is to approximate multivariate functions and operators with some separation of variables, so that the computational process stays on a low-parametric tensor-structured manifold. Tensor structures had been widely used as models of data and discussed in the contexts of differential geometry, mechanics, algebraic geometry, data analysis, etc., before tensor methods recently penetrated numerical computations. On the one hand, the existing tensor representation formats remained of limited use in many high-dimensional problems because of a lack of sufficiently reliable and fast software. On the other hand, for moderate-dimensional problems (e.g. in "ab-initio" quantum chemistry) as well as for selected model problems of very high dimension, the application of the traditional canonical and Tucker formats in combination with the ideas of multilevel methods has led to new efficient algorithms. The recent progress in tensor numerical methods has been achieved with new representation formats now known as "tensor-train representations" and "hierarchical Tucker representations". Note that the formats themselves could have been picked up earlier in the literature on the modeling of quantum systems. Until 2009 they lived in the closed world of those quantum theory publications and never crossed into the territory of numerical analysis.
The tremendous progress of recent years is visible both in the use of the new tensor tools in various applications and in the development of these tools and the study of their approximation and algebraic properties. This special issue treats tensors as a basis for efficient numerical algorithms in various modern applications, with special emphasis on the new representation formats.
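To make the separation-of-variables idea concrete, here is a minimal sketch of a tensor-train representation for one function chosen purely for illustration, f(x_1, ..., x_d) = x_1 + ... + x_d sampled on a grid. This function has an exact tensor-train representation with ranks equal to 2. The function names (`sum_tt_cores`, `tt_entry`, `tt_storage`) and the nested-list layout of the cores are assumptions made for this example, not an interface from any tensor library, and the construction requires d >= 2.

```python
def sum_tt_cores(grid, d):
    """TT cores for f(x_1, ..., x_d) = x_1 + ... + x_d on a common 1-D grid (d >= 2).

    cores[k][i] is the small matrix attached to grid index i in dimension k:
    1 x 2 for the first core, 2 x 2 for the middle cores, 2 x 1 for the last.
    """
    first = [[[x, 1.0]] for x in grid]
    middle = [[[1.0, 0.0], [x, 1.0]] for x in grid]
    last = [[[1.0], [x]] for x in grid]
    # The middle cores are identical and only read, so sharing one object is fine.
    return [first] + [middle] * (d - 2) + [last]


def matmul(a, b):
    """Product of two small dense matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def tt_entry(cores, index):
    """Evaluate one tensor entry as a product of d small matrices."""
    result = cores[0][index[0]]
    for core, i in zip(cores[1:], index[1:]):
        result = matmul(result, core[i])
    return result[0][0]  # the product is a 1 x 1 matrix


def tt_storage(cores):
    """Number of stored values across all cores."""
    return sum(len(m) * len(m[0]) for core in cores for m in core)


if __name__ == "__main__":
    grid = [0.0, 0.5, 1.0, 1.5]
    d = 6
    cores = sum_tt_cores(grid, d)
    print(tt_entry(cores, (0, 1, 2, 3, 1, 2)))     # sum of the selected grid values
    print(tt_storage(cores), "vs", len(grid) ** d)  # TT storage vs full tensor size
```

For a grid of n points the full tensor holds n^d values, while these cores hold 2n + 4n(d - 2) + 2n values, which grows linearly in d.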

