Random Permutations, Non-Decreasing Subsequences and Statistical Independence

In this paper, we show how the longest non-decreasing subsequence, identified in the graph of the paired marginal ranks of the observations, yields a statistic on which a test of independence for bivariate vectors can be built. The test applies to both discrete and continuous data. Because the procedure does not require the variables to be continuous, it extends the proposal introduced in "Independence tests for continuous random variables based on the longest increasing subsequence" (2014). We illustrate the efficiency of the procedure in detecting dependence on real data and through simulations.
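As a rough illustration of the quantity the abstract refers to, the sketch below computes the length of a longest non-decreasing subsequence of the second coordinate after ordering the pairs by the first. The function names and the tie-breaking rule (ties in x ordered by y) are assumptions made for this example; the paper's actual statistic and its null distribution are not reproduced here.

```python
from bisect import bisect_right


def longest_nondecreasing_length(seq):
    """Length of a longest non-decreasing subsequence, in O(n log n)."""
    # tails[i] holds the smallest possible last value of a
    # non-decreasing subsequence of length i + 1 seen so far.
    tails = []
    for v in seq:
        # bisect_right lets equal values extend a subsequence,
        # which is what makes the result non-decreasing rather
        # than strictly increasing.
        i = bisect_right(tails, v)
        if i == len(tails):
            tails.append(v)
        else:
            tails[i] = v
    return len(tails)


def lnds_of_pairs(xs, ys):
    """Order the pairs by x (ties broken by y, an assumption of this
    sketch) and return the longest non-decreasing run length in y."""
    ordered = sorted(zip(xs, ys))
    return longest_nondecreasing_length([y for _, y in ordered])
```

Since only the relative order of the values matters, the function can be applied to the raw observations or to their marginal ranks. Under perfect monotone increasing dependence the length equals the sample size, while a strictly decreasing relation gives a length of 1.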


Independence tests for continuous random variables based on the longest increasing subsequence

Journal of Multivariate Analysis ◽

10.1016/j.jmva.2014.02.010 ◽

2014 ◽

Vol 127 ◽

pp. 126-146 ◽

Cited By ~ 9

Author(s):

Jesús E. García ◽

V.A. González-López

Keyword(s):

Random Variables ◽

Independence Tests ◽

Longest Increasing Subsequence


On statistical independence and zero correlation in several dimensions

Journal of the Australian Mathematical Society ◽

10.1017/s1446788700026288 ◽

1960 ◽

Vol 1 (4) ◽

pp. 492-496 ◽

Cited By ~ 4

Author(s):

H. O. Lancaster

Keyword(s):

Canonical Form ◽

Arbitrary Number ◽

Random Variables ◽

Summable Function ◽

Statistical Independence ◽

Bivariate Distributions ◽

Canonical Correlations ◽

Zero Correlation ◽

Maximal Correlation

Bivariate distributions, subject to a condition of φ²-boundedness to be defined later, can be written in a canonical form. Sarmanov [4] used such a form to deduce that two random variables are independent if and only if the maximal correlation of any square summable function, ξ(x₁), of the first variable with any square summable function, η(x₂), of the second variable is zero. This is equivalent to the condition that the canonical correlations are all zero. The theorem of Sarmanov [4] was proved without any restriction in Lancaster [2] and the proof is now extended to an arbitrary number of dimensions.
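The bivariate criterion described in this abstract can be written compactly as follows. The symbol ρ* for the maximal correlation is notation assumed here, not taken from the abstract; the supremum runs over square summable functions with non-zero variance.

```latex
\[
\rho^{*}(X_1, X_2)
  = \sup_{\xi,\,\eta}\,
    \operatorname{corr}\bigl(\xi(X_1),\, \eta(X_2)\bigr),
\qquad
X_1 \text{ and } X_2 \text{ independent}
  \iff \rho^{*}(X_1, X_2) = 0 .
\]
```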


On Some Densities in the Set of Permutations

The Electronic Journal of Combinatorics ◽

10.37236/372 ◽

2010 ◽

Vol 17 (1) ◽

Author(s):

Eugenijus Manstavičius

Keyword(s):

Saddle Point ◽

Cycle Length ◽

Random Variables ◽

Point Method ◽

Saddle Point Method ◽

Asymptotic Density ◽

Independent Random Variables ◽

Random Permutations ◽

Set Of Permutations

The asymptotic density of random permutations with given properties of the $k$th shortest cycle length is examined. The approach is based upon the saddle point method applied to appropriate sums of independent random variables.
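For readers unfamiliar with the object under study, the sketch below lists the cycle lengths of a permutation in increasing order, so that the $k$th shortest cycle length is simply the entry at position $k-1$. The function name and the 0-based one-line representation of the permutation are assumptions of this example; it does not touch the asymptotic analysis itself.

```python
def sorted_cycle_lengths(perm):
    """Cycle lengths of a permutation, shortest first.

    perm is a list in which perm[i] is the image of i, with
    values 0 .. len(perm) - 1 (an assumed representation).
    """
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        # Walk the cycle containing `start`, counting its elements.
        length = 0
        j = start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            length += 1
        lengths.append(length)
    return sorted(lengths)
```

For instance, the permutation `[1, 0, 2, 4, 5, 3]` has the cycles (0 1), (2) and (3 4 5), so its second shortest cycle has length 2.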


Efficient Markov Network Structure Discovery Using Independence Tests

Journal of Artificial Intelligence Research ◽

10.1613/jair.2773 ◽

2009 ◽

Vol 35 ◽

pp. 449-484 ◽

Cited By ~ 17

Author(s):

F. Bromberg ◽

D. Margaritis ◽

V. Honavar

Keyword(s):

Conditional Independence ◽

Structure Learning ◽

Statistical Tests ◽

Likelihood Estimation ◽

Data Sets ◽

Statistical Independence ◽

Real World Data ◽

Independence Tests ◽

Markov Network ◽

Experimental Comparisons

We present two algorithms for learning the structure of a Markov network from data: GSMN* and GSIMN. Both algorithms use statistical independence tests to infer the structure by successively constraining the set of structures consistent with the results of these tests. Until very recently, algorithms for structure learning were based on maximum likelihood estimation, which has been proved to be NP-hard for Markov networks due to the difficulty of estimating the parameters of the network, needed for the computation of the data likelihood. The independence-based approach does not require the computation of the likelihood, and thus both GSMN* and GSIMN can compute the structure efficiently (as shown in our experiments). GSMN* is an adaptation of the Grow-Shrink algorithm of Margaritis and Thrun for learning the structure of Bayesian networks. GSIMN extends GSMN* by additionally exploiting Pearl's well-known properties of the conditional independence relation to infer novel independences from known ones, thus avoiding the performance of statistical tests to estimate them. To accomplish this efficiently, GSIMN uses the Triangle theorem, also introduced in this work, which is a simplified version of the set of Markov axioms. Experimental comparisons on artificial and real-world data sets show GSIMN can yield significant savings with respect to GSMN*, while generating a Markov network with comparable or in some cases improved quality. We also compare GSIMN to a forward-chaining implementation, called GSIMN-FCH, that produces all possible conditional independences resulting from repeatedly applying Pearl's theorems on the known conditional independence tests. The results of this comparison show that GSIMN, by the sole use of the Triangle theorem, is nearly optimal in terms of the set of independence tests that it infers.


Records, permutations and greatest convex minorants

Mathematical Proceedings of the Cambridge Philosophical Society ◽

10.1017/s0305004100068067 ◽

1989 ◽

Vol 106 (1) ◽

pp. 169-177 ◽

Cited By ~ 18

Author(s):

Charles M. Goldie

Keyword(s):

Random Walk ◽

Random Variables ◽

Random Permutations ◽

Record Times ◽

Distribution Free ◽

Standard Representation ◽

Bernoulli Random Variables ◽

Convex Minorants ◽

Probability Spaces

Theorems on random permutations are translated into distribution-free results about record times and greatest convex minorants, by defining them together on appropriate probability spaces. The Bernoulli random variables that appear in the standard representation of the number of sides of the greatest convex minorant of a random walk are identified.


The Distribution of the Length of the Longest Increasing Subsequence in Random Permutations of Arbitrary Multi-sets

Methodology And Computing In Applied Probability ◽

10.1007/s11009-019-09753-1 ◽

2019 ◽

Vol 22 (3) ◽

pp. 1009-1021

Author(s):

Ayat Al-Meanazel ◽

Brad C. Johnson

Keyword(s):

Random Permutations ◽

Longest Increasing Subsequence


An Improved Non-parametric Bayesian Independence Test for Probabilistic Learning of the Dependence Structure Among Continuous Random Variables

KSCE Journal of Civil Engineering ◽

10.1007/s12205-018-1398-3 ◽

2018 ◽

Vol 22 (3) ◽

pp. 974-986

Author(s):

Ji-Eun Byun ◽

Junho Song ◽

Kilian Zwirglmaier ◽

Daniel Straub

Keyword(s):

Random Variables ◽

Dependence Structure ◽

Probabilistic Learning ◽

Independence Test ◽

Non Parametric


On statistical independence and no-correlation for a pair of random variables taking two values: Classical and quantum

Progress of Theoretical and Experimental Physics ◽

10.1093/ptep/pty086 ◽

2018 ◽

Vol 2018 (8) ◽

Cited By ~ 1

Author(s):

Toru Ohira

Keyword(s):

Random Variables ◽

Statistical Independence


Testing the statistical independence of continuous random variables: a new robust algorithm

International Symposium on Signals, Circuits and Systems, 2005. ISSCS 2005. ◽

10.1109/isscs.2005.1509881 ◽

2006 ◽

Author(s):

B. Badea ◽

A. Vlad

Keyword(s):

Random Variables ◽

Statistical Independence ◽

Robust Algorithm


Jackknife approach to the estimation of mutual information

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1715593115 ◽

2018 ◽

Vol 115 (40) ◽

pp. 9956-9961 ◽

Cited By ~ 5

Author(s):

Xianli Zeng ◽

Yingcun Xia ◽

Howell Tong

Keyword(s):

Data Analysis ◽

Mutual Information ◽

Random Variables ◽

Kernel Estimation ◽

Continuous Data ◽

Fundamental Issue ◽

Kernel Estimate ◽

Kernel Estimates ◽

Unresolved Problem

Quantifying the dependence between two random variables is a fundamental issue in data analysis, and thus many measures have been proposed. Recent studies have focused on the renowned mutual information (MI) [Reshef DN, et al. (2011) Science 334:1518–1524]. However, “Unfortunately, reliably estimating mutual information from finite continuous data remains a significant and unresolved problem” [Kinney JB, Atwal GS (2014) Proc Natl Acad Sci USA 111:3354–3359]. In this paper, we examine the kernel estimation of MI and show that the bandwidths involved should be equalized. We consider a jackknife version of the kernel estimate with equalized bandwidth and allow the bandwidth to vary over an interval. We estimate the MI by the largest value among these kernel estimates and establish the associated theoretical underpinnings.
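For reference, the quantity being estimated is the standard mutual information of a pair of continuous random variables. The density symbols below are notation assumed for this note, and the paper's jackknife kernel estimator itself is not reproduced.

```latex
\[
I(X; Y)
  = \iint f_{X,Y}(x, y)\,
    \log \frac{f_{X,Y}(x, y)}{f_X(x)\, f_Y(y)}
    \,\mathrm{d}x\,\mathrm{d}y ,
\qquad
I(X; Y) \ge 0 ,
\]
```

with equality to zero if and only if X and Y are independent, which is why MI is used as a measure of dependence.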
