Limit distribution of the quartet balance index for Aldous’s β ≥ 0-model

2018 ◽  
Author(s):  
Krzysztof Bartoszek

Abstract This paper builds on T. Martínez-Coronado, A. Mir, F. Rosselló and G. Valiente’s work “A balance index for phylogenetic trees based on quartets”, which introduces a new balance index for trees. We show here that this balance index, in the case of Aldous’s β ≥ 0-model, converges weakly to a distribution that can be characterized as the fixed point of a contraction operator on a class of distributions.

2017 ◽  
Author(s):  
Krzysztof Bartoszek

Abstract In this work we study the limit distribution of an appropriately normalized cophenetic index of the pure–birth tree conditioned on n contemporary tips. We show that this normalized phylogenetic balance index is a submartingale that converges almost surely and in L2. We link our work with studies on trees without branch lengths and show that in this case the limit distribution is a contraction–type distribution, similar to the Quicksort limit distribution. In the continuous branch case we suggest approximations to the limit distribution. We propose heuristic methods of simulating from these distributions, and it may be observed that these algorithms result in reasonable tails. We therefore propose a hypothesis test, based on the quantiles of the derived distributions, of whether an observed phylogenetic tree is consistent with the pure–birth process. Simulating a sample by the proposed heuristics is rapid, while exact simulation (simulating the tree and then calculating the index) is a time–consuming procedure. We conduct a power study to investigate how well the cophenetic indices detect deviations from the Yule tree and apply the methodology to empirical phylogenies.


2006 ◽  
Vol 43 (02) ◽  
pp. 377-390 ◽  
Author(s):  
Rafik Aguech ◽  
Nabil Lasmar ◽  
Hosam Mahmoud

The trie is a sort of digital tree. Ideally, to achieve balance, the trie should grow from an unbiased source generating keys of bits with equal likelihoods. In practice, the lack of bias is not always guaranteed. We investigate the distance between randomly selected pairs of nodes among the keys in a biased trie. This research complements that of Christophi and Mahmoud (2005); however, the results and some of the methodology are strikingly different. Analytical techniques are still useful for moments calculation. Both mean and variance are of polynomial order. It is demonstrated that the standardized distance approaches a normal limiting random variable. This is proved by the contraction method, whereby the limit distribution is shown to approach the fixed-point solution of a distributional equation in the Wasserstein metric space.


2019 ◽  
Author(s):  
Tomás Martínez Coronado ◽  
Arnau Mir ◽  
Francesc Rossello ◽  
Lucía Rotger

Abstract Background: The Sackin index S of a rooted phylogenetic tree, defined as the sum of its leaves' depths, is one of the most popular balance indices in phylogenetics, and Sackin's 1972 paper is usually cited as the source for this index. However, what Sackin actually proposed in his paper as a measure of the imbalance of a rooted tree was not the sum of its leaves' depths, but their "variation". This proposal was later implemented as the variance of the leaves' depths by Kirkpatrick and Slatkin, who moreover posed the problem of finding a closed formula for its expected value under the Yule model. Nowadays, Sackin's original proposal seems to have passed into oblivion in the phylogenetics literature, replaced by the index bearing his name, which, in fact, was introduced a decade later by Sokal. Results: In this paper we study the properties of the variance of the leaves' depths, V, as a balance index. Firstly, we prove that the rooted trees with n leaves and maximum V value are exactly the combs with n leaves. But although V achieves its minimum value on every space BT_n of bifurcating rooted phylogenetic trees with n < 184 leaves at the so-called "maximally balanced trees" with n leaves, this property fails for almost every n >= 184. We then provide an algorithm that finds in O(n) time the trees in BT_n with minimum V value.
Secondly, we obtain closed formulas for the expected V value of a bifurcating rooted tree with any number n of leaves under the Yule and the uniform models and, as a by-product of the computations leading to these formulas, we also obtain closed formulas for the variance of the Sackin index and the total cophenetic index of a bifurcating rooted tree, as well as of their covariance, under the uniform model, thus filling this gap in the literature. Conclusions: The phylogenetics crowd has been wise in preferring as a balance index the sum S(T) of the leaves’ depths of a phylogenetic tree T over their variance V(T), because the latter does not seem to capture correctly the notion of balance of large bifurcating rooted trees. But for bifurcating trees up to 183 leaves, V is a valid and useful balance index.
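The two indices compared in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: the nested-tuple tree encoding and the function names (leaf_depths, sackin_index, depth_variance) are assumptions made for illustration, and V is assumed here to be the population variance (dividing by n) of the leaves' depths.

```python
from statistics import pvariance


def leaf_depths(tree, depth=0):
    """Return the depths of all leaves; any non-tuple value is a leaf."""
    if not isinstance(tree, tuple):
        return [depth]
    depths = []
    for child in tree:
        depths.extend(leaf_depths(child, depth + 1))
    return depths


def sackin_index(tree):
    """Sackin index S: the sum of the leaves' depths."""
    return sum(leaf_depths(tree))


def depth_variance(tree):
    """V: variance of the leaves' depths (population variance assumed)."""
    return pvariance(leaf_depths(tree))


# Hypothetical 4-leaf examples: a comb and a fully balanced tree.
comb = ("a", ("b", ("c", "d")))
balanced = (("a", "b"), ("c", "d"))

print(sackin_index(comb), depth_variance(comb))          # leaf depths 1, 2, 3, 3
print(sackin_index(balanced), depth_variance(balanced))  # leaf depths 2, 2, 2, 2
```

On these two toy trees both indices rank the comb as less balanced than the fully balanced tree, which is consistent with the abstract's statement that combs attain the maximum V value.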


2019 ◽  
Vol 79 (3) ◽  
pp. 1105-1148 ◽  
Author(s):  
Tomás M. Coronado ◽  
Arnau Mir ◽  
Francesc Rosselló ◽  
Gabriel Valiente

2020 ◽  
Author(s):  
Tomás Martínez Coronado ◽  
Arnau Mir ◽  
Francesc Rossello ◽  
Lucía Rotger

Abstract Background. The Sackin index S of a rooted phylogenetic tree, defined as the sum of its leaves' depths, is one of the most popular balance indices in phylogenetics, and Sackin's 1972 paper is usually cited as the source for this index. However, what Sackin actually proposed in his paper as a measure of the imbalance of a rooted tree was not the sum of its leaves' depths, but their "variation". This proposal was later implemented as the variance of the leaves' depths by Kirkpatrick and Slatkin in 1993, where they also posed the problem of finding a closed formula for its expected value under the Yule model. Nowadays, Sackin's original proposal seems to have passed into oblivion in the phylogenetics literature, replaced by the index bearing his name, which, in fact, was introduced a decade later by Sokal. Results. In this paper we study the properties of the variance of the leaves' depths, V, as a balance index. Firstly, we prove that the rooted trees with n leaves and maximum V value are exactly the combs with n leaves. But although V achieves its minimum value on every space of bifurcating rooted phylogenetic trees with at most 183 leaves at the so-called "maximally balanced trees" with n leaves, this property fails for almost every n larger than 184. We then provide an algorithm that finds the bifurcating rooted trees with n leaves and minimum V value in quasilinear time. Secondly, we obtain closed formulas for the expected V value of a bifurcating rooted tree with any number n of leaves under the Yule and the uniform models and, as a by-product of the computations leading to these formulas, we also obtain closed formulas for the variance under the uniform model of the Sackin index and the total cophenetic index of a bifurcating rooted tree, as well as of their covariance, thus filling this gap in the literature.


2013 ◽  
Vol 241 (1) ◽  
pp. 125-136 ◽  
Author(s):  
Arnau Mir ◽  
Francesc Rosselló ◽  
Lucía Rotger
