tree inference
Recently Published Documents


Total documents: 200 (five years: 102)
H-index: 23 (five years: 4)

2022 ◽  
Author(s):  
XiaoXu Pang ◽  
Da-Yong Zhang

The species studied in any evolutionary investigation generally constitute a very small proportion of all the species currently existing or that have gone extinct. It is therefore likely that introgression, which is widespread across the tree of life, involves "ghosts," i.e., unsampled, unknown, or extinct lineages. However, the impact of ghost introgression on estimations of species trees has been rarely studied and is thus poorly understood. In this study, we use mathematical analysis and simulations to examine the robustness of species tree methods based on a multispecies coalescent model under gene flow sourcing from an extant or ghost lineage. We found that very low levels of extant or ghost introgression can result in anomalous gene trees (AGTs) on three-taxon rooted trees if accompanied by strong incomplete lineage sorting (ILS). In contrast, even massive introgression, with more than half of the recipient genome descending from the donor lineage, may not necessarily lead to AGTs. In cases involving an ingroup lineage (defined as one that diverged no earlier than the most basal species under investigation) acting as the donor of introgression, the time of root divergence among the investigated species was either underestimated or remained unaffected, but for the cases of outgroup ghost lineages acting as donors, the divergence time was generally overestimated. Under many conditions of ingroup introgression, the stronger the ILS was, the higher was the accuracy of estimating the time of root divergence, although the topology of the species tree is more prone to be biased by the effect of introgression.
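As a point of reference for the abstract above, the following minimal sketch computes the standard gene-tree topology probabilities for a rooted three-taxon species tree ((A,B),C) under the multispecies coalescent with no gene flow. It is not the authors' model and does not include introgression; the function name and the taxon labels are illustrative assumptions. Without gene flow, the topology matching the species tree is never less probable than either discordant topology, which is why the appearance of anomalous gene trees on three taxa requires something extra, such as the introgression studied here.

```python
import math


def three_taxon_gene_tree_probs(internal_branch):
    """Gene-tree topology probabilities for species tree ((A,B),C).

    Baseline multispecies coalescent without gene flow (an assumption of
    this sketch). `internal_branch` is the internal branch length in
    coalescent units; a shorter branch means stronger ILS.
    """
    # Probability that the A and B lineages fail to coalesce on the
    # internal branch, so that all three lineages enter the root population.
    p_no_coal = math.exp(-internal_branch)
    # In the root population each of the three pairings is equally likely.
    discordant = p_no_coal / 3.0
    return {
        "((A,B),C)": 1.0 - 2.0 * discordant,
        "((A,C),B)": discordant,
        "((B,C),A)": discordant,
    }


if __name__ == "__main__":
    for length in (0.1, 0.5, 2.0):
        print(length, three_taxon_gene_tree_probs(length))
```

As the internal branch shrinks toward zero, the three probabilities approach 1/3 each, so even a small shift caused by gene flow could in principle make a discordant topology the most frequent one.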


Biosystems ◽  
2022 ◽  
pp. 104606
Author(s):  
Manuel Villalobos-Cid ◽  
César Rivera ◽  
Eduardo I. Kessi-Pérez ◽  
Mario Inostroza-Ponta

PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e12691
Author(s):  
Jiajia Wang ◽  
Yu Bai ◽  
Haifeng Zhao ◽  
Ruinan Mu ◽  
Yan Dong

Background: There have been extensive debates on the interrelationships among the four major classes of Myriapoda: Chilopoda, Symphyla, Diplopoda, and Pauropoda. The core controversy is the position of Pauropoda; that is, whether it should be grouped with Symphyla or Diplopoda as a sister group. Two recent phylogenomic studies separately investigated transcriptomic data from 14 and 29 Myriapoda species covering all four groups along with outgroups, and proposed two different topologies of phylogenetic relationships. Methods: Building on these studies, we extended the taxon sampling by investigating 39 myriapods and integrating the previously available data with three new transcriptomic datasets generated in this study. Our analyses present the phylogenetic relationships among the four major classes of Myriapoda with broader taxon sampling and provide a new perspective on the above-mentioned question, in which visual genes were identified. We compared the appearance pattern of genes, grouping them according to their classes and the visual pathways involved. Positive selection was detected for all identified visual genes between every pair of 39 myriapods, and 14 genes showed positive selection among 27 pairs. Results: From the results of phylogenomic analyses, we propose that Symphyla is a sister group of Pauropoda. This stance has also received strong support from tree inference and topology tests.


2021 ◽  
Author(s):  
Baqiao Liu ◽  
Tandy Warnow

Species tree inference under the multi-species coalescent (MSC) model is a basic step in biological discovery. Despite the developments in recent years of methods that are proven statistically consistent and that have high accuracy, large datasets create computational challenges. Although there is generally some information available about the species trees that could be used to speed up the estimation, only one method, ASTRAL-J, a recent development in the ASTRAL family of methods, is able to use this information. Here we describe two new methods, NJst-J and FASTRAL-J, that can estimate the species tree given partial knowledge of the species tree in the form of a non-binary unrooted constraint tree. We show that both NJst-J and FASTRAL-J are much faster than ASTRAL-J and we prove that all three methods are statistically consistent under the multi-species coalescent model subject to this constraint. Our extensive simulation study shows that both FASTRAL-J and NJst-J provide advantages over ASTRAL-J: both are faster (and NJst-J is particularly fast), and FASTRAL-J is generally at least as accurate as ASTRAL-J. An analysis of the Avian Phylogenomics project dataset with 48 species and 14,446 genes presents additional evidence of the value of FASTRAL-J over ASTRAL-J (and both over ASTRAL), with dramatic reductions in running time (20 hours for default ASTRAL, and minutes or seconds for ASTRAL-J and FASTRAL-J, respectively). Availability: FASTRAL-J and NJst-J are available in open source form at https://github.com/RuneBlaze/FASTRAL-constrained and https://github.com/RuneBlaze/NJst-constrained. Locations of the datasets used in this study and detailed commands needed to reproduce the study are provided in the supplementary materials at http://tandy.cs.illinois.edu/baqiao-suppl.pdf.


Author(s):  
Philipp Hühn ◽  
Markus S. Dillenberger ◽  
Michael Gerschwitz-Eidt ◽  
Elvira Hörandl ◽  
Jessica A. Los ◽  
...  

2021 ◽  
Author(s):  
Vishnu Raja Vijayakumar ◽  
Karthikeyan Saravanan ◽  
Maharaja Somasundaram ◽  
Rajkumar Jayaraj ◽  
Panneerselvam Annamalai ◽  
...  

Abstract: A lichen is a composite organism formed of algae or cyanobacteria that live in a mutually advantageous symbiotic relationship with the filaments (hyphae) of a fungus. Three lichen samples were obtained for this study from diverse sites: Kuppanasamy temple, Pollachi, a terrestrial habitat located in Coimbatore; and Nithiravilai, Nagercoil, and Ramarpatham, Vedaranyam, both coastal habitats, located in the Kanyakumari and Nagapattinam districts of Tamil Nadu, respectively. Amplification and sequencing of the 16S rRNA V3-V4 regions were used for the metagenomic study. Aside from the NGS data, distinct types of lichen microbiome profiles were clearly shown. The bacterial diversity in Roccella montagnei lichens growing in coastal and terrestrial environments was further investigated using common and unique operational taxonomic units (OTUs) and the QIIME pipeline (1.9.1). Using similarity clustering, the heat map analysis depicts the abundance of selected OTUs as well as the similarities and differences between OTUs and lichen samples. Alpha and beta diversity analyses, using multiple methods, revealed differences among all of the samples. However, UPGMA tree inference of comparable bacterial communities in the coastal-habitat lichen samples, compared with the terrestrial habitat, validates their evolutionary lineage. As a result, the bacterial population associated with corticolous lichen depends on the geographic location, growth substrate, and climatic conditions of similar lichen genera growing in different habitats and on different tree substrates.


2021 ◽  
Author(s):  
Jordan Douglas ◽  
Cinthy L. Jiménez-Silva ◽  
Remco Bouckaert

Abstract: As genomic sequence data becomes increasingly available, inferring the phylogeny of the species as that of concatenated genomic data can be enticing. However, this approach makes for a biased estimator of branch lengths and substitution rates and an inconsistent estimator of tree topology. Bayesian multispecies coalescent methods address these issues. This is achieved by embedding a set of gene trees within a species tree and jointly inferring both under a Bayesian framework. However, this approach comes at the cost of increased computational demand. Here, we introduce StarBeast3 – a software package for efficient Bayesian inference of the multispecies coalescent model via Markov chain Monte Carlo. We gain efficiency by introducing cutting-edge proposal kernels and adaptive operators, and StarBeast3 is particularly efficient when a relaxed clock model is applied. Furthermore, gene tree inference is parallelised, allowing the software to scale with the size of the problem. We validated our software and benchmarked its performance using three real and two synthetic datasets. Our results indicate that StarBeast3 is up to one-and-a-half orders of magnitude faster than StarBeast2, and therefore more than two orders faster than *BEAST, depending on the dataset and on the parameter, and is suitable for multispecies coalescent inference on large datasets (100+ genes). StarBeast3 is open-source and is easy to set up with a friendly graphical user interface.


2021 ◽  
Author(s):  
Jūlija Pečerska ◽  
Manuel Gil ◽  
Maria Anisimova

Multiple sequence alignment and phylogenetic tree inference are connected problems that are often solved as independent steps in the inference process. Several attempts at simultaneous inference have been made; however, the currently available methods are greatly limited by their computational complexity and can only handle small datasets. In this manuscript we introduce a combinatorial optimisation approach that will allow us to resolve the circularity of the problem and efficiently infer both alignments and trees under maximum likelihood.


2021 ◽  
Author(s):  
Megan L Smith ◽  
Dan Vanderpool ◽  
Matthew W. Hahn

Traditionally, single-copy orthologs have been the gold standard in phylogenomics. Most phylogenomic studies identify putative single-copy orthologs by using clustering approaches and retaining families with a single sequence from each species. However, this approach can severely limit the amount of data available by excluding larger families. Recent methodological advances have suggested several ways to include data from larger families. For instance, tree-based decomposition methods facilitate the extraction of orthologs from large families. Additionally, several popular methods for species tree inference appear to be robust to the inclusion of paralogs, and hence could use all of the data from larger families. Here, we explore the effects of using all families for phylogenetic inference using genomes from 26 primate species. We compare single-copy families, orthologs extracted using tree-based decomposition approaches, and all families with all data (i.e., including orthologs and paralogs). We explore several species tree inference methods, finding that across all nodes of the tree except one, identical trees are returned across nearly all datasets and methods. As in previous studies, the relationships among Platyrrhini remain contentious; however, the tree inference methods matter more than the dataset used. We also assess the effects of each dataset on branch length estimates, measures of phylogenetic uncertainty and concordance, and in detecting introgression. Our results demonstrate that using data from larger gene families drastically increases the number of genes available for phylogenetic inference and leads to consistent estimates of branch lengths, nodal certainty and concordance, and inferences of introgression.
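The single-copy filter described above (keep only families with exactly one sequence from each species) can be sketched as follows. This is a minimal illustration, not the authors' pipeline; the input layout (a dict mapping a family ID to a list of `(species, gene_id)` pairs) and the function name are assumptions made for the example.

```python
from collections import Counter


def single_copy_families(families, species):
    """Return the families with exactly one gene from every species.

    `families` maps a family ID to a list of (species, gene_id) pairs
    (a hypothetical layout chosen for this sketch); `species` lists the
    species that must all be represented.
    """
    required = set(species)
    kept = {}
    for family_id, members in families.items():
        # Count how many genes each species contributes to this family.
        counts = Counter(sp for sp, _gene in members)
        has_all_species = set(counts) == required
        is_single_copy = all(n == 1 for n in counts.values())
        if has_all_species and is_single_copy:
            kept[family_id] = members
    return kept


if __name__ == "__main__":
    toy = {
        "fam1": [("human", "h1"), ("chimp", "c1")],
        "fam2": [("human", "h2"), ("human", "h3"), ("chimp", "c2")],
        "fam3": [("human", "h4")],
    }
    print(sorted(single_copy_families(toy, ["human", "chimp"])))
```

In the toy data, the family with a duplicated gene and the family missing a species are both discarded, which illustrates how this filter can exclude much of the data from larger families.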


2021 ◽  
Author(s):  
Sergey Bocharov ◽  
Simon Harris ◽  
Emma Kominek ◽  
Arne O Mooers ◽  
Mike Steel

In the simplest phylodynamic model (the pure-birth Yule process), lineages split independently at a constant rate λ for time t. The length of a randomly chosen edge (either interior or pendant) in the resulting tree has an expected value that rapidly converges to 1/(2λ) as t grows, and thus is essentially independent of t. However, the length L of the longest pendant edge behaves remarkably differently: L converges to t/2 as the expected number of leaves grows. Extending this model to allow an extinction rate μ (where μ < λ), we also establish a similar result for birth-death trees, except that t/2 is replaced by (t/2) × (1 − μ/λ). This 'complete' tree may contain subtrees that have died out before time t; for the 'reduced tree' that just involves the leaves present at time t and their direct ancestors, the longest pendant edge length L again converges to t/2. Thus, there is likely to be at least one extant species whose associated pendant branch attaches to the tree approximately half-way back in time to the origin of the entire clade. We also briefly consider the length of the shortest edges. Our results are relevant to phylogenetic diversity indices in biodiversity conservation, and to questions concerning the length of aligned sequences required to correctly infer a tree. We compare our theoretical results with simulations, and with the branch lengths from a recent phylogenetic tree of all mammals.
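The two pure-birth quantities mentioned above can be explored with a small simulation. The sketch below is an illustration under stated assumptions, not the authors' code: the process starts from a single lineage at time 0, the initial stem edge is counted as an interior edge if it splits, and the function name and parameter values are chosen only for the example. Because results vary from run to run, no specific output is claimed; for moderate t the ratio of the longest pendant edge to t should be in the vicinity of 1/2, not exactly equal to it.

```python
import random


def simulate_yule_edges(rate, t, rng):
    """Grow a Yule (pure-birth) tree for time t and return its edge lengths.

    Returns (interior, pendant): lists of edge lengths. Assumes the
    process starts with one lineage at time 0 (an assumption of this
    sketch) and that every lineage splits at constant rate `rate`.
    """
    interior, pendant = [], []
    stack = [0.0]  # birth times of lineages not yet processed
    while stack:
        born = stack.pop()
        wait = rng.expovariate(rate)  # exponential waiting time to the next split
        if born + wait < t:
            # The lineage splits before time t: its edge is interior and
            # two daughter lineages are born at the split time.
            interior.append(wait)
            stack.extend((born + wait, born + wait))
        else:
            # The lineage survives unsplit to time t: a pendant edge.
            pendant.append(t - born)
    return interior, pendant


if __name__ == "__main__":
    rng = random.Random(1)
    rate, t = 1.0, 8.0
    interior, pendant = simulate_yule_edges(rate, t, rng)
    edges = interior + pendant
    print("leaves:", len(pendant))
    print("mean edge length:", sum(edges) / len(edges), "vs 1/(2*rate) =", 1 / (2 * rate))
    print("longest pendant edge / t:", max(pendant) / t)
```

Repeating the run with different seeds and larger t gives a feel for how quickly the mean edge length settles while the longest pendant edge keeps growing with t.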

